bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Debug Info / PR22309: Allow union types to be emitted as unsigned constants.	Adrian Prantl	2015-01-23	1	-0/+66
\| \| \| \|	llvm-svn: 226919
*	[mips] Add new error message and improve testing for parsing the .module ↵	Toma Tabacu	2015-01-23	1	-13/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	directive. Summary: We used to silently ignore any empty .module's and we used to give an error saying that we found an "unexpected token at start of statement" when the value of the option wasn't an identifier (e.g. if it was a number). We now give an error saying that we "expected .module option identifier" in both of those cases. I also fixed the other tests in mips-abi-bad.s, which all seemed to be broken. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7095 llvm-svn: 226905
*	This patch fixes issue with lowering below mentioned pattern :-	Jyoti Allur	2015-01-23	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	_foo: smull r0, r1, r1, r0 smull r2, r3, r3, r2 adds r0, r2, r0 adc r1, r3, r1 bx lr to _foo: smull r0, r1, r1, r0 smlal r0, r1, r3, r2 bx lr llvm-svn: 226904
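The rewrite above can be checked with a small model of the instructions involved. This is only an illustrative sketch: the helper names (`smull`, `adds_adc`, `smlal`, `before`, `after`) are hypothetical Python stand-ins for the ARM instructions, not code from the patch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit register value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def smull(rn, rm):
    """Signed 32x32 -> 64-bit multiply, returned as (lo, hi) halves."""
    product = (to_signed32(rn) * to_signed32(rm)) & MASK64
    return product & MASK32, product >> 32


def adds_adc(lo_a, hi_a, lo_b, hi_b):
    """64-bit add built from adds (low half) and adc (high half plus carry)."""
    total = lo_a + lo_b
    carry = total >> 32
    return total & MASK32, (hi_a + hi_b + carry) & MASK32


def smlal(acc_lo, acc_hi, rn, rm):
    """Signed multiply-accumulate into the 64-bit value acc_hi:acc_lo."""
    acc = (acc_hi << 32) | acc_lo
    result = (acc + to_signed32(rn) * to_signed32(rm)) & MASK64
    return result & MASK32, result >> 32


def before(r0, r1, r2, r3):
    r0, r1 = smull(r1, r0)            # smull r0, r1, r1, r0
    r2, r3 = smull(r3, r2)            # smull r2, r3, r3, r2
    return adds_adc(r2, r3, r0, r1)   # adds r0, r2, r0 ; adc r1, r3, r1


def after(r0, r1, r2, r3):
    r0, r1 = smull(r1, r0)            # smull r0, r1, r1, r0
    return smlal(r0, r1, r3, r2)      # smlal r0, r1, r3, r2
```

Both sequences compute the 64-bit sum of the two signed products, so `before` and `after` return the same (lo, hi) pair for any inputs.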
*	[x86] Change u8imm operands to always print as unsigned. This makes shuffle ↵	Craig Topper	2015-01-23	9	-50/+50
\| \| \| \| \| \|	masks and the like make way more sense. llvm-svn: 226902
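To see why unsigned printing helps, consider an 8-bit shuffle immediate. The sketch below is a hypothetical Python illustration, not the actual instruction printer; `format_u8imm` and `decode_shuffle_mask` are names invented for the example, and the 2-bit field layout assumes a PSHUFD-style mask.

```python
def format_u8imm(imm):
    """Print an 8-bit immediate as unsigned, whatever sign it was stored with."""
    return str(imm & 0xFF)


def decode_shuffle_mask(imm):
    """Split an 8-bit immediate into four 2-bit lane selectors, low lane first."""
    imm &= 0xFF
    return [(imm >> (2 * lane)) & 0x3 for lane in range(4)]
```

The same byte that would print as -28 when treated as signed prints as 228 (0xE4) here, and the bit pattern is easier to relate to the lane selectors it encodes.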
*	Add STB_GNU_UNIQUE to the ELF writer.	Rafael Espindola	2015-01-23	1	-1/+11
\| \| \| \| \| \|	This lets llvm-mc assemble files produced by gcc. llvm-svn: 226895
*	R600: Try to use lower types for 64bit division if possible	Jan Vesely	2015-01-22	2	-2/+354
\| \| \| \| \| \| \| \|	v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881
*	[X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests.	Simon Pilgrim	2015-01-22	2	-17/+35
\| \| \| \| \| \| \| \|	Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing. Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP. llvm-svn: 226873
*	Line endings fixes. NFC.	Simon Pilgrim	2015-01-22	1	-344/+344
\| \| \| \|	llvm-svn: 226872
*	[X86][SSE] Simplified PSUBUS tests	Simon Pilgrim	2015-01-22	1	-189/+199
\| \| \| \| \| \| \| \|	Removed loops from PSUBUS tests - ensures folding is tested. Also renamed SSE2 tests SSSE3 to match cpu. This is a follow up commit agreed in http://reviews.llvm.org/D7094 llvm-svn: 226871
*	[PM] Actually add the new pass manager support for the assumption cache.	Chandler Carruth	2015-01-22	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I had already factored this analysis specifically to enable doing this, but hadn't actually committed the necessary wiring to get at this from the new pass manager. This also nicely shows how the separate cache object can be directly managed by the new pass manager. This analysis didn't have any direct tests and so I've added a printer pass and a boring test case. I chose to print the i1 value which is being assumed rather than the call to llvm.assume as that seems much more useful for testing... but suggestions on an even better printing strategy welcome. My main goal was to make sure things actually work. =] llvm-svn: 226868
*	IR: Update references to temporaries before deleting	Duncan P. N. Exon Smith	2015-01-22	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During `MDNode::deleteTemporary()`, call `replaceAllUsesWith(nullptr)` to update all tracking references to `nullptr`. This fixes PR22280, where inverted destruction order between tracking references and the temporaries themselves caused a use-after-free in `LLParser`. An alternative fix would be to add an assertion that there are no users, and continue to fix inverted destruction order in clients (like `LLParser`), but instead I decided to make getting-teardown-right easy. (If someone disagrees let me know.) llvm-svn: 226866
*	Intrinsics: introduce llvm_any_ty aka ValueType Any	Ramkumar Ramachandra	2015-01-22	2	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Specifically, gc.result benefits from this greatly. Instead of: gc.result.int.* gc.result.float.* gc.result.ptr.* ... We now have a gc.result.* that can specialize to literally any type. Differential Revision: http://reviews.llvm.org/D7020 llvm-svn: 226857
*	Revert "Don't remove a landing pad if the invoke requires a table entry."	Reid Kleckner	2015-01-22	1	-77/+0
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r176827. Björn Steinbrink pointed out that this didn't actually fix the bug (PR15555) it was attempting to fix. With this reverted, we can now remove landingpad cleanups that immediately resume unwinding, converting the invoke to a call. llvm-svn: 226850
*	Add the option, -indirect-symbols, used with -macho to print the Mach-O ↵	Kevin Enderby	2015-01-22	1	-0/+12
\| \| \| \| \| \|	indirect symbol table to llvm-objdump. llvm-svn: 226848
*	merge consecutive stores of extracted vector elements (PR21711)	Sanjay Patel	2015-01-22	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698. That patch was checked in at r224611, but reverted at r225031 because it caused a failure outside of the regression tests. The cause of the crash was not recognizing consecutive stores that have mixed source values (loads and vector element extracts), so this patch adds a check to bail out if any store value is not coming from a vector element extract. This patch also refactors the shared logic of the constant source and vector extracted elements source cases into a helper function. Differential Revision: http://reviews.llvm.org/D6850 llvm-svn: 226845
*	Revert "PR21408: Workaround the appearance of duplicate variables due to ↵	David Blaikie	2015-01-22	1	-117/+0
\| \| \| \| \| \| \| \| \| \| \|	problems when inlining two calls to the same function from the same call site." The underlying bug has been fixed in r226736 so there's no need to workaround this anymore. This reverts commit r220923. llvm-svn: 226842
*	[pr21886] Change MCJIT/ELF to support MSVC C++ mangled symbol.	Rafael Espindola	2015-01-22	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \|	The ELF format is used on Windows by the MCJIT engine. Thus, on Windows, the ELFObjectWriter can encounter symbols mangled using the MS Visual Studio C++ name mangling. Symbols mangled using the MSVC C++ name mangling can legally have "@@@" as a substring. The ELFObjectWriter should not interpret the "@@@" substring as specifying GNU-style symbol versioning. The ELFObjectWriter therefore checks for the MSVC C++ name mangling prefix which is either "?", "@?", "imp_?" or "imp_?@". llvm-svn: 226830
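The prefix check described above might look roughly like the following. This is a hedged Python sketch rather than the ELFObjectWriter code: the function names are hypothetical, the prefix list is copied as it appears in this log, and the version splitting is deliberately simplified.

```python
# Prefixes as listed in the commit message above (copied verbatim from this log).
MSVC_PREFIXES = ("?", "@?", "imp_?", "imp_?@")


def is_msvc_mangled(name):
    """Return True if the symbol name starts with one of the MSVC prefixes."""
    return name.startswith(MSVC_PREFIXES)


def split_gnu_version(name):
    """Split 'sym@VER', 'sym@@VER' or 'sym@@@VER' into (sym, VER).

    MSVC-mangled names are returned unchanged with no version, because
    '@' runs inside them are part of the mangling, not a version marker.
    """
    if is_msvc_mangled(name) or "@" not in name:
        return name, None
    base, _, version = name.partition("@")
    return base, version.lstrip("@")
```

For example, `split_gnu_version("foo@@VER_1")` yields `("foo", "VER_1")`, while a made-up MSVC-style name such as `"?fn@@@Z"` comes back untouched.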
*	[DAGCombine] Produce better code for constant splats	Michael Kuperstein	2015-01-22	3	-4/+44
\| \| \| \| \| \| \| \| \| \| \|	This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 Fixed recommit of r226811. llvm-svn: 226816
*	Revert r226811, MSVC accepts code sane compilers don't.	Michael Kuperstein	2015-01-22	3	-44/+4
\| \| \| \|	llvm-svn: 226814
*	[DAGCombine] Produce better code for constant splats	Michael Kuperstein	2015-01-22	3	-4/+44
\| \| \| \| \| \| \| \| \|	This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 llvm-svn: 226811
*	Fixed a bug in type legalizer for masked load/store intrinsics.	Elena Demikhovsky	2015-01-22	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	The problem occurs when after vectorization we have type <2 x i32>. This type is promoted to <2 x i64> and then requires additional efforts for expanding loads and truncating stores. I added EXPAND / TRUNCATE attributes to the masked load/store SDNodes. The code now contains additional shuffles. I've prepared changes in the cost estimation for masked memory operations, it will be submitted separately. llvm-svn: 226808
*	Fixed a bug in narrowing store operation.	Elena Demikhovsky	2015-01-22	1	-0/+10
\| \| \| \| \| \| \| \| \|	Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type, since the size of VT (1 bit) is not equal to its actual store size (8 bits). Added a test provided by David (dag@cray.com) llvm-svn: 226805
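The size mismatch behind this fix can be shown with a little arithmetic. The sketch below is an assumption-labeled illustration in Python; `store_size_in_bits` and `can_narrow_store_to` are hypothetical helpers that mimic rounding a type's bit width up to whole bytes.

```python
def store_size_in_bits(type_bits):
    """Bits actually written to memory: the type width rounded up to whole bytes."""
    return ((type_bits + 7) // 8) * 8


def can_narrow_store_to(type_bits):
    """A store can only be narrowed to a type whose width equals its store size."""
    return type_bits == store_size_in_bits(type_bits)
```

With these definitions an i1 has a store size of 8 bits, so `can_narrow_store_to(1)` is false, whereas i8, i16 and i32 pass the check.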
*	Fix crashes in IRCE caused by mismatched types	Sanjoy Das	2015-01-22	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are places where the inductive range check elimination pass depends on two llvm::Values or llvm::SCEVs to be of the same llvm::Type when they do not need to be. This patch relaxes those restrictions (by bailing out of the optimization if the types mismatch), and adds test cases to trigger those paths. These issues were found by bootstrapping clang with IRCE running in the -O3 pass ordering. Differential Revision: http://reviews.llvm.org/D7082 llvm-svn: 226793
*	Fixed a bug in masked load/store in reversed loop.	Elena Demikhovsky	2015-01-22	1	-0/+82
\| \| \| \| \| \| \| \| \|	Added a test. The bug was submitted to bugzilla: http://llvm.org/bugs/show_bug.cgi?id=22225 llvm-svn: 226791
*	[canonicalize] Teach InstCombine to canonicalize loads which are only	Chandler Carruth	2015-01-22	2	-3/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ever stored to always use a legal integer type if one is available. Regardless of whether this particular type is good or bad, it ensures we don't get weird differences in generated code (and resulting performance) from "equivalent" patterns that happen to end up using a slightly different type. After some discussion on llvmdev it seems everyone generally likes this canonicalization. However, there may be some parts of LLVM that handle it poorly and need to be fixed. I have at least verified that this doesn't impede GVN and instcombine's store-to-load forwarding powers in any obvious cases. Subtle cases are exactly what we need to flush out if they remain. Also note that this IR pattern should already be hitting LLVM from Clang at least because it is exactly the IR which would be produced if you used memcpy to copy a pointer or floating point between memory instead of a variable. llvm-svn: 226781
*	ARM: fail less catastrophically on invalid Windows input	Saleem Abdulrasool	2015-01-22	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Windows supports a restricted set of relocations (compared to ARM ELF). In some cases, we may end up generating an unsupported relocation. This can occur with bad input to the assembler in particular (the frontend should never generate code that cannot be compiled). Generate an error rather than just aborting. The change in the API is driven by the desire to provide a slightly more helpful message for debugging purposes. llvm-svn: 226779
*	SEH: Finish writing the catch-all test case	Reid Kleckner	2015-01-22	1	-1/+5
\| \| \| \|	llvm-svn: 226768
*	Win64 SEH: Emit the constant 1 for catch-all into xdata	Reid Kleckner	2015-01-22	1	-0/+29
\| \| \| \|	llvm-svn: 226767
*	Make ScalarEvolution less aggressive with respect to no-wrap flags.	Sanjoy Das	2015-01-22	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \|	ScalarEvolution currently lowers a subtraction recurrence to an add recurrence with the same no-wrap flags as the subtraction. This is incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y` and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch fixes the issue, and adds two test cases demonstrating the bug. Differential Revision: http://reviews.llvm.org/D7081 llvm-svn: 226755
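A concrete counterexample makes the nsw claim easy to verify. The following Python sketch models 8-bit signed arithmetic (the width is chosen only to keep the numbers small); all names are hypothetical and nothing here is ScalarEvolution code.

```python
BITS = 8
INT_MIN = -(1 << (BITS - 1))
INT_MAX = (1 << (BITS - 1)) - 1


def wrap(value):
    """Reduce an integer to a signed BITS-wide two's complement value."""
    value &= (1 << BITS) - 1
    return value - (1 << BITS) if value > INT_MAX else value


def sub_overflows(x, y):
    """True if x - y does not fit in a signed BITS-wide integer."""
    return not INT_MIN <= x - y <= INT_MAX


def add_overflows(x, y):
    """True if x + y does not fit in a signed BITS-wide integer."""
    return not INT_MIN <= x + y <= INT_MAX


x, y = -1, INT_MIN
neg_y = wrap(-y)  # negating INT_MIN wraps back to INT_MIN
```

With X = -1 and Y = INT_MIN, the subtraction gives 127 without signed overflow, but -Y wraps to INT_MIN and the addition X + (-Y) overflows, so carrying nsw over from the sub to the add would be wrong.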
*	[X86][SSE] Missing SSE/AVX1 memory folding integer instructions	Simon Pilgrim	2015-01-21	4	-219/+647
\| \| \| \| \| \| \| \| \| \|	Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1. The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests. Differential Revision: http://reviews.llvm.org/D7094 llvm-svn: 226745
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	3	-16/+57
\| \| \| \| \| \| \|	It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740
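The fold is a plain Boolean identity, which a brute-force check over small bit widths confirms. This is a minimal Python sketch with hypothetical function names, not the DAGCombine implementation.

```python
from itertools import product


def unfolded(x, m, n):
    """(or (and X, M), (and X, N))"""
    return (x & m) | (x & n)


def folded(x, m, n):
    """(and X, (or M, N))"""
    return x & (m | n)


# Exhaustively compare both forms for every 4-bit X, M and N.
identity_holds = all(
    unfolded(x, m, n) == folded(x, m, n)
    for x, m, n in product(range(16), repeat=3)
)
```

Since AND distributes over OR bit by bit, the result generalizes to any width; the folded form needs one AND instead of two.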
*	DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined ↵	David Blaikie	2015-01-21	6	-9/+132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	calls being coalesced When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that is in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM, the incomplete workarounds in Clang can be removed. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) llvm-svn: 226736
*	R600: Add checks for urem/srem by a constant	Matt Arsenault	2015-01-21	2	-1/+29
\| \| \| \| \| \| \|	Make sure this uses the faster expansion using magic constants to avoid the full division path. llvm-svn: 226734
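As an illustration of the "magic constant" technique mentioned above, here is the well-known multiply-and-shift form of unsigned 32-bit division by 3, with the remainder derived from it. This is a generic Python sketch, not the R600 backend's expansion; the function names are hypothetical.

```python
MAGIC_DIV3 = 0xAAAAAAAB  # ceil(2**33 / 3)
SHIFT_DIV3 = 33


def udiv3(x):
    """Unsigned 32-bit x // 3 using a multiply and a shift instead of a divide."""
    return (x * MAGIC_DIV3) >> SHIFT_DIV3


def urem3(x):
    """Unsigned 32-bit x % 3, computed as x - 3 * (x // 3)."""
    return x - 3 * udiv3(x)
```

The point of the expansion is that a multiply, a shift, and a multiply-subtract are typically far cheaper than a full hardware or software division path.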
*	[X86][SSE] Added support for SSE3 lane duplication shuffle instructions	Simon Pilgrim	2015-01-21	12	-503/+493
\| \| \| \| \| \| \| \| \| \| \| \|	This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. The big use of these being that they avoid many single source shuffles from needing to use (pre-AVX) dual source instructions such as SHUFPD/SHUFPS: causing extra moves and preventing load folds. Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles. Also adds a missing tablegen pattern for MOVDDUP. Differential Revision: http://reviews.llvm.org/D7042 llvm-svn: 226716
*	R600: Add missing tests for i64 srem	Matt Arsenault	2015-01-21	1	-0/+48
\| \| \| \|	llvm-svn: 226713
*	Fix load-store optimizer on thumbv4t	Jonathan Roelofs	2015-01-21	1	-0/+55
\| \| \| \| \| \| \| \| \| \|	Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711
*	Added test to cover the CFLAA bitset indexing bug.	George Burgess IV	2015-01-21	1	-0/+33
\| \| \| \|	llvm-svn: 226710
*	InstCombine: Don't strip bitcasts off of callsites marked 'thunk'	David Majnemer	2015-01-21	1	-0/+11
\| \| \| \| \| \| \|	The return type of a thunk is meaningless, we just want the arguments and return value to be forwarded. llvm-svn: 226708
*	R600/SI: Custom lower fround	Matt Arsenault	2015-01-21	2	-27/+124
\| \| \| \| \| \| \| \| \|	This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instructions if BFI is available. llvm-svn: 226682
*	[Hexagon] Converting multiply and accumulate with immediate intrinsics to ↵	Colin LeMahieu	2015-01-21	1	-0/+120
\| \| \| \| \| \|	patterns. llvm-svn: 226681
*	[X86] Declare SSE4.1/AVX2 vector extloads covered by PMOV[SZ]X legal.	Ahmed Bougacha	2015-01-21	3	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we can fully specify extload legality, we can declare them legal for the PMOVSX/PMOVZX instructions. This for instance enables a DAGCombine to fire on code such as (and (<zextload-equivalent> ...), <redundant mask>) to turn it into: (zextload ...) as seen in the testcase changes. There is one regression, in widen_load-2.ll: we're no longer able to do store-to-load forwarding with illegal extload memory types. This will be addressed separately. Differential Revision: http://reviews.llvm.org/D6533 llvm-svn: 226676
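Why the mask in (and (zextload ...), mask) can be redundant is easy to show with a toy model. The Python sketch below assumes an 8-bit zero-extending load and a 0xFF mask purely for illustration; `zextload` and `zextload_then_mask` are hypothetical helpers, not DAGCombine code.

```python
def zextload(memory, index, from_bits=8):
    """Load a narrow value and zero-extend it: the upper bits are always zero."""
    return memory[index] & ((1 << from_bits) - 1)


def zextload_then_mask(memory, index, mask=0xFF):
    """The unoptimized form: zero-extending load followed by an AND."""
    return zextload(memory, index) & mask


memory = bytes(range(256))
mask_is_redundant = all(
    zextload(memory, i) == zextload_then_mask(memory, i) for i in range(256)
)
```

Once the load itself is known to be a legal zero-extending load, the combiner can see that the AND clears only bits that are already zero and drop it.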
*	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))"	Tim Northover	2015-01-21	3	-57/+16
\| \| \| \| \| \| \| \|	It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665
*	AArch64: add backend option to reserve x18 (platform register)	Tim Northover	2015-01-21	1	-7/+8
\| \| \| \| \| \| \| \| \|	AAPCS64 says that it's up to the platform to specify whether x18 is reserved, and a first step on that way is to add a flag controlling it. From: Andrew Turner <andrew@fubar.geek.nz> llvm-svn: 226664
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	3	-16/+57
\| \| \| \|	llvm-svn: 226663
*	[x32] Fast ISel should use LEA64_32r instead of LEA32r to adjust addresses ↵	Michael Kuperstein	2015-01-21	1	-0/+10
\| \| \| \| \| \|	in x32 mode. llvm-svn: 226661
*	Use a smaller pragma unroll threshold to reduce test execution time.	Alexander Potapenko	2015-01-21	1	-2/+2
\| \| \| \| \| \| \|	When opt is compiled with AddressSanitizer it takes more than 30 seconds to unroll the loop in unroll_1M(). llvm-svn: 226660
*	[msan] Update origin for the entire destination range on memory store.	Evgeniy Stepanov	2015-01-21	1	-0/+89
\| \| \| \| \| \| \| \| \|	Previously we always stored 4 bytes of origin at the destination address even for 8-byte (and longer) stores. This should fix rare missing, or incorrect, origin stacks in MSan reports. llvm-svn: 226658
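The difference between the old and new behavior can be sketched by listing which origin slots a store touches. This Python model assumes one 4-byte origin slot per 4 aligned bytes of application memory; the function names and the dictionary used as origin storage are hypothetical, not MSan's runtime layout.

```python
ORIGIN_GRANULARITY = 4  # assumed: one origin slot per 4 aligned application bytes


def origin_slots(addr, size):
    """Aligned slot addresses covering the byte range [addr, addr + size)."""
    first = addr & ~(ORIGIN_GRANULARITY - 1)
    last = (addr + size - 1) & ~(ORIGIN_GRANULARITY - 1)
    return list(range(first, last + ORIGIN_GRANULARITY, ORIGIN_GRANULARITY))


def store_origin(origins, addr, size, origin_id, whole_range=True):
    """Record origin_id for a store; whole_range=False models the old behavior."""
    slots = origin_slots(addr, size)
    if not whole_range:
        slots = slots[:1]  # old behavior: only the slot at the destination address
    for slot in slots:
        origins[slot] = origin_id
    return origins
```

For an 8-byte store at 0x1000 the old behavior updates only the slot at 0x1000, leaving a stale origin at 0x1004, while the new behavior updates both.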
*	[mips][microMIPS] MicroMIPS 16-bit unconditional branch instruction B	Jozef Kolek	2015-01-21	7	-34/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement microMIPS 16-bit unconditional branch instruction B. Implemented 16-bit microMIPS unconditional instruction has real name B16, and B is an alias which expands to either B16 or BEQ according to the rules: b 256 --> b16 256 # R_MICROMIPS_PC10_S1 b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1 b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1 Differential Revision: http://reviews.llvm.org/D3514 llvm-svn: 226657
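The alias expansion rules listed above can be summarized in a few lines. This is a hedged Python sketch with a hypothetical `expand_b` helper; it assumes the PC10_S1 field is a signed 10-bit offset scaled by 2 (an even offset in the range -1024 to 1022), as the relocation name suggests, and it follows the log in always expanding labels to BEQ.

```python
B16_MIN, B16_MAX = -1024, 1022  # assumed range of a signed 10-bit offset scaled by 2


def expand_b(target):
    """Expand the microMIPS 'b' alias to b16 or beq, following the rules above."""
    fits_b16 = (
        isinstance(target, int)
        and target % 2 == 0
        and B16_MIN <= target <= B16_MAX
    )
    if fits_b16:
        return f"b16 {target}"
    # Out-of-range immediates and symbolic labels use the 32-bit BEQ form.
    return f"beq $zero, $zero, {target}"
```

This reproduces the three examples from the commit message: `b 256` becomes `b16 256`, while `b 12256` and `b label` both become BEQ against `$zero`.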
*	[mips][microMIPS] Implement ADDIUPC instruction	Jozef Kolek	2015-01-21	4	-0/+30
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D6582 llvm-svn: 226656
*	[Mips][Disassembler]When disassembler meets load/store from coprocessor 2 ↵	Vladimir Medic	2015-01-21	6	-8/+16
\| \| \| \| \| \|	instructions for mips r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method that properly handles decoding of these instructions. llvm-svn: 226652