bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Refactor backend diagnostics for unsupported features	Oliver Stannard	2016-01-27	9	-9/+9
	The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. The implementation of DiagnosticInfoUnsupported::print must be in lib/CodeGen rather than in the existing file in lib/IR/ to avoid introducing a dependency from IR to CodeGen. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 258951
*	[X86][SSE] Test insertps intrinsic calls with masks that can't combine to something simpler	Simon Pilgrim	2016-01-27	2	-6/+6
	For these basic tests of the intrinsic, make sure the mask can't simplify to movss, blend-with-zero or something else. llvm-svn: 258941
*	[DebugInfo] Support zero-length CIE in the _eh_frame parser	Igor Laevsky	2016-01-27	2	-0/+10
	MCJIT emits a zero-length CIE at the end of the _eh_frame section. This change ensures that the parser inside DebugInfo will not crash and will correctly record such cases. We are now recording DW_EH_PE_omit as a default value for FDE and LSDA encodings. Also, the Offset != EndAugmentationOffset assertion check will only happen if the augmentation string had a 'z' letter in it. Differential Revision: http://reviews.llvm.org/D16588 llvm-svn: 258931
*	Reapply commit r258404 with fix	Matthew Simpson	2016-01-27	1	-13/+18
	This patch is the second attempt to reapply commit r258404. There was a bug in the initial patch and subsequent fix (mentioned below). The initial patch caused an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239 and PR26307. llvm-svn: 258929
*	Revert "Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed."	Benjamin Kramer	2016-01-27	5	-69/+22
	and "Add a missing test case for r258847." This reverts commit r258847, r258848. Causes miscompilations and backend errors. llvm-svn: 258927
*	Add missing build attribute regression tests for Cortex-A8	Sjoerd Meijer	2016-01-27	1	-8/+69
	Differential Revision: http://reviews.llvm.org/D16576 llvm-svn: 258923
*	AMDGPU/SI: Stoney has only 16 LDS banks	Marek Olsak	2016-01-27	1	-0/+1
	Summary: This is a candidate for stable, along with all patches that add the "stoney" processor. Reviewers: tstellarAMD Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16485 llvm-svn: 258922
*	AVX512: Fix vpmovzxbw predicate for AVX1/2 instructions.	Igor Breger	2016-01-27	1	-0/+7
	Differential Revision: http://reviews.llvm.org/D16595 llvm-svn: 258915
*	AVX512: Add store mask patterns.	Igor Breger	2016-01-27	1	-0/+180
	Differential Revision: http://reviews.llvm.org/D16596 llvm-svn: 258914
*	[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader	Chen Li	2016-01-27	1	-0/+63
	Summary: This is a revised version of D13974, and the following quoted summary is from D13974: "This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch." D13974 was committed but failed one lnt test. The bug was that we only checked that the condition from the loop exit's incoming block was loop invariant. But there could be another condition, from the loop header to that incoming block, that is not loop invariant. This would produce miscompiled code. This patch fixes the issue by checking if the incoming block is the loop header, and if not, not performing the rewrite. This could be further improved by recursively checking all conditions leading to the loop exit block, but I'd like to check in this simple version first and improve it with future patches. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16570 llvm-svn: 258912
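The reasoning in the IndVarSimplify commit above can be illustrated with a small simulation. This is only a sketch of the idea, not LLVM code: `run_loop`, its parameters and the step of 3 are hypothetical. It models a loop whose header tests a loop-invariant condition before the induction variable is updated, so the early exit can only be taken on the first iteration, with the variable still at its preheader value.

```python
def run_loop(start, n, exit_early):
    """Simulate a loop with an invariant exit test in the header.

    Returns (value of i on the early-exit path or None, iterations executed).
    """
    i = start                      # value coming from the loop preheader
    iterations = 0
    while True:
        if exit_early:             # loop invariant: never changes in the loop
            return i, iterations   # i has not been updated yet
        if iterations == n:        # ordinary, variant exit
            return None, iterations
        i += 3                     # induction variable update
        iterations += 1

# On the early-exit path the induction variable always equals `start`,
# so the exit value can be rewritten to the preheader value.
early_value, early_iterations = run_loop(start=7, n=10, exit_early=True)
```

If the invariant test were reached only after some other, variant condition (the case that broke D13974), the exit could be taken on a later iteration and this substitution would no longer be safe.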
*	Revert "Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)""	David Majnemer	2016-01-27	2	-43/+23
	This reverts commit r258903 which reverted r255660. r258903 was an accidental commit and should not have been committed. llvm-svn: 258905
*	[SimplifyCFG] Don't mistake icmp of and for a tree of comparisons	David Majnemer	2016-01-27	1	-0/+25
	SimplifyCFG tries to turn complex branch conditions into a switch. Some of its logic attempts to reason about bitwise arithmetic produced by InstCombine. InstCombine can turn things like (X == 2) || (X == 3) into (X & ~1) == 2, and so SimplifyCFG tries to detect when this occurs so that it can produce a switch instruction. However, the legality checking was not sufficient to determine whether or not this had occurred. Correctly check this case by requiring that the right-hand side of the comparison be a power of two. This fixes PR26323. llvm-svn: 258904
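The bit trick behind the SimplifyCFG commit above can be checked by brute force. The sketch below is not LLVM code; `values_matching` and `is_power_of_two` are hypothetical helpers. It assumes the property that matters is that the mask clears exactly one bit. It shows that (X & ~1) == 2 holds exactly for X in {2, 3}, and that clearing more than one bit matches more than two values, so a two-case expansion would be wrong.

```python
def values_matching(mask, rhs, bits=8):
    """Return every `bits`-wide X for which (X & ~mask) == rhs."""
    full = (1 << bits) - 1
    return [x for x in range(1 << bits) if (x & ~mask & full) == rhs]

def is_power_of_two(v):
    """True when exactly one bit of v is set."""
    return v > 0 and (v & (v - 1)) == 0

# The mask clears a single bit: exactly two values match (2 and 3).
single_bit = values_matching(mask=0b1, rhs=2)

# The mask clears two bits: four values match, so two cases are not enough.
two_bits = values_matching(mask=0b11, rhs=4)
```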
*	Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)"	David Majnemer	2016-01-27	2	-23/+43
	This reverts commit r255660. llvm-svn: 258903
*	AMDGPU: Fix default device handling	Matt Arsenault	2016-01-27	1	-0/+11
	When no device name is specified, default to kaveri for HSA since SI is not supported and it would fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. llvm-svn: 258901
*	[WebAssembly] Add a test for the mem-intrinsic code in WebAssemblyPeephole.cpp	Dan Gohman	2016-01-27	1	-0/+31
	llvm-svn: 258895
*	Fix identify_magic() to check that a file that starts with MH_MAGIC is	Kevin Enderby	2016-01-26	1	-1/+2
	at least as big as the mach header to be identified as a Mach-O file and make sure smaller files are not identified as Mach-O files but as unknown files. Also fix identify_magic() so it looks at all 4 bytes of the filetype field when determining the type of the Mach-O file. Then fix the macho-invalid-header test case to check that it is an unknown file and make sure it does not get the error for object_error::parse_failed. And also update the unit tests. llvm-svn: 258883
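The two checks described in the identify_magic() commit above can be sketched as follows. This is a hypothetical stand-alone classifier, not LLVM's implementation: the function name `classify`, the return strings, and the little-endian 32-bit-only handling are all assumptions. The header layout it relies on (seven 32-bit fields, with the file type as the fourth) is the standard 32-bit `mach_header`.

```python
import struct

MH_MAGIC = 0xFEEDFACE        # 32-bit Mach-O magic
MACH_HEADER_SIZE = 28        # seven 32-bit fields in struct mach_header
FILETYPES = {1: "object", 2: "executable"}  # MH_OBJECT, MH_EXECUTE (partial)

def classify(data):
    """Classify a byte string; hypothetical sketch, little-endian only."""
    if len(data) < 4:
        return "unknown"
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != MH_MAGIC:
        return "unknown"
    # A file that starts with the magic but is shorter than the header
    # is not treated as Mach-O.
    if len(data) < MACH_HEADER_SIZE:
        return "unknown"
    # Read all four bytes of the filetype field (offset 12), not just one.
    (filetype,) = struct.unpack_from("<I", data, 12)
    return FILETYPES.get(filetype, "unknown")

full_header = struct.pack("<7I", MH_MAGIC, 7, 3, 2, 0, 0, 0)
```

Here `classify(full_header)` reports an executable, while the same bytes truncated below 28 bytes, or a header whose file type has stray high bytes set, fall back to "unknown".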
*	[llvm-tblgen] Stop emitting the intrinsic name matching code	Reid Kleckner	2016-01-26	1	-36/+0
	The AMDGPU backend was the last user of the old StringMatcher recognition code. Move it over to the new lookupLLVMIntrinsicName function, which is now improved to handle all of the interesting edge cases exposed by AMDGPU intrinsic names. llvm-svn: 258875
*	[WebAssembly] Omit no-op adds for non-mem uses of FrameIndex	Derek Schuff	2016-01-26	2	-0/+23
	Differential Revision: http://reviews.llvm.org/D16554 llvm-svn: 258872
*	[X86][SSE] Added 8i8 to 8i64 sext/zext tests	Simon Pilgrim	2016-01-26	2	-0/+153
	llvm-svn: 258868
*	[X86] Add support for zeroed shuffle elements to getShuffleScalarElt	Simon Pilgrim	2016-01-26	1	-0/+20
	Enable handling of SM_SentinelZero shuffle elements in getShuffleScalarElt. Improves VZEXT_LOAD matches in EltsFromConsecutiveLoads. llvm-svn: 258865
*	Remove autoconf support	Chris Bieneman	2016-01-26	3	-257/+0
	Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861
*	WebAssembly: don't optimize memcpy/memmove/memset to frame index	JF Bastien	2016-01-26	1	-0/+11
	r258781 optimized memcpy/memmove/memset so the intrinsic call can return its first argument, but missed the frame index case. Teach it to ignore that case so C code doesn't assert out in these cases. llvm-svn: 258851
*	Add a missing test case for r258847.	Cong Hou	2016-01-26	1	-0/+21
	llvm-svn: 258848
*	Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.	Cong Hou	2016-01-26	4	-22/+48
	Currently, AnalyzeBranch() fails on non-equality comparisons between floating-point values on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing the unconditional jump if there is a proper fall-through. However, in the case of a non-equality comparison between floating-point values, this can turn the branch "unanalyzable". Consider the following case: jne .BB1; jp .BB1; jmp .BB2; .BB1: ...; .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne .BB1; jnp .BB2; .BB1: ...; .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in the block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there are no defined negation conditions for them. In order to reverse them, this patch defines two new CondCodes, X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 258847
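The new condition codes named in the commit above are the De Morgan negations of the existing compound ones. The sketch below checks that over every combination of the zero flag (ZF) and parity flag (PF). It is a truth-table illustration only: the Python function names are hypothetical stand-ins for the X86 condition codes, using the usual flag meanings (E means ZF set, P means PF set).

```python
from itertools import product

def cond_ne_or_p(zf, pf):
    """Taken when jne or jp would be taken."""
    return (not zf) or pf

def cond_e_and_np(zf, pf):
    """Proposed negation of COND_NE_OR_P."""
    return zf and (not pf)

def cond_np_or_e(zf, pf):
    """Taken when jnp or je would be taken."""
    return (not pf) or zf

def cond_p_and_ne(zf, pf):
    """Proposed negation of COND_NP_OR_E."""
    return pf and (not zf)

# Each new code is true exactly where the old one is false.
flag_states = list(product([False, True], repeat=2))
negations_hold = all(
    cond_e_and_np(zf, pf) == (not cond_ne_or_p(zf, pf))
    and cond_p_and_ne(zf, pf) == (not cond_np_or_e(zf, pf))
    for zf, pf in flag_states
)
```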
*	[llvm-readobj] Add -elf-section-groups option	Hemant Kulkarni	2016-01-26	2	-0/+39
	Adds a way to inspect SHT_GROUP sections in ELF objects. Displays the signature and member sections of these sections. Differential revision: http://reviews.llvm.org/D16555 llvm-svn: 258845
*	Reassociate: Reprocess RedoInsts after each inst	Aditya Nandakumar	2016-01-26	3	-5/+62
	Previously RedoInsts was processed at the end of the block. However, it was possible that it left behind some instructions that were not canonicalized. This should guarantee that any previous instruction in the basic block is canonicalized before we process a new instruction. llvm-svn: 258830
*	[x86, AVX] tighten checks	Sanjay Patel	2016-01-26	1	-26/+59
	llvm-svn: 258828
*	Update the comments for the macho-invalid-zero-ncmds test and fix	Kevin Enderby	2016-01-26	1	-2/+6
	llvm-objdump when printing the Mach Header to print the unknown cputype and cpusubtype fields as decimal instead of not printing them at all. And change the test to check for that. llvm-svn: 258826
*	[LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x)	Sanjay Patel	2016-01-26	1	-0/+11
	This is a step towards solving PR25892: https://llvm.org/bugs/show_bug.cgi?id=25892 It won't handle the reported case. As noted by the 'TODO' comments in the patch, we need to relax the hasOneUse() constraint and also match patterns that include memset_chk() and the llvm.memset() intrinsic in addition to memset(). Differential Revision: http://reviews.llvm.org/D16337 llvm-svn: 258816
*	Revert "Reapply commit r258404 with fix"	Matthew Simpson	2016-01-26	1	-18/+13
	This commit exposes a crash in computeKnownBits on the Chromium buildbots. Reverting to investigate. Reference: https://llvm.org/bugs/show_bug.cgi?id=26307 llvm-svn: 258812
*	Re-submit r256008 "Improve DWARFDebugFrame::parse to also handle __eh_frame."	Igor Laevsky	2016-01-26	1	-22/+22
	Originally this change was causing failures on Windows buildbots, but those problems were fixed in r258806. llvm-svn: 258811
*	[X86][SSE] Add zero element and general 64-bit VZEXT_LOAD support to EltsFromConsecutiveLoads	Simon Pilgrim	2016-01-26	1	-45/+7
	This patch adds support for trailing zero elements to VZEXT_LOAD loads (and checks that no zero elts occur within the consecutive load). It also generalizes the 64-bit VZEXT_LOAD load matching to work for loads other than 2x32-bit loads. After this patch it will also be easier to add support for other basic load patterns like 32-bit VZEXT_LOAD loads, PMOVZX and subvector load insertion. Differential Revision: http://reviews.llvm.org/D16217 llvm-svn: 258798
*	AMDGPU: Make v32i8/v64i8 illegal types	Matt Arsenault	2016-01-26	3	-76/+187
	Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken. llvm-svn: 258788
*	AMDGPU: Remove old sample intrinsics	Matt Arsenault	2016-01-26	11	-662/+138
	I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern, but I don't think it really matters. llvm-svn: 258787
*	AMDGPU: Add new amdgcn intrinsics for cube instructions	Matt Arsenault	2016-01-26	6	-2/+107
	More cleanup to try to get all intrinsics using the correct amdgcn prefix that are as close to the instruction as possible. llvm-svn: 258786
*	AMDGPU: Implement read_register and write_register intrinsics	Matt Arsenault	2016-01-26	6	-0/+224
	Some of the special intrinsics that now correspond to an instruction also have special setting of some registers, e.g. llvm.SI.sendmsg sets m0 as well as using s_sendmsg. Using these explicit register intrinsics may be a better option. Reading the exec mask and others may be useful for debugging. For this I'm not sure this is entirely correct because we would want this to be convergent, although it's possible this is already treated sufficiently conservatively. llvm-svn: 258785
*	AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now	Matt Arsenault	2016-01-26	2	-0/+56
	Also move into backend intrinsics to discourage use of the old name. llvm-svn: 258783
*	[WebAssembly] Optimize memcpy/memmove/memset calls.	Dan Gohman	2016-01-26	2	-2/+62
	These calls return their first argument, but because LLVM uses an intrinsic with a void return type, they can't use the returned attribute. Generalize the store results pass to optimize these calls too. llvm-svn: 258781
*	[WebAssembly] Implement unaligned loads and stores.	Dan Gohman	2016-01-26	4	-10/+556
	Differential Revision: http://reviews.llvm.org/D16534 llvm-svn: 258779
*	[LIR] Add support for structs and hand unrolled loops	Haicheng Wu	2016-01-26	3	-0/+487
	This is a recommit of r258620, which caused PR26293. The original message: Now LIR can turn the following code into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258777
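Why the first loop in the commit above is equivalent to a single memset can be shown with a small experiment. The sketch uses Python's `ctypes` to model the two-int struct, so it is an illustration of the memory layout argument rather than anything LIR does; `Foo`, `make_array`, `zero_fieldwise` and `zero_memset` are hypothetical names. The equivalence relies on the struct having no padding and on every field being stored in each iteration.

```python
import ctypes

class Foo(ctypes.Structure):
    """Mirror of: typedef struct foo { int a; int b; } foo_t;"""
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int)]

def make_array(n):
    """Build an array of n Foo values with nonzero contents."""
    arr = (Foo * n)()
    for i in range(n):
        arr[i].a = i + 1
        arr[i].b = -(i + 1)
    return arr

def zero_fieldwise(arr):
    """The loop as written: store 0 to each field of each element."""
    for i in range(len(arr)):
        arr[i].a = 0
        arr[i].b = 0

def zero_memset(arr):
    """The recognized idiom: one memset over the whole array."""
    ctypes.memset(ctypes.addressof(arr), 0, ctypes.sizeof(arr))

by_fields = make_array(4)
by_memset = make_array(4)
zero_fieldwise(by_fields)
zero_memset(by_memset)
same_bytes = bytes(by_fields) == bytes(by_memset)
```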
*	Followup to 258750; update more tests to use .p2align .	Dan Gohman	2016-01-26	4	-5/+5
	llvm-svn: 258755
*	Followup to 258750; update all MC tests to use .p2align .	Dan Gohman	2016-01-26	10	-75/+75
	llvm-svn: 258754
*	Followup to 258750; update this test to use .p2align .	Dan Gohman	2016-01-26	1	-2/+2
	llvm-svn: 258752
*	[MC] Use .p2align instead of .align	Dan Gohman	2016-01-26	72	-233/+233
	For historic reasons, the behavior of .align differs between targets. Fortunately, there are alternatives, .p2align and .balign, which make the interpretation of the parameter explicit, and which behave consistently across targets. This patch teaches MC to use .p2align instead of .align, so that people reading code for multiple architectures don't have to remember which way each platform does its .align directive. Differential Revision: http://reviews.llvm.org/D16549 llvm-svn: 258750
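The difference between the two explicit directives named above comes down to how the operand is read: .balign takes a byte count, while .p2align takes a power-of-two exponent. A minimal sketch of the padding each one implies at a given offset follows. The helper names are hypothetical and it ignores the optional fill and maximum-skip operands.

```python
def padding_balign(offset, nbytes):
    """Padding bytes needed so `offset` becomes a multiple of `nbytes`."""
    return (-offset) % nbytes

def padding_p2align(offset, exponent):
    """Padding bytes needed to align `offset` to 2**exponent bytes."""
    return padding_balign(offset, 1 << exponent)

# At offset 5, aligning to 16 bytes needs 11 bytes of padding either way.
pad_p2 = padding_p2align(5, 4)    # ".p2align 4" means 2**4 = 16 bytes
pad_b16 = padding_balign(5, 16)   # ".balign 16" means 16 bytes
pad_b4 = padding_balign(5, 4)     # ".balign 4" means 4 bytes
```

A bare ".align 4" could correspond to either `pad_p2` or `pad_b4` depending on the target, which is the ambiguity the commit removes.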
*	[cfi] Cross-DSO CFI diagnostic mode (LLVM part).	Evgeniy Stepanov	2016-01-25	1	-10/+15
	__cfi_check gets a 3rd argument: ubsan handler data. Instead of trapping on failure, call __cfi_check_fail, which must be present in the module (generated in the frontend). llvm-svn: 258746
*	X86ISelLowering: Fix cmov(cmov) special lowering bug	Matthias Braun	2016-01-25	1	-0/+49
	There's a special case in EmitLoweredSelect() that produces an improved lowering for cmov(cmov) patterns. However, this special lowering is currently broken if the inner cmov has multiple users, so this patch stops using it in this case. If you wonder why this wasn't fixed by continuing to use the special lowering and inserting a 2nd PHI for the inner cmov: I believe this would incur additional copies/register pressure, so the special lowering does not improve upon the normal one anymore in this case. This fixes http://llvm.org/PR26256 (= rdar://24329747) llvm-svn: 258729
*	[ThinLTO] Find all needed metadata when linking metadata as postpass	Teresa Johnson	2016-01-25	1	-3/+13
	For metadata postpass linking, after importing all functions, we need to recursively walk through any nodes reached via imported functions to locate needed subprogram metadata. Some might only be reached indirectly via the variable list for an inlined function. llvm-svn: 258728
*	[X86][AVX] Add commutation support for VPERM2X128 instructions	Simon Pilgrim	2016-01-25	1	-0/+171
	Its main use is to allow memory folding of the 1st operand. Differential Revision: http://reviews.llvm.org/D16521 llvm-svn: 258726
*	[ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntity	Teresa Johnson	2016-01-25	1	-2/+24
	Extend fix for PR26037 to identify DISubprogram reached from a DIImportedEntity via a DILexicalBlock. llvm-svn: 258722
*	Enable loopreroll to reroll loop with pointer induction variable.	Lawrence Hu	2016-01-25	1	-0/+81
	Example: while (buf != end) { S += buf[0]; S += buf[1]; buf += 2; } Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258709
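What rerolling does to the example loop above can be sketched in a few lines. This is an illustration of the source-level effect only, not the pass itself: the function names are hypothetical, an integer cursor stands in for the `buf` pointer, and the input is assumed to have an even number of elements, as the hand-unrolled loop requires.

```python
def sum_unrolled(buf):
    """Hand-unrolled form: two elements per iteration, cursor steps by 2."""
    s, pos, end = 0, 0, len(buf)
    while pos != end:
        s += buf[pos]
        s += buf[pos + 1]
        pos += 2
    return s

def sum_rerolled(buf):
    """Rerolled form: one element per iteration, cursor steps by 1."""
    s, pos, end = 0, 0, len(buf)
    while pos != end:
        s += buf[pos]
        pos += 1
    return s

data = [1, 2, 3, 4]
unrolled_total = sum_unrolled(data)
rerolled_total = sum_rerolled(data)
```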