bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 add/sub/mul	Simon Pilgrim	2017-05-15	3	-0/+1998
\| \| \| \| \| \|	llvm-svn: 303074
*	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 shifts	Simon Pilgrim	2017-05-15	3	-0/+2589
\| \| \| \|	llvm-svn: 303069
*	Fix two tests that weren't correctly copied.	Daniel Jasper	2017-05-14	1	-1/+0
\| \| \| \| \| \| \|	One didn't correctly define the regex variable, the other still had a RUN line for FNOBUILTIN-checks, which weren't copied to the file. llvm-svn: 303025
*	[InstSimplify] Add patterns for folding (A & B) \| (~A ^ B) -> (~A ^ B) and its commuted variants.	Craig Topper	2017-05-14	1	-32/+16
\| \| \| \| \| \| \| \|	We already had (A & ~B) \| (A ^ B), but we missed the cases where the not was part of the xor. llvm-svn: 303004
*	foo	Craig Topper	2017-05-14	1	-0/+122
\| \| \| \|	llvm-svn: 303003
*	Re-enable test that was disabled due to cost analysis	Xinliang David Li	2017-05-14	1	-1/+1
\| \| \| \|	llvm-svn: 303000
*	[LoopOptimizer][Fix]PR32859, PR24738	Simon Pilgrim	2017-05-13	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop vectorizer pass introduced an undef value while fixing the output of LCSSA form. Here it is: before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ] and after this change we have: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ] Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 llvm-svn: 302988
*	[InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation	Craig Topper	2017-05-13	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now. This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shouldn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this. This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that. Reviewers: davide, majnemer, spatel, chandlerc Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32453 llvm-svn: 302982
*	ConstProp: Split x86 SSE intrinsic tests out of calls.ll	Justin Bogner	2017-05-13	2	-206/+209
\| \| \| \| \| \| \|	This allows us to mark this as `REQUIRES: x86`, since it uses x86 target specific intrinsics. llvm-svn: 302980
*	InstCombine: Move tests that use target intrinsics into subdirectories	Justin Bogner	2017-05-13	38	-175/+190
\| \| \| \| \| \| \| \|	Tests with target intrinsics are inherently target specific, so it doesn't actually make sense to run them if we've excluded their target. llvm-svn: 302979
*	Disable llvm/test/Transforms/NewGVN/pr32934.ll while Davide is investigating.	NAKAMURA Takumi	2017-05-13	1	-1/+1
\| \| \| \|	llvm-svn: 302977
*	[NewGVN] XFAIL a flaky test until I find out what's going on.	Davide Italiano	2017-05-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	I bet the change is correct but this test seems to expose some underlying problem that manifests only on some buildbots, and I'm not able to reproduce locally. Unfortunately I can't debug right now but I don't want to annoy people with spurious failures, so I'll XFAIL until I can take a look (over the weekend). llvm-svn: 302976
*	[PartialInlining] Profile based cost analysis	Xinliang David Li	2017-05-12	9	-12/+160
\| \| \| \| \| \| \| \| \| \| \| \|	Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turned on in the pipeline (in follow up). Differential Revision: http://reviews.llvm.org/D32783 llvm-svn: 302967
*	[TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines	Andrew Kaylor	2017-05-12	1	-0/+187
\| \| \| \| \| \| \| \| \| \|	Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 llvm-svn: 302957
*	[ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math	Andrew Kaylor	2017-05-12	1	-0/+83
\| \| \| \| \| \| \| \| \| \|	Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31788 llvm-svn: 302956
*	[TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite' as functions	Andrew Kaylor	2017-05-12	2	-0/+252
\| \| \| \| \| \| \| \| \| \|	Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31787 llvm-svn: 302955
*	[LoopUnroll] Fix a test. REQUIRE should be REQUIRES.	Davide Italiano	2017-05-12	1	-1/+1
\| \| \| \| \| \|	Found by inspection. llvm-svn: 302909
*	[NewGVN] Don't incorrectly reset the memory leader.	Davide Italiano	2017-05-12	1	-0/+68
\| \| \| \| \| \| \| \| \| \|	This code was missing a check for stores, so we were thinking the congruency class didn't have any memory members, and reset the memory leader. Differential Revision: https://reviews.llvm.org/D33056 llvm-svn: 302905
*	[PM/Unswitch] Teach the new simple loop unswitch to handle loop invariant PHI inputs and to rewrite PHI nodes during the actual unswitching.	Chandler Carruth	2017-05-12	1	-0/+199
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867
*	Restrict call metadata based hotness detection to Sample PGO mode	Teresa Johnson	2017-05-11	3	-20/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844
*	Decrease inlinecold-threshold to 45	Easwaran Raman	2017-05-11	1	-18/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I ran the test-suite (including SPEC 2006) in PGO mode comparing cold thresholds of 225 and 45. Here are some stats on the text size: Out of 904 tests that ran, 197 see a change in text size. The average text size reduction (of all the 904 binaries) is 1.07%. Of the 197 binaries, 19 see a text size increase, as high as 18%, but most of them are small single source benchmarks. There are 3 multisource benchmarks with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On the other side of the spectrum, 31 benchmarks see >10% size reduction and 6 of them are MultiSource. I haven't run the test-suite with other values of inlinecold-threshold. Since we have a cold callsite threshold of 45, I picked this value. Differential revision: https://reviews.llvm.org/D33106 llvm-svn: 302829
*	[SLP] Emit optimization remarks	Adam Nemet	2017-05-11	3	-4/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense of how far the tree is spanning I've included the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811
*	[InstCombine] remove fold that swaps xor/or with constants; NFCI	Sanjay Patel	2017-05-10	1	-0/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	// (X ^ C1) \| C2 --> (X \| C2) ^ (C1&~C2) This canonicalization was added at: https://reviews.llvm.org/rL7264 By moving xors out/down, we can more easily combine constants. I'm adding tests that do not change with this patch, so we can verify that those kinds of transforms are still happening. This is no-functional-change-intended because there's a later fold: // (X^C)\|Y -> (X\|Y)^C iff Y&C == 0 ...and demanded-bits appears to guarantee that any fold that would have hit the fold we're removing here would be caught by that 2nd fold. Similar reasoning was used in: https://reviews.llvm.org/rL299384 The larger motivation for removing this code is that it could interfere with the fix for PR32706: https://bugs.llvm.org/show_bug.cgi?id=32706 Ie, we're not checking if the 'xor' is actually a 'not', so we could reverse a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'. Differential Revision: https://reviews.llvm.org/D33050 llvm-svn: 302733
*	[InstSimplify, InstCombine] move 'or' simplification tests; NFC	Sanjay Patel	2017-05-10	3	-181/+181
\| \| \| \| \| \| \|	Surprisingly, I don't think these are redundant for InstSimplify. They were just misplaced as InstCombine tests. llvm-svn: 302684
*	[AArch64] Enable use of reduction intrinsics.	Amara Emerson	2017-05-10	2	-55/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new experimental reduction intrinsics can now be used, so I'm enabling this for AArch64. We will need this for SVE anyway, so it makes sense to do this for NEON reductions as well. The existing code to match shufflevector patterns is replaced with a direct lowering of the reductions to AArch64-specific nodes. Tests updated with the new, simpler, representation. Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 302678
*	[InstCombine] remove redundant tests	Sanjay Patel	2017-05-10	1	-34/+0
\| \| \| \| \| \| \| \| \| \| \|	The first test in this file is duplicated exactly in and.ll -> test33. We have commuted and vector variants there too. The second test is a composite of 2 folds. The first fold is tested independently in add.ll -> flip_and_mask (including vector variant). After that transform fires, the IR is identical to the first transform. llvm-svn: 302676
*	[InstCombine] fix auto-generated FileCheck-captured variable refs	Sanjay Patel	2017-05-10	3	-6/+6
\| \| \| \| \| \| \|	The script at utils/update_test_checks.py has (had?) a bug when variables start with the same sequence of letters (clearly, not all of the time). llvm-svn: 302674
*	[InstCombine] fix typo in test comment; NFC	Sanjay Patel	2017-05-10	1	-1/+1
\| \| \| \|	llvm-svn: 302669
*	[InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1	Sanjay Patel	2017-05-10	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is another step towards favoring 'not' ops over random 'xor' in IR: https://bugs.llvm.org/show_bug.cgi?id=32706 This transformation may have occurred in longer IR sequences using computeKnownBits, but that could be much more expensive to calculate. As the scalar result shows, we do not currently favor 'not' in all cases. The 'not' created by the transform is transformed again (unnecessarily). Vectors don't have this problem because vectors are (wrongly) excluded from several other combines. llvm-svn: 302659
*	Revert r301950: SpeculativeExecution: Stop using whitelist for costs	Chandler Carruth	2017-05-10	2	-105/+0
\| \| \| \| \| \| \| \| \| \|	This pass doesn't correctly handle testing for when it is legal to hoist arbitrary instructions. The whitelist happens to make it safe, so before it is removed the pass's legality checks will need to be enhanced. Details have been added to the code review thread for the patch. llvm-svn: 302640
*	[InstCombine] add tests for andn; NFC	Sanjay Patel	2017-05-09	1	-0/+28
\| \| \| \|	llvm-svn: 302599
*	[GVN] Fix a crash on encountering non-integral pointers	Keno Fischer	2017-05-09	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587
*	[InstCombine] update test file to use FileCheck; NFC	Sanjay Patel	2017-05-09	1	-14/+22
\| \| \| \|	llvm-svn: 302585
*	[NewGVN] Fix a consistent order for phi nodes operands.	Davide Italiano	2017-05-09	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way we currently define congruency for two PHIExpression(s) is: 1) The operands to the phi functions are congruent 2) The PHIs are defined in the same BasicBlock. NewGVN works under the assumption that phi operands are in predecessor order, or at least in some consistent order. OTOH, this is valid IR: patatino: %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ] %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ] br label %end and the in-memory representations of the two SSA registers have an inconsistent order. This violation of NewGVN assumptions results in two PHIs being found congruent when they're not. While we think it's useful to always have a consistent order enforced, let's fix this in NewGVN by sorting uses in predecessor order before creating a PHI expression. Differential Revision: https://reviews.llvm.org/D32990 llvm-svn: 302552
*	[InstCombineCasts] Fix checks in sext->lshr->trunc pattern.	Sanjay Patel	2017-05-09	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D32285 llvm-svn: 302548
*	[LV] Fix insertion point for shuffle vectors in first order recurrence	Anna Thomas	2017-05-09	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 llvm-svn: 302532
*	Fix code section prefix for proper layout	Teresa Johnson	2017-05-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: r284533 added hot and cold section prefixes based on profile information, to enable grouping of hot/cold functions at link time. However, it used "cold" as the prefix for cold sections, but gold only recognizes "unlikely" (which is used by gcc for cold sections). Therefore, cold sections were not properly being grouped. Switch to using "unlikely" Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32983 llvm-svn: 302502
*	Add basic test case for -instnamer	Sanjoy Das	2017-05-08	1	-0/+19
\| \| \| \|	llvm-svn: 302482
*	[InstCombine] add tests from D32285 to show current problems; NFC	Sanjay Patel	2017-05-08	1	-0/+38
\| \| \| \|	llvm-svn: 302475
*	[InstCombine] add folds for not-of-shift-right	Sanjay Patel	2017-05-08	1	-6/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is another step towards getting rid of dyn_castNotVal, so we can recommit: https://reviews.llvm.org/rL300977 As the tests show, we were missing the lshr case for constants and both ashr/lshr vector splat folds. The ashr case with constant was being performed inefficiently in 2 steps. It's also possible there was a latent bug in that case because we can't do that fold if the constant is positive: http://rise4fun.com/Alive/Bge llvm-svn: 302465
*	[InstCombine] move/add tests for not(shr (not X), Y); NFC	Sanjay Patel	2017-05-08	2	-17/+60
\| \| \| \|	llvm-svn: 302451
*	ConstantFold: Handle gep nonnull, undef as well	Daniel Berlin	2017-05-08	1	-1/+1
\| \| \| \|	llvm-svn: 302447
*	ConstantFold: Fold getelementptr (i32, i32* null, i64 undef) to null.	Daniel Berlin	2017-05-08	2	-5/+4
\| \| \| \| \| \| \| \|	Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses. Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was trying to test. llvm-svn: 302446
*	[ValueTracking] Use KnownOnes to provide a better bound on known zeros for ctlz/cttz intrinsics	Craig Topper	2017-05-08	1	-4/+25
\| \| \| \| \| \| \| \| \| \|	This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound. Differential Revision: https://reviews.llvm.org/D32521 llvm-svn: 302444
*	[InstCombine] add another test for PR32949; NFC	Sanjay Patel	2017-05-08	1	-0/+13
\| \| \| \| \| \| \|	A patch for the InstSimplify variant of this bug is up for review here: https://reviews.llvm.org/D32954 llvm-svn: 302434
*	[InstSimplify] add tests for PR32949 miscompile; NFC	Sanjay Patel	2017-05-07	1	-2/+27
\| \| \| \|	llvm-svn: 302374
*	InstructionSimplify: Relanding r301766	Zvi Rackover	2017-05-07	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Re-applying r301766 with a fix to a typo and a regression test. The log message for r301766 was: ================================================================================== InstructionSimplify: Canonicalize shuffle operands. NFC-ish. Summary: Apply canonicalization rules: 1. Input vectors with no elements selected from can be replaced with undef. 2. If only one input vector is constant it shall be the second one. This allows constant-folding to cover more ad-hoc simplifications that were in place and avoid duplication for RHS and LHS checks. There are more rules we may want to add in the future when we see a justification. e.g. mask elements that select undef elements can be replaced with undef. ================================================================================== Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32863 llvm-svn: 302373
*	[InstSimplify] use ConstantRange to simplify or-of-icmps	Sanjay Patel	2017-05-07	4	-448/+127
\| \| \| \| \| \| \| \| \| \| \| \| \|	We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the deleted code in instcombine was completely ignoring predicates with mismatched signedness. This is a follow-up to: https://reviews.llvm.org/rL301260 https://reviews.llvm.org/D32143 llvm-svn: 302370
*	[InstSimplify] fix copy-paste mistake in test comments; NFC	Sanjay Patel	2017-05-05	1	-200/+200
\| \| \| \|	llvm-svn: 302251
*	[InstSimplify] add tests for (icmp X, C1 \| icmp X, C2); NFC	Sanjay Patel	2017-05-05	1	-0/+3020
\| \| \| \| \| \|	These are the 'or' counterparts for the tests added with r300493. llvm-svn: 302248