llvm-svn: 335773
As noted in the D44909 review, the transform from (fptosi+sitofp) to ftrunc
can produce -0.0 where the original code does not:
#include <stdio.h>
int main(int argc, char **argv) {
  float x;
  x = -0.8 * argc;
  printf("%f\n", (float)((int)x));
  return 0;
}

$ clang -O0 -mavx fp.c ; ./a.out
0.000000
$ clang -O1 -mavx fp.c ; ./a.out
-0.000000
Ideally, we'd use IR/node flags to predicate the transform, but the IR parser
doesn't currently allow fast-math-flags on the cast instructions. So for now,
just use the function attribute that corresponds to clang's "-fno-signed-zeros"
option.
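A minimal IR sketch of the guarded shape, assuming the "no-signed-zeros-fp-math" function attribute is the one consulted (function name hypothetical):

define float @trunc_via_int(float %x) #0 {
  %i = fptosi float %x to i32
  %r = sitofp i32 %i to float    ; candidate to become a single ftrunc
  ret float %r
}
attributes #0 = { "no-signed-zeros-fp-math"="true" }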
Differential Revision: https://reviews.llvm.org/D48085
llvm-svn: 335761
pow2 expansion
For divisor = 1, perform a select of X; this reduces scalarisation of simple SDIVs.
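A hypothetical case this helps: a vector divide whose divisor mixes 1 with other powers of two, where the divisor-1 lane can become a plain select of X rather than forcing the whole divide to scalarise:

define <4 x i32> @div_mixed_pow2(<4 x i32> %x) {
  %r = sdiv <4 x i32> %x, <i32 1, i32 2, i32 4, i32 8>
  ret <4 x i32> %r
}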
llvm-svn: 335727
Use the builtin constant folding of getNode() etc. instead of doing it manually.
llvm-svn: 335720
Fixes PR37569.
llvm-svn: 335719
expansion (PR37569)
llvm-svn: 335717
llvm-svn: 335652
(PR37119)
Temporary fix until I've managed to get D45806 updated - both +1 and -1 special cases need to be properly supported.
llvm-svn: 335637
llvm-svn: 335617
llvm-svn: 335451
This patch has the same motivating example as D48466:
define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) {
  %c.0282 = and i32 %c.0282.in, 268435455
  %a16 = lshr i64 32508, %x
  %a17 = and i64 %a16, 1
  %tobool = icmp eq i64 %a17, 0
  %. = select i1 %tobool, i32 1, i32 2
  %.286 = select i1 %tobool, i32 27, i32 26
  %shr97 = lshr i32 %c.0282, %.
  %shl98 = shl i32 %c.0282.in, %.286
  %or99 = or i32 %shr97, %shl98
  %shr100 = lshr i32 %d.0280, %.
  %shl101 = shl i32 %d.0280, %.286
  %or102 = or i32 %shr100, %shl101
  store i32 %or99, i32* %ptr0
  store i32 %or102, i32* %ptr1
  ret void
}
...but I'm trying to kill the setcc bool math sooner rather than later.
By matching a larger pattern that includes both the low-bit mask and the trailing add/sub,
we can create a universally good fold because we always eliminate the condition code
intermediate value.
Here are Alive proofs for these (currently instcombine folds the 'add' variants, but
misses the 'sub' patterns):
https://rise4fun.com/Alive/Gsyp
Name: sub of zext cmp mask
%a = and i8 %x, 1
%c = icmp eq i8 %a, 0
%z = zext i1 %c to i32
%r = sub i32 C1, %z
=>
%optional_cast = zext i8 %a to i32
%r = add i32 %optional_cast, C1-1
Name: add of zext cmp mask
%a = and i32 %x, 1
%c = icmp eq i32 %a, 0
%z = zext i1 %c to i8
%r = add i8 %z, C1
=>
%optional_cast = trunc i32 %a to i8
%r = sub i8 C1+1, %optional_cast
All of the tests look like improvements or neutral to me. But it is possible that x86
test+set+bitop is better than what we now show here. I suspect we could do better by
adding another fold for the 'sub' variants.
We start with select-of-constant in IR in the larger motivating test, so that's why I
included tests with selects. Proofs for those variants:
https://rise4fun.com/Alive/Bx1
Name: true const is bigger
Pre: C2 == (C1 + 1)
%a = and i8 %x, 1
%c = icmp eq i8 %a, 0
%r = select i1 %c, i64 C2, i64 C1
=>
%z = zext i8 %a to i64
%r = sub i64 C2, %z
Name: false const is bigger
Pre: C2 == (C1 + 1)
%a = and i8 %x, 1
%c = icmp eq i8 %a, 0
%r = select i1 %c, i64 C1, i64 C2
=>
%z = zext i8 %a to i64
%r = add i64 C1, %z
Differential Revision: https://reviews.llvm.org/D48466
llvm-svn: 335433
Allowed folding for "and"/"or" binops with a non-constant operand if the arguments of the select are 0/-1 values.
Normally the "and" form of this pattern does not reach the DAG combiner, because InstCombine has already simplified it. However, AMDGPU produces it during lowering, where InstCombine has no chance to optimize it out. The same pattern with the "or" opcode can reach the DAG directly.
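A sketch of the folded shape (the fold itself runs on DAG nodes; IR is used here only for illustration):

define i32 @and_of_select(i1 %c, i32 %x) {
  %s = select i1 %c, i32 -1, i32 0
  %r = and i32 %s, %x            ; folds to: select i1 %c, i32 %x, i32 0
  ret i32 %r
}

The "or" form folds analogously, to select i1 %c, i32 -1, i32 %x.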
Differential Revision: https://reviews.llvm.org/D48301
llvm-svn: 335250
The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.
This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.
Differential Revision: https://reviews.llvm.org/D48029
llvm-svn: 335210
Previously this folding was done only if the select was the first operand. However, for non-commutative operations the constant may come before the select.
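For example (a sketch), with a non-commutative operation such as sub the constant can be the first operand and the select the second:

define i32 @sub_of_select(i1 %c) {
  %s = select i1 %c, i32 2, i32 1
  %r = sub i32 7, %s             ; can fold to: select i1 %c, i32 5, i32 6
  ret i32 %r
}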
Differential Revision: https://reviews.llvm.org/D48223
llvm-svn: 335167
Summary:
Check that "and" masks are strictly smaller than the implicit mask from the narrowed load.
Fixes PR37820.
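An illustration of the constraint (mask values hypothetical): an "and" mask may only be folded into a narrowed load when it fits under the narrow type's implicit mask:

define i32 @narrow_load_mask(i32* %p) {
  %v = load i32, i32* %p
  %m = and i32 %v, 255           ; 0xFF sits under an i16 zextload's implicit 0xFFFF mask
  ret i32 %m
}

A mask such as 0x1FFFF has bits above i16, so the load must not be narrowed to i16.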
Reviewers: samparker, RKSimon, nemanjai
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48335
llvm-svn: 335137
obvious what they are. NFC
llvm-svn: 335095
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.
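These FMF-guard patches key transforms off per-instruction fast-math flags rather than only the global unsafe option; a minimal sketch of flagged IR (which flags each fold requires is patch-specific):

define float @reassoc_fadd(float %a, float %b, float %c) {
  %t = fadd reassoc nsz float %a, %b   ; flags travel on the instructions,
  %r = fadd reassoc nsz float %t, %c   ; not on a global compilation switch
  ret float %r
}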
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar
Reviewed By: spatel
Subscribers: wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D47909
llvm-svn: 334996
Summary: Refactoring for all constant cases which require AllowNewConst, plus some staging for future FMF usage.
Reviewers: spatel, hfinkel, wristow
Reviewed By: spatel
Subscribers: nhaehnle
Differential Revision: https://reviews.llvm.org/D48289
llvm-svn: 334984
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai
Reviewed By: rampitec, nhaehnle
Subscribers: tpr, nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47918
llvm-svn: 334876
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D47954
llvm-svn: 334862
Test passes as is, but fails with a future patch that makes v4i16/v4f16 legal.
llvm-svn: 334823
Summary:
Here we relax the old constraint, which required the global unsafe flag together with the TargetOptions flag HonorSignDependentRoundingFPMathOption, on the understanding that unsafe is no longer needed, and may never have been required, for correctness on FDIV/FMUL.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar
Reviewed By: spatel
Subscribers: efriedma, wdng, tpr
Differential Revision: https://reviews.llvm.org/D48057
llvm-svn: 334769
Summary: An FMF constraint is added to FADD, with unsafe still available as the fallback.
Reviewers: spatel, wristow, arsenm, hfinkel
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D48180
llvm-svn: 334753
We're constant folding here, so we shouldn't check uses. This matches
the IR optimizer behavior.
The x86 test shows the expected win. The AArch64 test shows something
else. This only seems to happen if the "generic" AArch64 CPU model is
used by MachineCombiner, so I'll file a bug report to follow up.
llvm-svn: 334608
Differential Revision: https://reviews.llvm.org/D47831
llvm-svn: 334553
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fmul.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: nhaehnle, wdng
Differential Revision: https://reviews.llvm.org/D47911
llvm-svn: 334514
Implement default legalization of rotates: either in terms of the rotation
in the opposite direction (if legal), or in terms of shifts and ors.
Implement generating of rotate instructions for Hexagon. Hexagon only
supports rotates by an immediate value, so implement custom lowering of
ROTL/ROTR on Hexagon. If a rotate is not legal, use the default expansion.
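A sketch of the shifts-and-ors form for a 32-bit left rotate; the amount masking keeps both shifts in range, including the zero-amount case:

define i32 @rotl32(i32 %x, i32 %s) {
  %amt = and i32 %s, 31
  %sub = sub i32 32, %amt
  %namt = and i32 %sub, 31       ; 32 - 0 wraps back to 0, avoiding an out-of-range shift
  %hi = shl i32 %x, %amt
  %lo = lshr i32 %x, %namt
  %r = or i32 %hi, %lo
  ret i32 %r
}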
Differential Revision: https://reviews.llvm.org/D47725
llvm-svn: 334497
This would fail before because 1x vectors aren't legal,
so instead just use the scalar type.
Avoids regressions in a future AMDGPU commit to add
v4i16/v4f16 as legal types.
The test update covers the one in-tree test this currently triggers on; it wasn't checking anything before.
The result is completely changed since the selects
are eliminated. Not sure if it's considered better
or not.
llvm-svn: 334440
(PR37427)
This patch started off much more general and ambitious, but it's been a nightmare
seeing all the ways x86 vector codegen can go wrong.
So the code is still structured to allow extending easily, but it's currently
limited in several ways:
1. Only handle cases with an extending load.
2. Only handle cases with a zero constant compare.
3. Ignore setcc with vector bitmask (SetCCWidth != 1) - so AVX512 should be unaffected.
The motivating case from PR37427:
https://bugs.llvm.org/show_bug.cgi?id=37427
...is the 1st test, and that shows the expected win - we eliminated the unnecessary
intermediate cast.
There's a clear regression in the last test (sgt_zero_fp_select) because we no longer
recognize a 'SHRUNKBLEND' opportunity. I think that general problem is also present
in sgt_zero, so I'll try to fix that in a follow-up. We need to match a sign-bit
setcc from a sign-extended operand and remove it.
Differential Revision: https://reviews.llvm.org/D47330
llvm-svn: 334378
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't, so if it weren't for the forwarding this wouldn't work. These places were found by hiding the begin/end methods in the SmallSet forwarding.
llvm-svn: 334343
llvm-svn: 334312
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D47910
llvm-svn: 334306
While trying to propagate AND masks back to loads, we currently allow one non-load node to be included as a leaf in the chain. This fix limits that node to producing only a single data value.
Differential Revision: https://reviews.llvm.org/D47878
llvm-svn: 334268
Summary: This change uses FMF subflags, in addition to the global unsafe flag, to guard FMA optimizations. These changes originated from D46483 and have been simplified via getNode.
Reviewers: spatel, arsenm, hfinkel, javed.absar
Reviewed By: spatel
Subscribers: nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47388
llvm-svn: 334242
This avoids regressions in a future AMDGPU change
to make v4i16/v4f16 legal. For these types, build_vector
is implemented as bitcasted operations on v2i32. This
combine was creating v4i16s out of what would already have
been a v2i32 build_vector, creating a mess
of nodes that never get cleaned up.
I'm not sure this is the right condition to check.
I initially tried just checking for the legality of the
new build_vector. This works for my case, but breaks dozens
of x86 tests. A Mips test seems to show some improvement
or at least a neutral change. I don't want to think
about how long it would take to analyze the set of
different x86 vector operations impacted.
Test included in future commit.
llvm-svn: 334218
Summary:
This change uses FMF subflags, in addition to the global unsafe flag, to guard optimizations. These changes originated from D46483.
It contains only the fsqrt context.
Reviewers: spatel, hfinkel, arsenm
Reviewed By: spatel
Subscribers: hfinkel, wdng, andrew.w.kaylor, wristow, efriedma, nemanjai
Differential Revision: https://reviews.llvm.org/D47749
llvm-svn: 334113
llvm-svn: 333967
Summary: This includes variants for add, uaddo, and addcarry. usubo and subcarry require the carry to be flipped to preserve semantics, but we chose to do the transform anyway in that case, so as to push the transform down the carry chain.
Reviewers: efriedma, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46505
llvm-svn: 333943
Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in-tree backend.
Reviewers: efriedma, craig.topper, dblaikie, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47685
llvm-svn: 333939
llvm-svn: 333907
llvm-svn: 333766
llvm-svn: 333765
The candidate check precludes this check.
llvm-svn: 333764
Do not consider store sizes larger than the maximum legal store size.
llvm-svn: 333763
Additionally, implement handling of ADD/SUBCARRY on Hexagon, utilizing
the UADDO/USUBO expansion.
Differential Revision: https://reviews.llvm.org/D47559
llvm-svn: 333751
Summary:
As pointed out in D46528, we erroneously transform cases like `xor X, -1`,
even though we use said function.
It's because the `-1` there is actually a bitcast.
So I think we can just look through it in the function.
Differential Revision: https://reviews.llvm.org/D47156
llvm-svn: 332905
Summary:
This **appears** to be the last missing piece for the masked merge pattern handling in the backend.
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]].
[[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andps`+`andnps` / `bsl` would be generated (see `@out`).
After that canonicalization they would no longer be (see `@in`), and we need to make sure that they still are.
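For reference, the masked-merge shape in question looks like this (a sketch; `@out` and `@in` are the names used in the patch's tests):

define i32 @masked_merge(i32 %x, i32 %y, i32 %m) {
  %mx = and i32 %x, %m           ; bits of %x chosen by the mask
  %notm = xor i32 %m, -1
  %my = and i32 %y, %notm        ; bits of %y from the inverted mask
  %r = or i32 %mx, %my
  ret i32 %r
}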
Differential Revision: https://reviews.llvm.org/D46528
llvm-svn: 332904
their amount masking modified by SimplifyDemandedBits
SimplifyDemandedBits can remove bits from the masks for the shift amounts we need to see to detect rotates.
This patch uses zeroes from computeKnownBits to fill in some of these mask bits to make the match work.
As currently written this calls computeKnownBits even when the mask hasn't been simplified because it made the code simpler. If we're worried about compile time performance we can improve this.
I know we're talking about making a rotate intrinsic, but hopefully we can go ahead and do this change and just make sure the rotate intrinsic also handles it.
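A sketch of the situation (amount ranges chosen so the example stays well defined): once the amount is known to be in range, SimplifyDemandedBits can delete the "and ..., 31" mask the rotate matcher expects, and computeKnownBits lets the matcher recover the missing bits:

define i32 @rot_known_amount(i32 %x, i32 %t) {
  %lo4 = and i32 %t, 15
  %s = or i32 %lo4, 16           ; rotate amount known to be in [16, 31]
  %sub = sub i32 32, %s          ; in [1, 16]; the explicit 'and %sub, 31'
                                 ; was simplified away as redundant
  %hi = shl i32 %x, %s
  %lo = lshr i32 %x, %sub
  %r = or i32 %hi, %lo           ; still rotl(%x, %s)
  ret i32 %r
}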
Differential Revision: https://reviews.llvm.org/D47116
llvm-svn: 332895
As part of merging stores we check that fusing the nodes does not
cause a cycle due to one candidate store being indirectly dependent on
another store (this may happen via chained memory copies). This is
done by searching if a store is a predecessor to another store's
value.
Prune the search at the candidate search's root node which is a
predecessor to all candidate stores. This reduces the
size of the subgraph searched in large basic blocks.
Reviewers: jyknight
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D46955
llvm-svn: 332490
llvm-svn: 332489