bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[Analysis] Don't assume that unsigned overflow can't happen in EmitGEPOffset (PR42699)	Mikhail Maltsev	2019-10-17	1	-4/+19
	Summary: Currently, when computing a GEP offset using the function EmitGEPOffset for the following instruction:
		getelementptr inbounds i32, i32* %p, i64 %offs
	we get:
		mul nuw i64 %offs, 4
	Unfortunately, we cannot assume that unsigned wrapping won't happen here, because %offs is allowed to be negative. Making such assumptions can lead to miscompilations: see the new test test24_neg_offs in InstCombine/icmp.ll. Without the patch, InstCombine would generate the following comparison:
		icmp eq i64 %offs, 4611686018427387902 ; 0x3ffffffffffffffe
	whereas the correct value to compare with is -2. This patch replaces the NUW flag with NSW in the multiplication instructions generated by EmitGEPOffset and adjusts the test suite.
	https://bugs.llvm.org/show_bug.cgi?id=42699
	Reviewers: chandlerc, craig.topper, ostannard, lebedev.ri, spatel, efriedma, nlopes, aqjune
	Reviewed By: lebedev.ri
	Subscribers: reames, lebedev.ri, hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D68342
	llvm-svn: 375089
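The arithmetic behind the two constants in this message can be reproduced outside LLVM. The following is a minimal sketch in Python, not the InstCombine code: the helper names are hypothetical, and the byte offset of -8 is an illustrative value chosen because it is consistent with the constants quoted above. It shows that `-2 * 4` wraps as an unsigned 64-bit multiply but not as a signed one, and that undoing the multiply with unsigned division yields 0x3ffffffffffffffe while signed division yields -2.

```python
BITS = 64
MASK = (1 << BITS) - 1


def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def unsigned_mul_wraps(a, b):
    """True if a * b overflows 64 bits with both operands read as unsigned (nuw violated)."""
    return (a & MASK) * (b & MASK) > MASK


def signed_mul_wraps(a, b):
    """True if a * b overflows 64 bits with both operands read as signed (nsw violated)."""
    product = to_signed(a) * to_signed(b)
    return not (-(1 << (BITS - 1)) <= product < (1 << (BITS - 1)))


offs, scale = -2, 4                    # i32 elements, so the GEP scale is 4
byte_offset = (offs * scale) & MASK    # 64-bit pattern of -8 (illustrative value)

# Undoing the multiply is only sound with the division that matches the flag.
solved_assuming_nuw = byte_offset // scale              # unsigned division
solved_assuming_nsw = to_signed(byte_offset) // scale   # signed division, exact here

print(unsigned_mul_wraps(offs, scale), signed_mul_wraps(offs, scale))
print(hex(solved_assuming_nuw), solved_assuming_nsw)
```

Under these assumptions the unsigned result matches the wrong constant quoted in the message and the signed result matches the expected -2.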
*	[InstCombine] allow icmp+binop folds before min/max bailout (PR43310)	Sanjay Patel	2019-09-22	1	-12/+6
	This has the potential to uncover missed analysis/folds as shown in the min/max code comment/test, but fewer restrictions on icmp folds should be better in general to solve cases like: https://bugs.llvm.org/show_bug.cgi?id=43310
	llvm-svn: 372510
*	[InstCombine] add tests for icmp fold hindered by min/max; NFC	Sanjay Patel	2019-09-22	1	-0/+36
	llvm-svn: 372509
*	[InstCombine] add/move tests for icmp with add operand; NFC	Sanjay Patel	2019-09-16	1	-86/+20
	llvm-svn: 371988
*	[InstCombine] remove unneeded one-use checks for icmp fold	Sanjay Patel	2019-09-16	1	-2/+2
	This fold and several others were added in rL125734 (https://reviews.llvm.org/rL125734) with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed.
	This is part of solving PR43310 the theoretically right way (https://bugs.llvm.org/show_bug.cgi?id=43310), i.e., if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 is a related patch in this series.
	llvm-svn: 371981
*	[InstCombine] add icmp tests with extra uses; NFC	Sanjay Patel	2019-09-16	1	-0/+34
	llvm-svn: 371979
*	[InstCombine] remove unneeded one-use checks for icmp fold	Sanjay Patel	2019-09-15	1	-2/+2
	This fold and several others were added in rL125734 with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed.
	There are similar checks, as noted with the TODO comments. I'm hoping to remove those restrictions too, but if any of these does cause a regression, it should be easier to correct by making small, individual commits.
	This is part of solving PR43310 the theoretically right way (https://bugs.llvm.org/show_bug.cgi?id=43310), i.e., if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns.
	llvm-svn: 371940
*	[InstCombine] add icmp tests with extra uses; NFC	Sanjay Patel	2019-09-15	1	-0/+34
	llvm-svn: 371939
*	Revert "Temporarily Revert "Add basic loop fusion pass.""	Eric Christopher	2019-04-17	1	-0/+3477
	The reversion apparently deleted the test/Transforms directory. Will be re-reverting again.
	llvm-svn: 358552
*	Temporarily Revert "Add basic loop fusion pass."	Eric Christopher	2019-04-17	1	-3477/+0
	As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
	llvm-svn: 358546
*	[InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undef	Simon Pilgrim	2019-03-19	1	-2/+2
	As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand:
	- When the other operand is constant, we fold to undef (handled in ConstantFoldCompareInstruction).
	- When the other operand is non-constant, we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst).
	Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef. The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968.
	Differential Revision: https://reviews.llvm.org/D59541
	llvm-svn: 356456
*	[InstCombine] Regenerate + add icmp with undef tests	Simon Pilgrim	2019-03-19	1	-2/+23
	Better test coverage for PR41125 and D59363
	llvm-svn: 356448
*	[InstCombine] X \| C == C --> (X & ~C) == 0	Sanjay Patel	2019-02-06	1	-8/+8
	We should canonicalize to one of these forms, and compare-with-zero could be more conducive to follow-on transforms. This also leads to generally better codegen as shown in PR40611: https://bugs.llvm.org/show_bug.cgi?id=40611
	llvm-svn: 353313
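The equivalence `X | C == C --> (X & ~C) == 0` from the subject line can be checked exhaustively at a small bit width. This is a sketch in Python over 8-bit values, assuming the fold is width-independent; the function names are hypothetical and nothing here is taken from the LLVM sources.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
ALL_VALUES = range(1 << WIDTH)


def or_form(x, c):
    """Original pattern: (X | C) == C."""
    return (x | c) == c


def and_form(x, c):
    """Canonical pattern: (X & ~C) == 0, with ~C truncated to WIDTH bits."""
    return (x & ~c & MASK) == 0


def forms_agree():
    """Compare both forms for every 8-bit X and C."""
    return all(or_form(x, c) == and_form(x, c)
               for x in ALL_VALUES for c in ALL_VALUES)


print(forms_agree())
```

Both forms ask whether every set bit of X is also set in C, which is why they agree for all inputs.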
*	[InstCombine] add tests for PR40611 and regenerate checks; NFC	Sanjay Patel	2019-02-06	1	-294/+349
	Lots of unrelated diffs here from the newer version of the script.
	llvm-svn: 353312
*	[InstCombine] auto-generate full checks for icmp dominator tests; NFC	Sanjay Patel	2018-12-04	1	-119/+0
	llvm-svn: 348270
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization; 2nd try	Sanjay Patel	2018-10-10	1	-12/+8
	Re-trying r344082 because it unintentionally included extra diffs. Original commit message:
	icmp ne (and X, 1), 0 --> trunc X to <N x i1>
	Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors.
	The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549
		define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
		  %c = fcmp ole <4 x float> %x, %y
		  %s = sext <4 x i1> %c to <4 x i32>
		  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
		  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
		  %cond = or <4 x i32> %s1, %s2
		  %condtr = trunc <4 x i32> %cond to <4 x i1>
		  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
		  ret <4 x float> %r
		}
	Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch):
	AVX before:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vandps LCPI0_0(%rip), %xmm0, %xmm0
		vxorps %xmm1, %xmm1, %xmm1
		vpcmpeqd %xmm1, %xmm0, %xmm0
		vblendvps %xmm0, %xmm3, %xmm2, %xmm0
	AVX after:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vblendvps %xmm0, %xmm2, %xmm3, %xmm0
	AVX512f before:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
		vptestnmd %zmm1, %zmm0, %k1
		vblendmps %zmm3, %zmm2, %zmm0 {%k1}
	AVX512f after:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vpslld $31, %xmm0, %xmm0
		vptestmd %zmm0, %zmm0, %k1
		vblendmps %zmm2, %zmm3, %zmm0 {%k1}
	AArch64 before:
		fcmge v0.4s, v1.4s, v0.4s
		zip1 v1.4s, v0.4s, v0.4s
		zip2 v0.4s, v0.4s, v0.4s
		orr v0.16b, v1.16b, v0.16b
		movi v1.4s, #1
		and v0.16b, v0.16b, v1.16b
		cmeq v0.4s, v0.4s, #0
		bsl v0.16b, v3.16b, v2.16b
	AArch64 after:
		fcmge v0.4s, v1.4s, v0.4s
		zip1 v1.4s, v0.4s, v0.4s
		zip2 v0.4s, v0.4s, v0.4s
		orr v0.16b, v1.16b, v0.16b
		bsl v0.16b, v2.16b, v3.16b
	PowerPC-le before:
		xvcmpgesp 34, 35, 34
		vspltisw 0, 1
		vmrglw 3, 2, 2
		vmrghw 2, 2, 2
		xxlor 0, 35, 34
		xxlxor 35, 35, 35
		xxland 34, 0, 32
		vcmpequw 2, 2, 3
		xxsel 34, 36, 37, 34
	PowerPC-le after:
		xvcmpgesp 34, 35, 34
		vmrglw 3, 2, 2
		vmrghw 2, 2, 2
		xxlor 0, 35, 34
		xxsel 34, 37, 36, 0
	Differential Revision: https://reviews.llvm.org/D52747
	llvm-svn: 344181
*	revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-10	1	-8/+12
	This commit accidentally included the diffs from D53057.
	llvm-svn: 344178
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-09	1	-12/+8
	icmp ne (and X, 1), 0 --> trunc X to <N x i1>
	Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors.
	The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549
		define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
		  %c = fcmp ole <4 x float> %x, %y
		  %s = sext <4 x i1> %c to <4 x i32>
		  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
		  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
		  %cond = or <4 x i32> %s1, %s2
		  %condtr = trunc <4 x i32> %cond to <4 x i1>
		  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
		  ret <4 x float> %r
		}
	Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch):
	AVX before:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vandps LCPI0_0(%rip), %xmm0, %xmm0
		vxorps %xmm1, %xmm1, %xmm1
		vpcmpeqd %xmm1, %xmm0, %xmm0
		vblendvps %xmm0, %xmm3, %xmm2, %xmm0
	AVX after:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vblendvps %xmm0, %xmm2, %xmm3, %xmm0
	AVX512f before:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
		vptestnmd %zmm1, %zmm0, %k1
		vblendmps %zmm3, %zmm2, %zmm0 {%k1}
	AVX512f after:
		vcmpleps %xmm1, %xmm0, %xmm0
		vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
		vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
		vorps %xmm0, %xmm1, %xmm0
		vpslld $31, %xmm0, %xmm0
		vptestmd %zmm0, %zmm0, %k1
		vblendmps %zmm2, %zmm3, %zmm0 {%k1}
	AArch64 before:
		fcmge v0.4s, v1.4s, v0.4s
		zip1 v1.4s, v0.4s, v0.4s
		zip2 v0.4s, v0.4s, v0.4s
		orr v0.16b, v1.16b, v0.16b
		movi v1.4s, #1
		and v0.16b, v0.16b, v1.16b
		cmeq v0.4s, v0.4s, #0
		bsl v0.16b, v3.16b, v2.16b
	AArch64 after:
		fcmge v0.4s, v1.4s, v0.4s
		zip1 v1.4s, v0.4s, v0.4s
		zip2 v0.4s, v0.4s, v0.4s
		orr v0.16b, v1.16b, v0.16b
		bsl v0.16b, v2.16b, v3.16b
	PowerPC-le before:
		xvcmpgesp 34, 35, 34
		vspltisw 0, 1
		vmrglw 3, 2, 2
		vmrghw 2, 2, 2
		xxlor 0, 35, 34
		xxlxor 35, 35, 35
		xxland 34, 0, 32
		vcmpequw 2, 2, 3
		xxsel 34, 36, 37, 34
	PowerPC-le after:
		xvcmpgesp 34, 35, 34
		vmrglw 3, 2, 2
		vmrghw 2, 2, 2
		xxlor 0, 35, 34
		xxsel 34, 37, 36, 0
	Differential Revision: https://reviews.llvm.org/D52747
	llvm-svn: 344082
*	[InstCombine] add icmp+logic tests with commuted ops; NFC	Sanjay Patel	2018-10-02	1	-3/+37
	The transform in question is located in foldICmpAndConstConst(), but as shown here, it doesn't work if operands are commuted.
	llvm-svn: 343646
*	[InstCombine] add folds for icmp with xor mask constant	Sanjay Patel	2018-09-11	1	-4/+3
	These are the folds in Alive:
		Name: xor_ult
		Pre: isPowerOf2(-C1)
		%xor = xor i8 %x, C1
		%r = icmp ult i8 %xor, C1
		=>
		%r = icmp ugt i8 %x, ~C1

		Name: xor_ugt
		Pre: isPowerOf2(C1+1)
		%xor = xor i8 %x, C1
		%r = icmp ugt i8 %xor, C1
		=>
		%r = icmp ugt i8 %x, C1
	https://rise4fun.com/Alive/Vty
	The ugt case in its simplest form was already handled by DemandedBits, but that's not ideal as shown in the multi-use test. I'm not sure if these are all of the symmetrical folds, but I adjusted the existing code for one of the folds to try to show the similarities.
	There's no obvious connection, but this is another preliminary step for PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613
	llvm-svn: 341997
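The two Alive folds above are stated for i8, so they are small enough to brute-force. The sketch below is a Python re-check, not a substitute for the Alive proof: `is_power_of_2` is a hypothetical stand-in for Alive's `isPowerOf2` (assumed to mean exactly one bit set), and each checker returns whether the fold held together with how many constants met the precondition, so a vacuous pass is visible.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
ALL_VALUES = range(1 << WIDTH)


def is_power_of_2(value):
    """Assumed meaning of Alive's isPowerOf2: exactly one bit set."""
    return value != 0 and value & (value - 1) == 0


def check_xor_ult():
    """(x ^ C1) <u C1  ==>  x >u ~C1, given isPowerOf2(-C1)."""
    constants = [c for c in ALL_VALUES if is_power_of_2(-c & MASK)]
    holds = all(((x ^ c) < c) == (x > (~c & MASK))
                for c in constants for x in ALL_VALUES)
    return holds, len(constants)


def check_xor_ugt():
    """(x ^ C1) >u C1  ==>  x >u C1, given isPowerOf2(C1 + 1)."""
    constants = [c for c in ALL_VALUES if is_power_of_2((c + 1) & MASK)]
    holds = all(((x ^ c) > c) == (x > c)
                for c in constants for x in ALL_VALUES)
    return holds, len(constants)


print(check_xor_ult(), check_xor_ugt())
```

The qualifying constants are high-bit masks such as 0xF0 for `xor_ult` and low-bit masks such as 0x0F for `xor_ugt`.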
*	[InstCombine] add tests for icmp with xor; NFC	Sanjay Patel	2018-09-11	1	-0/+47
	llvm-svn: 341993
*	[InstCombine] Add splat vector constant support to foldICmpAddOpConst.	Craig Topper	2018-08-20	1	-6/+43
	Differential Revision: https://reviews.llvm.org/D50946
	llvm-svn: 340231
*	[InstCombine] Add test cases for an icmp combine that is missing support for splat vector constants.	Craig Topper	2018-08-19	1	-0/+51
	llvm-svn: 340144
*	[InstCombine] canonicalize abs pattern	Chen Zheng	2018-07-27	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D48754
	llvm-svn: 338092
*	[InstCombine] add more SPFofSPF folding	Chen Zheng	2018-07-16	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D49238
	llvm-svn: 337143
*	[InstCombine] fold icmp pred (sub 0, X) C for vector type	Chen Zheng	2018-07-16	1	-0/+13
	Differential Revision: https://reviews.llvm.org/D49283
	llvm-svn: 337141
*	[InstCombine] choose 1 form of abs and nabs as canonical	Sanjay Patel	2018-05-20	1	-2/+2
	We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen.
	This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use.
	Differential Revision: https://reviews.llvm.org/D47076
	llvm-svn: 332819
*	[InstCombine] add folds for icmp + sub (PR36969)	Sanjay Patel	2018-04-02	1	-4/+2
		(A - B) >u A --> A <u B
		C <u (C - D) --> C <u D
	https://rise4fun.com/Alive/e7j
		Name: ugt
		%sub = sub i8 %x, %y
		%cmp = icmp ugt i8 %sub, %x
		=>
		%cmp = icmp ult i8 %x, %y

		Name: ult
		%sub = sub i8 %x, %y
		%cmp = icmp ult i8 %x, %sub
		=>
		%cmp = icmp ult i8 %x, %y
	This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969
	llvm-svn: 329011
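Both icmp + sub folds rely on the same observation: a wrapping unsigned subtraction `x - y` exceeds `x` exactly when it borrowed, that is, when `x <u y`. A short Python sketch with hypothetical helper names checks this for every pair of i8 values; it mirrors the Alive statements above but does not replace them.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
ALL_VALUES = range(1 << WIDTH)


def wrapping_sub(a, b):
    """Model of 'sub i8 a, b' with two's-complement wraparound."""
    return (a - b) & MASK


def ugt_fold_holds():
    """(x - y) >u x  <=>  x <u y."""
    return all((wrapping_sub(x, y) > x) == (x < y)
               for x in ALL_VALUES for y in ALL_VALUES)


def ult_fold_holds():
    """x <u (x - y)  <=>  x <u y."""
    return all((x < wrapping_sub(x, y)) == (x < y)
               for x in ALL_VALUES for y in ALL_VALUES)


print(ugt_fold_holds(), ult_fold_holds())
```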
*	[InstCombine] add tests for icmp (sub x, y), x (PR36969); NFC	Sanjay Patel	2018-04-02	1	-0/+30
	llvm-svn: 329010
*	[InstCombine] peek through FP casts for sign-bit compares (PR36682)	Sanjay Patel	2018-03-24	1	-26/+0
	This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A
	Equality checks are planned as a follow-up enhancement.
	Differential Revision: https://reviews.llvm.org/D44367
	llvm-svn: 328426
*	[InstCombine] add tests for casted sign-bit cmp (PR36682); NFC	Sanjay Patel	2018-03-11	1	-2/+27
	llvm-svn: 327243
*	[InstCombine] Don't blow up in foldICmpWithCastAndCast on vector icmp instructions.	Daniel Neilson	2018-03-05	1	-0/+12
	Summary: Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it is only invoked with icmp instructions of integer type. If that assumption is broken, and it is called with an icmp of vector type, then it fails (asserts/crashes). This patch addresses the deficiency. It allows it to simplify icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs, much as is done when the type is integer.
	Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna
	Reviewed By: anna
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D44063
	llvm-svn: 326730
*	[InstCombine] safely create a constant of the right type (PR35794)	Sanjay Patel	2018-01-04	1	-0/+16
	llvm-svn: 321801
*	[InstCombine] Teach visitICmpInst to not break integer absolute value idioms	Craig Topper	2017-11-12	1	-0/+16
	Summary: This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. A similar early out already exists for min/max.
	In the particular case I observed in a benchmark, we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute value idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value. I've chosen a simpler case for the test here.
	Reviewers: spatel, davide, majnemer
	Reviewed By: spatel
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D39766
	llvm-svn: 317994
*	[InstCombine] Improve support for ashr in foldICmpAndShift	Craig Topper	2017-10-04	1	-0/+44
	We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users.
	Differential Revision: https://reviews.llvm.org/D38521
	llvm-svn: 314945
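The claim that ashr can stand in for lshr when none of the shifted-in bits are used can be illustrated on i8. The following Python sketch uses hypothetical helpers and models "not used" as an and-mask with zeros in the top `s` bit positions; that reading of the message is an assumption. It compares `ashr` and `lshr` under every such mask and shift amount.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
ALL_VALUES = range(1 << WIDTH)


def lshr(x, s):
    """Logical shift right: zeros are shifted in."""
    return (x & MASK) >> s


def ashr(x, s):
    """Arithmetic shift right: copies of the sign bit are shifted in."""
    signed = x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x
    return (signed >> s) & MASK


def shifted_in_bits(s):
    """Mask covering the top s bit positions, which the shift fills in."""
    return (MASK << (WIDTH - s)) & MASK


def agree_when_shifted_in_bits_unused():
    """ashr and lshr match under any and-mask that ignores the filled-in bits."""
    return all((ashr(x, s) & m) == (lshr(x, s) & m)
               for s in range(WIDTH)
               for m in ALL_VALUES if m & shifted_in_bits(s) == 0
               for x in ALL_VALUES)


print(agree_when_shifted_in_bits_unused())
print(ashr(0x80, 1) & 0x80, lshr(0x80, 1) & 0x80)  # differ once the mask reads a filled-in bit
```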
*	[InstCombine] Remove one use restriction on the shift for calls to foldICmpAndShift.	Craig Topper	2017-09-26	1	-0/+19
	If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the and. So it doesn't matter how many uses the shift has. This distributes the one use check to other transforms in foldICmpAndConstConst that do need it.
	Differential Revision: https://reviews.llvm.org/D38206
	llvm-svn: 314233
*	[InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value.	Craig Topper	2017-09-22	1	-8/+4
	This is the inverse of what we do for SGT/SLT/UGT/ULT.
	llvm-svn: 314032
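One way to read "the min/max ranges intersect in a single value" is with plain unsigned intervals: if the largest possible A equals the smallest possible B, then `A uge B` can only be true when both sit on that shared value, so it is the same as `A eq B`. The Python sketch below works under that assumed interval model; the function names are hypothetical, and LLVM derives such ranges from known bits rather than taking them as inputs.

```python
def uge_matches_eq(a_min, a_max, b_min, b_max):
    """True if 'a >= b' and 'a == b' agree for every a, b in the given ranges."""
    return all((a >= b) == (a == b)
               for a in range(a_min, a_max + 1)
               for b in range(b_min, b_max + 1))


def ule_matches_eq(a_min, a_max, b_min, b_max):
    """True if 'a <= b' and 'a == b' agree for every a, b in the given ranges."""
    return all((a <= b) == (a == b)
               for a in range(a_min, a_max + 1)
               for b in range(b_min, b_max + 1))


# Ranges touch in exactly one value (10): the non-strict compare acts like equality.
print(uge_matches_eq(3, 10, 10, 200))
print(ule_matches_eq(10, 200, 3, 10))

# Ranges overlap in two values (10 and 11): a = 11, b = 10 breaks the equivalence.
print(uge_matches_eq(3, 11, 10, 200))
```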
*	[InstCombine] Add test cases for known bits simplifications for comparisons that don't depend on constant RHS. NFC	Craig Topper	2017-09-22	1	-0/+143
	This shows some missing simplifications for sge/sle/uge/ule relative to their non-equality counterparts.
	llvm-svn: 314031
*	[InstCombine] Teach getDemandedBitsLHSMask to handle constant splat vectors	Craig Topper	2017-09-20	1	-5/+3
	This replaces a ConstantInt dyn_cast with m_APInt
	Differential Revision: https://reviews.llvm.org/D38100
	llvm-svn: 313840
*	[InstCombine] Handle (X & C2) < C1 --> (X & C2) == 0	Craig Topper	2017-09-20	1	-2/+2
	We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1).
	Differential Revision: https://reviews.llvm.org/D38065
	llvm-svn: 313819
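The condition "any bit set in (X & C2) will produce a result greater than C1" can be modeled through the lowest set bit of C2, since every nonzero value of `X & C2` is at least that large. The Python sketch below uses that model, which is an assumption of this example and not necessarily the exact check in InstCombine, to test both the existing `>` fold and the new `<` fold over all i8 values.

```python
WIDTH = 8
ALL_VALUES = range(1 << WIDTH)


def lowest_set_bit(value):
    """Smallest nonzero value that (X & value) can take, for value != 0."""
    return value & -value


def ugt_fold_holds():
    """(X & C2) >u C1  <=>  (X & C2) != 0, assuming C1 < lowest_set_bit(C2)."""
    return all(((x & c2) > c1) == ((x & c2) != 0)
               for c2 in range(1, 1 << WIDTH)
               for c1 in range(0, lowest_set_bit(c2))
               for x in ALL_VALUES)


def ult_fold_holds():
    """(X & C2) <u C1  <=>  (X & C2) == 0, assuming 0 < C1 <= lowest_set_bit(C2)."""
    return all(((x & c2) < c1) == ((x & c2) == 0)
               for c2 in range(1, 1 << WIDTH)
               for c1 in range(1, lowest_set_bit(c2) + 1)
               for x in ALL_VALUES)


print(ugt_fold_holds(), ult_fold_holds())
```

The `ult` precondition excludes C1 = 0 because `<u 0` is never true, whereas `== 0` can be.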
*	[InstCombine] Pre-commit test cases for D38065.	Craig Topper	2017-09-20	1	-0/+22
	llvm-svn: 313818
*	[InstCombine] Support vector splats in transformZExtICmp	Craig Topper	2017-08-29	1	-0/+21
	This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code. One vector test didn't optimize properly due to another bug, so a TODO has been added.
	Differential Revision: https://reviews.llvm.org/D37253
	llvm-svn: 312023
*	[InstCombine] look through bswap/bitreverse for equality comparisons	Sanjay Patel	2017-07-02	1	-12/+4
	I noticed this missed bswap optimization in the CGP memcmp() expansion, and then I saw that we don't have the fold in InstCombine.
	Differential Revision: https://reviews.llvm.org/D34763
	llvm-svn: 306980
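Looking through bswap or bitreverse for an equality comparison is sound because both operations are their own inverse and therefore one-to-one: `f(x) == f(y)` exactly when `x == y`, and `f(x) == C` exactly when `x == f(C)`. The Python sketch below models a 16-bit bswap and an 8-bit bitreverse with hypothetical helpers; the widths are chosen only to keep the exhaustive loops small.

```python
def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)


def bitreverse8(x):
    """Reverse the bit order of an 8-bit value."""
    return int(format(x & 0xFF, "08b")[::-1], 2)


def is_involution(op, width):
    """True if applying op twice returns the input for every value of that width."""
    return all(op(op(x)) == x for x in range(1 << width))


def equality_fold_holds(op, width):
    """op(x) == op(y) <=> x == y, and op(x) == c <=> x == op(c), checked exhaustively."""
    values = range(1 << width)
    return all((op(x) == op(y)) == (x == y) and (op(x) == y) == (x == op(y))
               for x in values for y in values)


print(is_involution(bswap16, 16), is_involution(bitreverse8, 8))
print(equality_fold_holds(bitreverse8, 8))
```

The pairwise check is run only for the 8-bit case; for bswap16 the involution check already implies the same property.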
*	[InstCombine] add tests for icmp with bitreversed ops; NFC	Sanjay Patel	2017-06-28	1	-0/+30
	This is similar enough to bswap that we might as well handle them together in one patch.
	llvm-svn: 306591
*	[InstCombine] Remove 64-bit bit width restriction from m_ConstantInt(uint64_t*&)	Craig Topper	2017-06-28	1	-8/+3
	I think we only need to make sure the value fits in 64 bits, not that the bit width is 64 bits. This helps places that use this for shift amounts, since the shift amount needs to be the same bit width as the LHS, but can't be larger than the bit width.
	Differential Revision: https://reviews.llvm.org/D34737
	llvm-svn: 306577
*	[InstCombine] add tests for icmp with bswapped operands; NFC	Sanjay Patel	2017-06-28	1	-0/+30
	llvm-svn: 306563
*	[InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1))), cst -> icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC	Craig Topper	2017-06-28	1	-0/+21
	llvm-svn: 306510
*	Revert r306508 "[InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1))), cst -> icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC"	Craig Topper	2017-06-28	1	-21/+0
	I accidentally had an extra change in there.
	llvm-svn: 306509
*	[InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1))), cst -> icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC	Craig Topper	2017-06-28	1	-0/+21
	llvm-svn: 306508
*	[InstCombine] canonicalize icmp predicate feeding select	Sanjay Patel	2017-06-27	1	-2/+2
	This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here.
	The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together.
	The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html
	...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409
	Differential Revision: https://reviews.llvm.org/D34242
	llvm-svn: 306435