bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in ↵	Sanjay Patel	2019-10-13	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	non-default address space Follow-up to D68244 to account for a corner case discussed in: https://bugs.llvm.org/show_bug.cgi?id=43501 Add one more restriction: if the pointer is deref-or-null and in a non-default (non-zero) address space, we can't assume inbounds. Differential Revision: https://reviews.llvm.org/D68706 llvm-svn: 374728
*	[InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform ↵	Sanjay Patel	2019-10-06	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \|	(PR43501) https://bugs.llvm.org/show_bug.cgi?id=43501 We can't declare a GEP 'inbounds' in general. But we may salvage that information if we have known dereferenceable bytes on the source pointer. Differential Revision: https://reviews.llvm.org/D68244 llvm-svn: 373847
*	[Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)	Guillaume Chatelet	2019-09-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207
*	[InstCombine] rename variable for readability; NFC	Sanjay Patel	2019-09-11	1	-21/+21
\| \| \| \| \| \| \|	There's more that can be done here, but "OpI" doesn't convey that we casted to BinaryOperator. llvm-svn: 371682
*	[InstCombine] recognize bswap disguised as shufflevector	Sanjay Patel	2019-09-02	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N8} --> bswap (bitcast X to i{N8}) In PR43146: https://bugs.llvm.org/show_bug.cgi?id=43146 ...we have a more complicated case where SLP is making a mess of bswap. This patch won't do anything for that currently, but we need to improve bswap recognition in instcombine, SLP, and/or a standalone pass to avoid that problem. This is limited using the data-layout so we don't try to do this transform with actual vector types. The backend does not appear to have folds to convert in either direction, so we don't want to mess up something that is actually better lowered as a shuffle. On x86, we're trading something like this: vmovd %edi, %xmm0 vpshufb LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u] vmovd %xmm0, %eax For: movl %edi, %eax bswapl %eax Differential Revision: https://reviews.llvm.org/D66965 llvm-svn: 370659
*	[InstCombine] reduce duplicated code; NFC	Sanjay Patel	2019-08-29	1	-10/+13
\| \| \| \|	llvm-svn: 370399
*	Revert "[InstCombine] try to narrow a truncated load"	Vlad Tsyrklevich	2019-07-25	1	-39/+0
\| \| \| \| \| \| \| \| \|	This reverts commit bc4a63fd3c29c1a8ce22891bf34ee4dccfef578c, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029
*	[InstCombine] try to narrow a truncated load	Sanjay Patel	2019-07-25	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011
*	[InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg	Cameron McInally	2019-06-11	1	-4/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080
*	[InstCombine] simplify code for bitcast of insertelement; NFC	Sanjay Patel	2019-06-05	1	-5/+4
\| \| \| \|	llvm-svn: 362655
*	[InstCombine] When turning sext into zext due to known bits, return the new ↵	Craig Topper	2019-05-08	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \|	ZExt instead of calling replaceinstuseswith The worklist loop that we're returning back to should be able to do the repacement itself. This is how we normally do replacements. My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement. Differential Revision: https://reviews.llvm.org/D61695 llvm-svn: 360284
*	Extra processing for BitCast + PHI in InstCombine	Gabor Buella	2019-02-09	1	-11/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some specific cases with bitcast A->B->A with intervening PHI nodes InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes, which are actually a copy of already created PHI or in another words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after DeSSA transformation. This happens when several conditions are met - SROA kicks in and creates new alloca; - there is a simple assignment L = R, which falls under 'canonicalize loads' done by combineLoadToOperationType (this transformation is by default). Exactly this transformation is the reason of bitcasts generated; - the alloca is then used in A->B->A + PHI chain; - there is a loop unrolling. As a result optimizeBitCastFromPhi creates as many of PHI nodes for each new SROA alloca as loop unrolling factor is. These new extra PHI nodes are redundant actually except of one and should not be created. Moreover the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible) but that doesn't happen in these conditions. The proposed fix is to do the cast replacement for the whole calculated/accumulated PHI closure not for one cast only, which is an argument to the optimizeBitCastFromPhi. These will help to accomplish several things: 1) avoid extra PHI nodes generated as all casts which may trigger optimizeBitCastFromPhi transformation will be replaced, 3) bitcasts will be replaced, and 3) create more opportunities to remove dead code, which appears after the replacement. A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction. Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com> Reviewed By: Carrot Differential Revision: https://reviews.llvm.org/D57053 llvm-svn: 353595
*	[opaque pointer types] Pass value type to GetElementPtr creation.	James Y Knight	2019-02-01	1	-1/+2
\| \| \| \| \| \| \| \| \|	This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913
*	[opaque pointer types] Pass function types to CallInst creation.	James Y Knight	2019-02-01	1	-2/+2
\| \| \| \| \| \| \| \| \|	This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[InstCombine] reduce raw IR narrowing rotate patterns to funnel shift	Sanjay Patel	2019-01-04	1	-16/+8
\| \| \| \| \| \| \| \| \|	Similar to rL350199 - there are no known analysis/codegen holes for funnel shift intrinsics now, so we can canonicalize the 6+ regular instructions to funnel shift to improve vectorization, inlining, unrolling, etc. llvm-svn: 350419
*	[InstCombine] don't widen an arbitrary sequence of vector ops (PR40032)	Sanjay Patel	2018-12-17	1	-11/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is shown specifically for a case with vector multiply here: https://bugs.llvm.org/show_bug.cgi?id=40032 ...and this might mask the original backend bug for ARM shown in: https://bugs.llvm.org/show_bug.cgi?id=39967 As the test diffs here show, we were (and probably still aren't) doing these kinds of transforms in a principled way. We are producing more or equal wide instructions than we started with in some cases, so we still need to restrict/correct other transforms from overstepping. If there are perf regressions from this change, we can either carve out exceptions to the general IR rules, or improve the backend to do these transforms when we know the transform is profitable. That's probably similar to a change like D55448. Differential Revision: https://reviews.llvm.org/D55744 llvm-svn: 349389
*	[InstCombine] fix rotate narrowing bug for non-pow-2 types	Sanjay Patel	2018-11-15	1	-2/+7
\| \| \| \|	llvm-svn: 346968
*	[InstCombine] narrow width of rotate patterns, part 3	Sanjay Patel	2018-11-12	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a longer variant for the pattern handled in rL346713 This one includes zexts. Eventually, we should canonicalize all rotate patterns to the funnel shift intrinsics, but we need a bit more infrastructure to make sure the vectorizers handle those intrinsics as well as the shift+logic ops. https://rise4fun.com/Alive/FMn Name: narrow rotateright %neg = sub i8 0, %shamt %rshamt = and i8 %shamt, 7 %rshamtconv = zext i8 %rshamt to i32 %lshamt = and i8 %neg, 7 %lshamtconv = zext i8 %lshamt to i32 %conv = zext i8 %x to i32 %shr = lshr i32 %conv, %rshamtconv %shl = shl i32 %conv, %lshamtconv %or = or i32 %shl, %shr %r = trunc i32 %or to i8 => %maskedShAmt2 = and i8 %shamt, 7 %negShAmt2 = sub i8 0, %shamt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shl2 = lshr i8 %x, %maskedShAmt2 %shr2 = shl i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346716
*	[InstCombine] narrow width of rotate patterns, part 2 (PR39624)	Sanjay Patel	2018-11-12	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sub-pattern for the shift amount in a rotate can take on several different forms, and there's apparently no way to canonicalize those without seeing the entire rotate sequence. This is the form noted in: https://bugs.llvm.org/show_bug.cgi?id=39624 https://rise4fun.com/Alive/qnT %zx = zext i8 %x to i32 %maskedShAmt = and i32 %shAmt, 7 %shl = shl i32 %zx, %maskedShAmt %negShAmt = sub i32 0, %shAmt %maskedNegShAmt = and i32 %negShAmt, 7 %shr = lshr i32 %zx, %maskedNegShAmt %rot = or i32 %shl, %shr %r = trunc i32 %rot to i8 => %truncShAmt = trunc i32 %shAmt to i8 %maskedShAmt2 = and i8 %truncShAmt, 7 %shl2 = shl i8 %x, %maskedShAmt2 %negShAmt2 = sub i8 0, %truncShAmt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shr2 = lshr i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346713
*	[InstCombine] refactor code for matching shift amount of a rotate; NFC	Sanjay Patel	2018-11-12	1	-12/+17
\| \| \| \| \| \| \| \|	As shown in existing test cases and with: https://bugs.llvm.org/show_bug.cgi?id=39624 ...we're missing at least 2 more patterns for rotate narrowing. llvm-svn: 346711
*	[FPEnv] Last BinaryOperator::isFNeg(...) to m_FNeg(...) changes	Cameron McInally	2018-10-25	1	-2/+3
\| \| \| \| \| \| \| \| \|	Replacing BinaryOperator::isFNeg(...) to avoid regressions when we separate FNeg from the FSub IR instruction. Differential Revision: https://reviews.llvm.org/D53650 llvm-svn: 345295
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization; 2nd try	Sanjay Patel	2018-10-10	1	-4/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-trying r344082 because it unintentionally included extra diffs. Original commit message: icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344181
*	revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-10	1	-27/+4
\| \| \| \| \| \|	This commit accidentally included the diffs from D53057. llvm-svn: 344178
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-09	1	-4/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344082
*	Fix InstCombine address space assert	Ewan Crawford	2018-07-31	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Workaround bug where the InstCombine pass was asserting on the IR added in lit test, where we have a bitcast instruction after a GEP from an addrspace cast. The second bitcast in the test was getting combined into `bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)`, which looks like it should be an addrspace cast instruction instead. Otherwise if control flow is allowed to continue as it is now we create a GEP instruction `<badref> = getelementptr inbounds <16 x i32>, <16 x i32> %0, i32 0`. However because the type of this instruction doesn't match the address space we hit an assert when replacing the bitcast with that GEP. ``` void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed. ``` Differential Revision: https://reviews.llvm.org/D50058 llvm-svn: 338395
*	[InstCombine] Preserve debug value when simplifying cast-of-select	Vedant Kumar	2018-07-17	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine has a cast transform that matches a cast-of-select: Orig = cast (Src = select Cond TV FV) And tries to replace it with a select which has the cast folded in: NewSel = select Cond (cast TV) (cast FV) The combiner does RAUW(Orig, NewSel), so any debug values for Orig would survive the transform. But debug values for Src would be lost. This patch teaches InstCombine to replace all debug uses of Src with NewSel (taking care of doing any necessary DIExpression rewriting). Differential Revision: https://reviews.llvm.org/D49270 llvm-svn: 337310
*	[Local] replaceAllDbgUsesWith: Update debug values before RAUW	Vedant Kumar	2018-07-06	1	-15/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another. This improves upon the existing insertReplacementDbgValues API by: - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value. - Falling back to salvageDebugInfo when a replacement can't be made. - Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into common utility code. Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions: - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations. - Truncation. Applies when the new value is wider than the old one. - Zero/sign extension. Applies when the new value is narrower than the old one. Testing: - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing. - This resolves a number of mis-sized dbg.value diagnostics from Debugify. Differential Revision: https://reviews.llvm.org/D48676 llvm-svn: 336451
*	[InstCombine] allow narrowing of min/max/abs	Sanjay Patel	2018-07-04	1	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have bailout hacks based on min/max in various places in instcombine that shouldn't be necessary. The affected test was added for: D48930 ...which is a consequence of the improvement in: D48584 (https://reviews.llvm.org/rL336172) I'm assuming the visitTrunc bailout in this patch was added specifically to avoid a change from SimplifyDemandedBits, so I'm just moving that below the EvaluateInDifferentType optimization. A narrow min/max is still a min/max. llvm-svn: 336293
*	[DebugInfo][InstCombine] Preserve DI after combining zext	Anastasis Grammenos	2018-07-04	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \|	When zext is EvaluatedInDifferentType, InstCombine drops the dbg.value intrinsic. This patch tries to preserve said DI, by inserting the zext's old DI in the resulting instruction. (Only for integer type for now) Differential Revision: https://reviews.llvm.org/D48331 llvm-svn: 336254
*	[InstCombine] Avoid creating mis-sized dbg.values in commonCastTransforms()	Vedant Kumar	2018-06-27	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents InstCombine from creating mis-sized dbg.values when replacing a sequence of casts with a simpler cast. For example, in: (fptrunc (floor (fpext X))) -> (floorf X) We no longer emit dbg.value(X) (with a 32-bit float operand) to describe (fpext X) (which is a 64-bit float). This was diagnosed by the debugify check added in r335682. llvm-svn: 335696
*	[Local] Add a convenient insertReplacementDbgValues overload, NFC	Vedant Kumar	2018-06-26	1	-5/+1
\| \| \| \| \| \| \|	Add an overload for the common case where the replacement dbg.values have the same DIExpressions as the originals. llvm-svn: 335643
*	[InstCombine] use constant pattern matchers with icmp+sext	Sanjay Patel	2018-06-21	1	-14/+11
\| \| \| \| \| \| \| \|	The previous code worked with vectors, but it failed when the vector constants contained undef elements. The matchers handle those cases. llvm-svn: 335262
*	[Local] Add a utility to insert replacement dbg.values, NFC	Vedant Kumar	2018-06-20	1	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The purpose of this utility is to make it easier for optimizations to insert replacement dbg.values for instructions they are deleting. This is useful in situations where salvageDebugInfo is inapplicable, say, because the new dbg.value cannot refer to an operand of the dying value. The utility is called insertReplacementDbgValues. It assumes that the instruction 'From' is going to be deleted, and inserts replacement dbg.values for each debug user of 'From'. The newly-inserted dbg.values refer to 'To' instead of 'From'. Each replacement dbg.value has the same location and variable as the debug user it replaces, has a DIExpression determined by the result of 'RewriteExpr' applied to an old debug user of 'From', and is placed before 'InsertBefore'. This should simplify future patches, like D48331. llvm-svn: 335144
*	[InstCombine] don't change the size of a select if it would mismatch its ↵	Sanjay Patel	2018-05-31	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	condition operands' sizes Don't always: cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C' This is something that came up as far back as D26556, and I lost track of it. I suspect that this transform is part of the underlying problem that is inspiring some of the recent proposals that seek to match larger patterns that include a cast op. Even if that's not true, this transform causes problems for codegen (particularly with vector types). A transform to actively match the size of cmp and select operand sizes should follow. This patch just removes the harmful canonicalization in the other direction. Differential Revision: https://reviews.llvm.org/D47163 llvm-svn: 333611
*	[InstCombine] remove fptrunc (select) code; NFCI	Sanjay Patel	2018-05-21	1	-17/+0
\| \| \| \| \| \| \|	This pattern is handled within commonCastTransforms(), so the code here is dead AFAICT. llvm-svn: 332887
*	Rename DEBUG macro to LLVM_DEBUG.	Nicola Zaghen	2018-05-14	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
*	[InstCombine] Replace an 'if' that should always be true with an assert.	Craig Topper	2018-05-10	1	-7/+6
\| \| \| \| \| \|	The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations. llvm-svn: 332055
*	[InstCombine] Reorder an if condition to put a cheap check in front of a ↵	Craig Topper	2018-05-10	1	-3/+3
\| \| \| \| \| \|	computeKnownBits call. NFC llvm-svn: 331948
*	[InstCombine] Use APInt::getBitsSetFrom to shortern a line and fix an 80 ↵	Craig Topper	2018-05-10	1	-2/+2
\| \| \| \| \| \| \| \|	columns violation. NFC Fix a similar line in the same function. llvm-svn: 331947
*	Remove @brief commands from doxygen comments, too.	Adrian Prantl	2018-05-01	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275
*	[InstCombine] simplify code that propagates FMF; NFC	Sanjay Patel	2018-04-07	1	-12/+4
\| \| \| \|	llvm-svn: 329503
*	[InstCombine] fix formatting; NFC	Sanjay Patel	2018-03-24	1	-37/+30
\| \| \| \|	llvm-svn: 328425
*	[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc.	Craig Topper	2018-03-05	1	-0/+34
\| \| \| \| \| \| \| \|	This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops. Differential Revision: https://reviews.llvm.org/D43774 llvm-svn: 326729
*	[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid ↵	Craig Topper	2018-03-02	1	-47/+43
\| \| \| \| \| \| \| \| \| \|	creating overly small ConstantFPs that we'll just need to extend again. Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends. Differential Revision: https://reviews.llvm.org/D44038 llvm-svn: 326617
*	[InstCombine] Split the FP constant code out of lookThroughFPExtensions and ↵	Craig Topper	2018-02-28	1	-15/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	use nullptr as a sentinel Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings. This patch just moves all the code into a helper function and returns nullptr when it wants to stop. I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement. Differential Revision: https://reviews.llvm.org/D43833 llvm-svn: 326361
*	Adding a width of the GEP index to the Data Layout.	Elena Demikhovsky	2018-02-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102
*	[InstCombine] don't try to evaluate instructions with >1 use (revert r324014)	Sanjay Patel	2018-02-05	1	-17/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This example causes a compile-time explosion: define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x, %x %a2 = mul i32 %a1, %a1 %a3 = mul i32 %a2, %a2 %a4 = mul i32 %a3, %a3 %a5 = mul i32 %a4, %a4 %a6 = mul i32 %a5, %a5 %a7 = mul i32 %a6, %a6 %a8 = mul i32 %a7, %a7 %a9 = mul i32 %a8, %a8 %a10 = mul i32 %a9, %a9 %a11 = mul i32 %a10, %a10 %a12 = mul i32 %a11, %a11 %a13 = mul i32 %a12, %a12 %a14 = mul i32 %a13, %a13 %a15 = mul i32 %a14, %a14 %a16 = mul i32 %a15, %a15 %a17 = mul i32 %a16, %a16 %a18 = mul i32 %a17, %a17 %a19 = mul i32 %a18, %a18 %a20 = mul i32 %a19, %a19 %a21 = mul i32 %a20, %a20 %a22 = mul i32 %a21, %a21 %a23 = mul i32 %a22, %a22 %a24 = mul i32 %a23, %a23 %T = trunc i32 %a24 to i16 ret i16 %T } llvm-svn: 324276
*	[InstCombine] only allow narrow/wide evaluation of values with >1 use if ↵	Sanjay Patel	2018-02-05	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \|	that user is a binop There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi instructions that might have repeated operands. This is likely a source of an infinite loop. I haven't manufactured a test case to prove that, but it should be safe to speculatively limit this transform to binops while we try to create that test. llvm-svn: 324252
*	[InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst	Sanjay Patel	2018-02-01	1	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value. AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled. Differential Revision: https://reviews.llvm.org/D42739 llvm-svn: 324014