path: root/llvm/lib/Transforms/InstCombine
Commit message | Author | Age | Files | Lines
* Merge DebugLoc on combined stores; in this case, when combining stores from the end of two blocks, merge instead of arbitrarily picking one. | Paul Robinson | 2017-02-06 | 1 | -1/+4
    Differential Revision: http://reviews.llvm.org/D29504
    llvm-svn: 294251
* [InstCombine] simplify dyn_cast + isa; NFCI | Sanjay Patel | 2017-02-06 | 1 | -6/+4
    llvm-svn: 294198
* [InstCombine] treat i1 as a special type in shouldChangeType() | Sanjay Patel | 2017-02-03 | 1 | -4/+8
    This patch is based on the llvm-dev discussion here:
    http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html
    Folding to i1 should always be desirable because that's better for value tracking and we have special folds for i1 types.
    I checked for other users of shouldChangeType() where this might have an effect, but we already handle the i1 case differently than other types in all of those cases.
    Side note: the default datalayout includes i1, so it seems we only find this gap in shouldChangeType + phi folding for the case when there is (1) an explicit datalayout without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming casted operands (as Björn mentioned).
    Differential Revision: https://reviews.llvm.org/D29336
    llvm-svn: 294066
* [InstCombine] fix operand-complexity-based canonicalization (PR28296) | Sanjay Patel | 2017-02-03 | 1 | -7/+15
    The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) operators from arguments. Adding another level to the weighting scheme provides more structure and can help simplify the pattern matching in InstCombine and other places.
    I fixed regressions that would have shown up from this change in:
      rL290067
      rL290127
    But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests.
    Should fix: https://llvm.org/bugs/show_bug.cgi?id=28296
    Differential Revision: https://reviews.llvm.org/D27933
    llvm-svn: 294049
* [InstCombine] move folds for shift-shift pairs; NFCI | Sanjay Patel | 2017-02-01 | 1 | -48/+34
    Although this is 'no-functional-change-intended', I'm adding tests for shl-shl and lshr-lshr pairs because there is no existing test coverage for those folds.
    It seems like we should be able to remove some code from foldShiftedShift() at this point because we're handling those patterns on the general path.
    llvm-svn: 293814
* [InstCombine] Allow InstCombine to merge adjacent guards | Sanjoy Das | 2017-02-01 | 1 | -6/+14
    Summary: If there are two adjacent guards with different conditions, we can remove one of them and include its condition into the condition of another one. This patch allows InstCombine to merge them by the following pattern:
      guard(a); guard(b) -> guard(a & b).
    Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy
    Reviewed By: sanjoy
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D29378
    llvm-svn: 293778
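The guard-merging pattern above can be illustrated with a toy model. This is only a sketch: `Deopt`, `guard`, and the helper names are hypothetical stand-ins, and the model looks only at whether execution continues or deoptimizes, ignoring deopt state and any extra guard arguments.

```python
from itertools import product


class Deopt(Exception):
    """Hypothetical stand-in for leaving compiled code when a guard fails."""


def guard(cond):
    # Toy model of a guard: execution continues only if cond holds.
    if not cond:
        raise Deopt()


def two_guards(a, b):
    guard(a)
    guard(b)
    return "continued"


def merged_guard(a, b):
    guard(a and b)
    return "continued"


def outcome(fn, a, b):
    # Reduce a run to its observable result in this model.
    try:
        return fn(a, b)
    except Deopt:
        return "deopt"


def guards_equivalent():
    # Compare both forms on every combination of the two conditions.
    return all(
        outcome(two_guards, a, b) == outcome(merged_guard, a, b)
        for a, b in product([False, True], repeat=2)
    )
```

In this model the two forms agree on all four inputs, which is the property the merge relies on.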
* [Instcombine] Combine consecutive identical fences | Davide Italiano | 2017-01-31 | 2 | -0/+10
    Differential Revision: https://reviews.llvm.org/D29314
    llvm-svn: 293661
* Don't combine stores to a swifterror pointer operand to a different type | Arnold Schwaighofer | 2017-01-31 | 1 | -1/+2
    llvm-svn: 293658
* fix formatting; NFC | Sanjay Patel | 2017-01-31 | 5 | -17/+17
    llvm-svn: 293652
* [InstCombine] Make sure that LHS and RHS have the same type in transformToIndexedCompare | Silviu Baranga | 2017-01-31 | 1 | -0/+4
    If they don't have the same type, the size of the constant index would need to be adjusted (and this wouldn't be always possible).
    Alternatively we could try the analysis with the initial RHS value, which would guarantee that the two sides have the same type. However it is unlikely that in practice this would pass our transformation requirements.
    Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).
    llvm-svn: 293629
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -54/+19
    llvm-svn: 293570
* [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -23/+17
    llvm-svn: 293562
* [InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -24/+22
    llvm-svn: 293524
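The fold named in the subject can be checked exhaustively on a narrow type. The sketch below covers only the logical-shift (lshr) case on scalar i8 values and assumes C2 < C1; the function name is made up for illustration, and the actual patch works on LLVM IR with splat vector constants.

```python
BITS = 8
MASK = (1 << BITS) - 1


def lshr_exact_then_shl_holds():
    for x in range(1 << BITS):
        for c1 in range(BITS):
            # 'exact' promises that no set bits are shifted out by C1.
            if x & ((1 << c1) - 1):
                continue
            for c2 in range(c1):  # assumption for this sketch: C2 < C1
                before = ((x >> c1) << c2) & MASK
                after = x >> (c1 - c2)
                # The replacement shift must itself be exact.
                if x & ((1 << (c1 - c2)) - 1):
                    return False
                if before != after:
                    return False
    return True
```

The check passes because an exact shift by C1 means the low C1 bits of X are zero, so shifting right by the smaller amount C1 - C2 loses nothing either.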
* [InstCombine] use auto with obvious type; NFC | Sanjay Patel | 2017-01-30 | 1 | -3/+3
    llvm-svn: 293508
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -20/+16
    llvm-svn: 293507
* [InstCombine] fixed to propagate 'exact' on lshr | Sanjay Patel | 2017-01-30 | 1 | -1/+1
    The original shift is bigger, so this may qualify as 'obvious', but here's an attempt at an Alive-based proof:
      Name: exact
      Pre: (C1 u< C2)
      %a = shl i8 %x, C1
      %b = lshr exact i8 %a, C2
        =>
      %c = lshr exact i8 %x, C2 - C1
      %b = and i8 %c, ((1 << width(C1)) - 1) u>> C2
      Optimization is correct!
    llvm-svn: 293498
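As an informal cross-check of the Alive proof above, the same statement can be brute-forced over all i8 inputs. This is a sketch, not a replacement for Alive: the function name is hypothetical, and inputs where the original `lshr exact` would be poison are simply skipped.

```python
BITS = 8
MASK = (1 << BITS) - 1


def exact_lshr_fold_holds():
    for x in range(1 << BITS):
        for c2 in range(1, BITS):
            for c1 in range(c2):  # precondition: C1 u< C2
                a = (x << c1) & MASK  # %a = shl i8 %x, C1
                if a & ((1 << c2) - 1):
                    continue  # original 'lshr exact' is poison; nothing to preserve
                b_before = a >> c2  # %b = lshr exact i8 %a, C2
                shift = c2 - c1
                if x & ((1 << shift) - 1):
                    return False  # new 'lshr exact' would shift out set bits
                c = x >> shift  # %c = lshr exact i8 %x, C2 - C1
                b_after = c & (MASK >> c2)  # mask is all-ones u>> C2
                if b_before != b_after:
                    return False
    return True
```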
* [InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -25/+25
    llvm-svn: 293489
* [InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats | Sanjay Patel | 2017-01-29 | 1 | -17/+17
    llvm-svn: 293435
* [InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751) | Sanjay Patel | 2017-01-27 | 1 | -10/+21
    This is a minimal patch to avoid the infinite loop in:
    https://llvm.org/bugs/show_bug.cgi?id=31751
    But the general problem is bigger: we're not canonicalizing all of the min/max forms reported by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own inline definitions which may be subtly different from any of the above.
    The reason that the test cases in this patch need a cast op to trigger is because we don't (yet) canonicalize all min/max forms based on matchSelectPattern() in canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on matchSelectPattern() in visitSelectInst().
    The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so I'm moving those behind the min/max fence in visitICmpInst() as the quick fix.
    llvm-svn: 293345
* [NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC. | Justin Lebar | 2017-01-27 | 1 | -0/+1
    llvm-svn: 293253
* [NVPTX] Fix use-after-stack-free bug in InstCombineCalls. | Justin Lebar | 2017-01-27 | 1 | -1/+1
    Introduced in r293244.
    llvm-svn: 293251
* [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls. | Justin Lebar | 2017-01-27 | 1 | -0/+250
    Summary: There are many NVVM intrinsics that we can't entirely get rid of, but that nonetheless often correspond to target-generic LLVM intrinsics.
    For example, if flush denormals to zero (ftz) is enabled, we can convert @llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a non-ftz PTX instruction. In this case, we can, however, simplify the non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.
    These transformations are particularly useful because they let us constant fold instructions that appear in libdevice, the bitcode library that ships with CUDA and essentially functions as its libm.
    Reviewers: tra
    Subscribers: hfinkel, majnemer, llvm-commits
    Differential Revision: https://reviews.llvm.org/D28794
    llvm-svn: 293244
* Revert a couple of InstCombine/Guard checkins | Sanjoy Das | 2017-01-26 | 1 | -29/+0
    This change reverts:
      r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
      r293058: "[InstCombine] Canonicalize guards for AND condition"
    They miscompile cases like:
    ```
    declare void @llvm.experimental.guard(i1, ...)

    define void @test_guard_not_or(i1 %A, i1 %B) {
      %C = or i1 %A, %B
      %D = xor i1 %C, true
      call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
      ret void
    }
    ```
    because they do not transfer the `i32 20, i32 30` parameters to newly created guard instructions.
    llvm-svn: 293227
* [InstCombine] fold (X >>u C) << C --> X & (-1 << C) | Sanjay Patel | 2017-01-26 | 1 | -18/+17
    We already have this fold when the lshr has one use, but it doesn't need that restriction. We may be able to remove some code from foldShiftedShift().
    Also, move the similar:
      (X << C) >>u C --> X & (-1 >>u C)
    ...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().
    That whole function seems questionable since it is called by commonShiftTransforms(), but there's really not much in common if we're checking the shift opcodes for every fold.
    llvm-svn: 293215
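Both shift-pair identities mentioned above are easy to confirm by exhaustive search on i8. A minimal sketch, with a made-up function name and plain Python integers masked to 8 bits standing in for LLVM values:

```python
BITS = 8
MASK = (1 << BITS) - 1  # the i8 value -1, viewed as unsigned


def shift_pair_folds_hold():
    for x in range(1 << BITS):
        for c in range(BITS):
            # (X >>u C) << C  -->  X & (-1 << C)
            if ((x >> c) << c) & MASK != x & ((MASK << c) & MASK):
                return False
            # (X << C) >>u C  -->  X & (-1 >>u C)
            if ((x << c) & MASK) >> c != x & (MASK >> c):
                return False
    return True
```

Neither identity depends on how many users the inner shift has, which matches the commit's point that the one-use restriction is not needed for correctness.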
* [InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors | Sanjay Patel | 2017-01-26 | 1 | -16/+24
    llvm-svn: 293208
* [X86] Add demanded elts support for the inputs to pclmul intrinsic | Craig Topper | 2017-01-26 | 1 | -0/+38
    This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors.
    Differential Revision: https://reviews.llvm.org/D28979
    llvm-svn: 293151
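To make the demanded-elements idea concrete, here is a small model of the instruction. It is a sketch under assumptions: each input is treated as two 64-bit elements, bit 0 of the immediate is taken to select the element of the first input and bit 4 the element of the second (as in the PCLMULQDQ instruction), and all function names are hypothetical rather than LLVM's.

```python
def pclmul_demanded_elts(imm):
    """Return the index of the single 64-bit element read from each input."""
    arg0_elt = imm & 1         # bit 0 selects the element of the first input
    arg1_elt = (imm >> 4) & 1  # bit 4 selects the element of the second input
    return arg0_elt, arg1_elt


def clmul64(a, b):
    # Carry-less (XOR-based) multiplication of two 64-bit values.
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i
    return result


def pclmulqdq(v0, v1, imm):
    # v0 and v1 are two-element lists of 64-bit integers.
    e0, e1 = pclmul_demanded_elts(imm)
    return clmul64(v0[e0], v1[e1])
```

Because only one element of each input contributes to the result, the other element can be replaced with anything (for example undef) without changing the output.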
* [InstCombine] Canonicalize guards for NOT OR condition | Artur Pilipenko | 2017-01-25 | 1 | -0/+12
    This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine
    Reviewed By: apilipenko
    Differential Revision: https://reviews.llvm.org/D29075
    Patch by Maxim Kazantsev.
    llvm-svn: 293061
* [InstCombine][SSE] Add support for PACKSS/PACKUS constant folding | Simon Pilgrim | 2017-01-25 | 1 | -0/+94
    Differential Revision: https://reviews.llvm.org/D28949
    llvm-svn: 293060
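Constant folding a pack amounts to evaluating the saturating truncation at compile time. The sketch below models that arithmetic only; it assumes the usual x86 layout (within each 128-bit lane, the saturated elements of the first input come before those of the second), and `pack` and its parameters are illustrative names, not the code added by the patch.

```python
def sat_signed(value, bits):
    # Clamp to the signed range of a 'bits'-wide integer.
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def sat_unsigned(value, bits):
    # Clamp to the unsigned range of a 'bits'-wide integer.
    return max(0, min((1 << bits) - 1, value))


def pack(a, b, dst_bits, signed_result, lanes=1):
    """Pack signed source elements of a and b into saturated narrow elements."""
    saturate = sat_signed if signed_result else sat_unsigned
    per_lane = len(a) // lanes
    out = []
    for lane in range(lanes):
        chunk = slice(lane * per_lane, (lane + 1) * per_lane)
        out += [saturate(v, dst_bits) for v in a[chunk]]
        out += [saturate(v, dst_bits) for v in b[chunk]]
    return out
```

For example, packing the 16-bit values 300 and -300 to 8 bits gives 127 and -128 with signed saturation, and 255 and 0 with unsigned saturation.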
* [InstCombine] Canonicalize guards for AND condition | Artur Pilipenko | 2017-01-25 | 1 | -0/+17
    This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine
    Reviewed By: apilipenko
    Differential Revision: https://reviews.llvm.org/D29074
    Patch by Maxim Kazantsev.
    llvm-svn: 293058
* [InstCombine] Allow InstrCombine to remove one of adjacent guards if they are equivalent | Artur Pilipenko | 2017-01-25 | 1 | -0/+10
    This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine
    Reviewed By: majnemer, apilipenko
    Differential Revision: https://reviews.llvm.org/D29071
    Patch by Maxim Kazantsev.
    llvm-svn: 293056
* Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one. | Amaury Sechet | 2017-01-24 | 1 | -3/+2
    Summary: As per title. This will add the instructions we are interested in to the worklist.
    Reviewers: mehdi_amini, majnemer, andreadb
    Differential Revision: https://reviews.llvm.org/D29081
    llvm-svn: 292957
* Fix formating in foldSelectCttzCtlz. NFC | Amaury Sechet | 2017-01-24 | 1 | -1/+1
    llvm-svn: 292934
* [InstCombine][X86] MULDQ/MULUDQ undef -> zero | Simon Pilgrim | 2017-01-24 | 2 | -7/+1
    Added early out for single undef input. We were already supporting (and testing) this in the constant folding code; we just do it quicker now.
    Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst.
    llvm-svn: 292913
* SimplifyLibCalls: Replace more unary libcalls with intrinsics | Matt Arsenault | 2017-01-23 | 2 | -2/+16
    llvm-svn: 292855
* [InstCombine][X86] Add MULDQ/MULUDQ constant folding support | Simon Pilgrim | 2017-01-23 | 1 | -3/+40
    llvm-svn: 292793
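What such a constant fold has to compute can be sketched as follows. The model views each input as a list of 64-bit elements of which only the low 32 bits are read, producing full 64-bit products; `pmuludq` and `pmuldq` here are illustrative Python helpers named after the instructions, not LLVM functions.

```python
M32 = (1 << 32) - 1
M64 = (1 << 64) - 1


def sext32(value):
    # Interpret the low 32 bits as a signed integer.
    value &= M32
    return value - (1 << 32) if value & 0x80000000 else value


def pmuludq(a, b):
    # Unsigned: zero-extend the low 32 bits of each element, then multiply.
    return [((x & M32) * (y & M32)) & M64 for x, y in zip(a, b)]


def pmuldq(a, b):
    # Signed: sign-extend the low 32 bits of each element, then multiply.
    return [(sext32(x) * sext32(y)) & M64 for x, y in zip(a, b)]
```

The two variants differ only when a low half has its top bit set: with 0xFFFFFFFF and 2, the unsigned form yields 0x1FFFFFFFE, while the signed form yields -2 as a 64-bit value.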
* [InstCombine][X86] MULDQ/MULUDQ undef -> zero | Simon Pilgrim | 2017-01-23 | 1 | -2/+2
    Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same
    llvm-svn: 292784
* [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants | Sanjay Patel | 2017-01-21 | 1 | -21/+28
    We may be able to assert that no shl-shl or lshr-lshr pairs ever get here because we should have already handled those in foldShiftedShift().
    llvm-svn: 292726
* [InstCombine][X86] Add MULDQ/MULUDQ undef handling | Simon Pilgrim | 2017-01-20 | 2 | -0/+21
    llvm-svn: 292627
* [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions | Simon Pilgrim | 2017-01-20 | 1 | -0/+54
    Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.
    Differential Revision: https://reviews.llvm.org/D28777
    llvm-svn: 292591
* [InstCombine] Simplify gep (gep p, a), (b-a) | Davide Italiano | 2017-01-19 | 1 | -19/+13
    Patch by Andrea Canciani.
    Differential Revision: https://reviews.llvm.org/D27413
    llvm-svn: 292506
* [InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1 | Sanjay Patel | 2017-01-19 | 1 | -24/+43
    Try harder to fold icmp with shl nsw as discussed here:
    http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html
    This is similar to the 'shl nuw' transforms that were added with D25913.
    This may eventually help solve:
    https://llvm.org/bugs/show_bug.cgi?id=30773
    Differential Revision: https://reviews.llvm.org/D28406
    llvm-svn: 292492
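One instance of this fold can be verified by brute force. The sketch below checks only the signed-greater-than predicate on i8, assumes `>>` in the subject means an arithmetic shift, and skips inputs where `shl nsw` would overflow; other predicates may need extra conditions that are not modeled here, and the function name is hypothetical.

```python
BITS = 8
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def icmp_sgt_shl_nsw_fold_holds():
    for x in range(SMIN, SMAX + 1):
        for c1 in range(1, BITS):
            shifted = x << c1
            if not SMIN <= shifted <= SMAX:
                continue  # 'shl nsw' would be poison; nothing to preserve
            for c0 in range(SMIN, SMAX + 1):
                before = shifted > c0      # icmp sgt (shl nsw X, C1), C0
                after = x > (c0 >> c1)     # icmp sgt X, (C0 >>s C1)
                if before != after:
                    return False
    return True
```

Python's `>>` on negative integers rounds toward negative infinity, which matches an arithmetic shift, so no separate sign handling is needed in the model.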
* [InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI | Sanjay Patel | 2017-01-18 | 1 | -1/+9
    llvm-svn: 292440
* [InstCombine] remove a redundant check; NFCI | Sanjay Patel | 2017-01-18 | 1 | -2/+0
    I missed deleting this check when I refactored this chunk in:
    https://reviews.llvm.org/rL292260
    llvm-svn: 292433
* [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles | Simon Pilgrim | 2017-01-18 | 1 | -1/+4
    Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.
    llvm-svn: 292371
* [InstCombine] Remove unnecessary intrinsics demanded elts handling | Simon Pilgrim | 2017-01-18 | 1 | -22/+2
    As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need.
    llvm-svn: 292365
* [InstCombine] refactor foldICmpShlConstant(); NFCI | Sanjay Patel | 2017-01-17 | 1 | -32/+35
    This reduces the size of and increases the symmetry with the planned functional change in:
    https://reviews.llvm.org/D28406
    llvm-svn: 292260
* [InstCombine] Fold ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) | David Majnemer | 2017-01-17 | 1 | -15/+28
    This further extends r292179 to support additional binary operators beyond subtraction.
    llvm-svn: 292238
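The reason this narrowing is sound can be sketched numerically: for wrapping operators, the low bits of the result depend only on the low bits of the operands. The check below assumes X is i8 zero-extended to i16, that C2 fits in the narrow type, and that C1 is truncated to the narrow type; the operator list is chosen for illustration and is not claimed to match the set handled by the patch.

```python
import operator

NARROW_MASK = (1 << 8) - 1
WIDE_MASK = (1 << 16) - 1

# Operators whose low result bits depend only on the low operand bits.
OPS = (operator.add, operator.sub, operator.mul,
       operator.and_, operator.or_, operator.xor)


def zext_fold_holds():
    for op in OPS:
        for c1 in range(0, 1 << 16, 4099):        # sampled wide constants
            for c2 in range(0, 1 << 8, 17):       # assumption: C2 fits in the narrow type
                for x in range(1 << 8):
                    # (C1 OP zext(X)) & C2, computed in the wide type
                    before = (op(c1, x) & WIDE_MASK) & c2
                    # zext((C1 OP X) & C2), computed in the narrow type
                    after = (op(c1 & NARROW_MASK, x) & NARROW_MASK) & c2
                    if before != after:
                        return False
    return True
```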
* [InstCombine] reduce indent; NFCI | Sanjay Patel | 2017-01-17 | 1 | -133/+131
    llvm-svn: 292230
* [InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions | Simon Pilgrim | 2017-01-17 | 2 | -2/+20
    Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded.
    llvm-svn: 292209
* [InstCombine] Don't DSE across readnone functions that may throw | Sanjoy Das | 2017-01-17 | 1 | -5/+5
    Summary: Depends on D28740
    Reviewers: dberlin, chandlerc, hfinkel, majnemer
    Subscribers: mcrosier, llvm-commits
    Differential Revision: https://reviews.llvm.org/D28742
    llvm-svn: 292197