path: root/llvm/lib/Transforms/InstCombine
Commit message | Author | Age | Files | Lines
* [InstCombine] fix bug when offsetting case values of a switch (PR31260) | Sanjay Patel | 2016-12-12 | 1 | -25/+15
  We could truncate the condition and then try to fold the add into the original condition value causing wrong case constants to be used. Move the offset transform ahead of the truncate transform and return after each transform, so there's no chance of getting confused values.
  Fix for: https://llvm.org/bugs/show_bug.cgi?id=31260
  llvm-svn: 289442
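The offset transform described above can be illustrated with a minimal sketch. It assumes 8-bit wrapping arithmetic and a hypothetical offset of 10; the function names are illustrative and are not taken from the LLVM sources. A switch on `x + 10` with cases 12 and 15 behaves the same as a switch on `x` with cases 2 and 5.

```cpp
#include <cassert>
#include <cstdint>

// Original form: the switch condition is (x + 10), truncated to 8 bits.
int classifyWithAdd(std::uint8_t x) {
  switch (static_cast<std::uint8_t>(x + 10)) {
  case 12: return 1;
  case 15: return 2;
  default: return 0;
  }
}

// Folded form: the add is gone and each case constant has 10 subtracted.
int classifyFolded(std::uint8_t x) {
  switch (x) {
  case 2: return 1;  // 12 - 10
  case 5: return 2;  // 15 - 10
  default: return 0;
  }
}

// Compare both forms over every 8-bit input.
bool switchOffsetFoldHolds() {
  for (int v = 0; v < 256; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    if (classifyWithAdd(x) != classifyFolded(x))
      return false;
  }
  return true;
}
```

The case constants must be adjusted at the same bit width as the value that is actually switched on, which is presumably why the ordering against the truncate transform matters.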
* [InstCombine] clean up range-for-loops in visitSwitchInst(); NFCI | Sanjay Patel | 2016-12-12 | 1 | -7/+7
  llvm-svn: 289439
* [InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. | Craig Topper | 2016-12-11 | 1 | -2/+14
  So we should return a zero vector instead.
  llvm-svn: 289411
* [X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts. | Craig Topper | 2016-12-11 | 1 | -0/+29
  This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them.
  llvm-svn: 289377
* [X86][InstCombine] Teach InstCombineCalls to simplify demanded elements for scalar FMA intrinsics. | Craig Topper | 2016-12-11 | 1 | -0/+8
  These intrinsics don't read the upper bits of their second and third inputs so we can try to simplify them.
  llvm-svn: 289372
* [AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded for scalar cmp intrinsics with masking and rounding. | Craig Topper | 2016-12-11 | 1 | -1/+3
  These intrinsics don't read the upper elements of their first and second input. These are slightly different from the SSE version, which does use the upper bits of its first element as passthru bits since the result goes to an XMM register. For AVX-512 the result goes to a mask register instead.
  llvm-svn: 289371
* [AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded elements for scalar add,div,mul,sub,max,min intrinsics with masking and rounding. | Craig Topper | 2016-12-11 | 1 | -0/+31
  These intrinsics don't read the upper bits of their second input. And the third input is the passthru for masking and that only uses the lower element as well.
  llvm-svn: 289370
* [AVX-512][InstCombine] Add 512-bit vpermilvar intrinsics to InstCombineCalls to match 128 and 256-bit. | Craig Topper | 2016-12-11 | 1 | -10/+10
  llvm-svn: 289354
* [X86][InstCombine] Teach InstCombineCalls to turn pshufb intrinsic into a shufflevector if the indices are constant. | Craig Topper | 2016-12-11 | 1 | -2/+3
  llvm-svn: 289348
* [InstCombine] add helper for shift-by-shift folds; NFCI | Sanjay Patel | 2016-12-10 | 1 | -150/+162
  These are currently limited to integer types, but we should be able to extend to splat vectors and possibly general vectors.
  llvm-svn: 289343
* [InstCombine] change select type to eliminate bitcasts | Sanjay Patel | 2016-12-03 | 1 | -0/+47
  This solves a secondary problem seen in PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137#c6
  This is similar to the bitwise logic op fold added with: https://reviews.llvm.org/rL287707
  And like that patch, I'm artificially restricting the transform from vector <-> scalar types until we're sure that the backend can handle that.
  llvm-svn: 288584
* IR: Change the gep_type_iterator API to avoid always exposing the "current" type. | Peter Collingbourne | 2016-12-02 | 2 | -4/+4
  Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType.
  Differential Revision: https://reviews.llvm.org/D26594
  llvm-svn: 288458
* [PR29121] Don't fold if it would produce atomic vector loads or stores | Philip Reames | 2016-12-01 | 1 | -14/+28
  The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal.
  Differential Revision: https://reviews.llvm.org/D24365
  llvm-svn: 288415
* [InstCombine] allow more narrowing transforms for logic ops | Sanjay Patel | 2016-11-30 | 2 | -9/+24
  We had a limited version of this for scalar 'and'; this expands the transform to 'or' and 'xor' and allows vector types too.
  llvm-svn: 288273
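As a rough illustration of why narrowing a bitwise logic op is safe, the sketch below assumes the transform has the shape trunc(logic(X, C)) to logic(trunc(X), trunc(C)); that shape is an assumption, since the commit message does not spell it out. The enum and function names are hypothetical. Each result bit of 'and', 'or' and 'xor' depends only on the same bit of the inputs, so truncating before or after the op gives the same low bits.

```cpp
#include <cassert>
#include <cstdint>

enum class LogicOp { And, Or, Xor };

// Wide form: perform the op at 32 bits, then truncate to 8 bits.
std::uint8_t truncOfLogic(LogicOp op, std::uint32_t x, std::uint32_t c) {
  switch (op) {
  case LogicOp::And: return static_cast<std::uint8_t>(x & c);
  case LogicOp::Or:  return static_cast<std::uint8_t>(x | c);
  case LogicOp::Xor: return static_cast<std::uint8_t>(x ^ c);
  }
  return 0;
}

// Narrow form: truncate both operands first, then perform the op at 8 bits.
std::uint8_t logicOfTrunc(LogicOp op, std::uint32_t x, std::uint32_t c) {
  auto nx = static_cast<std::uint8_t>(x);
  auto nc = static_cast<std::uint8_t>(c);
  switch (op) {
  case LogicOp::And: return static_cast<std::uint8_t>(nx & nc);
  case LogicOp::Or:  return static_cast<std::uint8_t>(nx | nc);
  case LogicOp::Xor: return static_cast<std::uint8_t>(nx ^ nc);
  }
  return 0;
}

// Check agreement for a fixed constant over a range of 17-bit inputs.
bool narrowingHolds(LogicOp op, std::uint32_t c) {
  for (std::uint32_t x = 0; x < 0x20000u; ++x)
    if (truncOfLogic(op, x, c) != logicOfTrunc(op, x, c))
      return false;
  return true;
}
```

Whether the narrow form is actually cheaper depends on the target, which is a separate question from the equivalence checked here.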
* [InstSimplify] allow integer vector types to use computeKnownBits | Sanjay Patel | 2016-11-27 | 1 | -5/+5
  Note that the non-splat lshr+lshr test folded, but that does not work in general. Something is missing or wrong in computeKnownBits as the non-splat shl+shl test still shows.
  llvm-svn: 288005
* [InstCombine] don't drop metadata in FoldOpIntoSelect() | Sanjay Patel | 2016-11-26 | 1 | -3/+3
  llvm-svn: 287980
* add optional param to copy metadata when creating selects; NFC | Sanjay Patel | 2016-11-26 | 1 | -7/+3
  There are other spots where we can use this; we're currently dropping metadata in some places, and there are proposed changes where we will want to propagate metadata.
  IRBuilder's CreateSelect() already has a parameter like this, so this change makes the regular 'Create' API line up with that.
  llvm-svn: 287976
* Replace some callers of setTailCall with setTailCallKind | David Majnemer | 2016-11-25 | 1 | -6/+5
  We were a little sloppy with adding tailcall markers. Be more consistent by using setTailCallKind instead of setTailCall.
  llvm-svn: 287955
* add and use isBitwiseLogicOp() helper function; NFCI | Sanjay Patel | 2016-11-22 | 3 | -33/+14
  llvm-svn: 287712
* [InstCombine] change bitwise logic type to eliminate bitcasts | Sanjay Patel | 2016-11-22 | 1 | -0/+43
  In PR27925: https://llvm.org/bugs/show_bug.cgi?id=27925
  ...we proposed adding this fold to eliminate a bitcast. In D20774, there was some concern about changing the type of a bitwise op as well as creating bitcasts that might not be free for a target. However, if we're strictly eliminating an instruction (by limiting this to one-use ops), then we should be able to do this in InstCombine.
  But we're cautiously restricting the transform for now to vector types to avoid possible backend problems. A transform to make sure the logic op is legal for the target should be added to reverse this transform and improve codegen.
  Differential Revision: https://reviews.llvm.org/D26641
  llvm-svn: 287707
* [InstCombine] canonicalize min/max constant to select's false value | Sanjay Patel | 2016-11-21 | 1 | -0/+42
  This is a first step towards canonicalization and improved folding/codegen for integer min/max as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html
  Here, we're just matching the simplest min/max patterns and adjusting the icmp predicate while swapping the select operands.
  I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll so it's easier to see how this might be extended (corresponds to the TODO comment in the code). That's also why I'm using matchSelectPattern() rather than a simpler check; once the backend is patched, we can just remove some of the restrictions to allow the obfuscated min/max patterns in the FIXME tests to be matched.
  Differential Revision: https://reviews.llvm.org/D26525
  llvm-svn: 287585
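The operand swap with an adjusted predicate can be sketched in plain C++. This is only an illustration with a hypothetical constant 42 and made-up function names; the exact predicates chosen by the patch are not stated in the commit message. A signed max written with the constant in the true arm is equivalent to one with the constant in the false arm once the comparison is flipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Constant in the select's true arm: (x < 42) ? 42 : x
std::int32_t maxConstInTrueArm(std::int32_t x) { return (x < 42) ? 42 : x; }

// Operands swapped, predicate adjusted: (x > 42) ? x : 42
// Strictly inverting "<" gives ">=", but at x == 42 both arms are equal,
// so ">" yields the same result.
std::int32_t maxConstInFalseArm(std::int32_t x) { return (x > 42) ? x : 42; }

// Check that both forms agree with std::max over a sample range.
bool minMaxCanonicalizationHolds() {
  for (std::int32_t x = -1000; x <= 1000; ++x) {
    if (maxConstInTrueArm(x) != maxConstInFalseArm(x))
      return false;
    if (maxConstInFalseArm(x) != std::max<std::int32_t>(x, 42))
      return false;
  }
  return true;
}
```

Settling on one of the two equivalent forms means later folds only have to recognize a single pattern.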
* fix formatting; NFC | Sanjay Patel | 2016-11-21 | 1 | -1/+0
  llvm-svn: 287582
* Fix spelling mistakes in Transforms comments. NFC. | Simon Pilgrim | 2016-11-20 | 1 | -1/+1
  Identified by Pedro Giffuni in PR27636.
  llvm-svn: 287488
* [InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements. | Craig Topper | 2016-11-18 | 1 | -0/+18
  This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional intrinsics to the switches.
  llvm-svn: 287316
* [CMake] NFC. Updating CMake dependency specifications | Chris Bieneman | 2016-11-17 | 1 | -2/+3
  This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
  llvm-svn: 287206
* [InstCombine] replace unreachable with assert and remove unreachable code; NFCI | Sanjay Patel | 2016-11-16 | 1 | -20/+9
  llvm-svn: 287147
* [InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC | Sanjay Patel | 2016-11-16 | 1 | -11/+15
  llvm-svn: 287145
* [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul | Craig Topper | 2016-11-16 | 2 | -72/+0
  Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clang's intrinsic file.
  Reviewers: zvi, delena, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26660
  llvm-svn: 287083
* [InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics. | Craig Topper | 2016-11-13 | 1 | -4/+18
  llvm-svn: 286755
* [InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value. | Craig Topper | 2016-11-13 | 1 | -1/+45
  This does not include support for the AVX-512 variable shifts. That will be coming in a future patch.
  llvm-svn: 286739
* [InstCombine] use dyn_cast rather than isa+cast; NFC | Sanjay Patel | 2016-11-11 | 1 | -2/+2
  Follow-up to r286664 cleanup as suggested by Eli. Thanks!
  llvm-svn: 286671
* [InstCombine] clean up foldSelectOpOp(); NFC | Sanjay Patel | 2016-11-11 | 1 | -10/+4
  llvm-svn: 286664
* [InstCombine] fix formatting of FoldOpIntoSelect(); NFCI | Sanjay Patel | 2016-11-11 | 1 | -41/+43
  llvm-svn: 286604
* [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923) | Sanjay Patel | 2016-11-10 | 1 | -0/+8
  Removing the limitation in visitInsertElementInst() causes several regressions because we're not prepared to fold sequences of shuffles or inserts and extracts separated by shuffles. Fixing that appears to be a difficult mission because we are purposely trying to avoid creating shuffles with arbitrary shuffle masks because some targets may choke on those.
  https://llvm.org/bugs/show_bug.cgi?id=30923
  llvm-svn: 286423
* [InstCombine] fix profitability equation for max-of-nots transform | Sanjay Patel | 2016-11-09 | 1 | -7/+6
  As the test change shows, we can increase the critical path by adding a 'not' instruction, so make sure that we're actually removing an instruction if we do this transform.
  This transform could also cause us to miss folds of min/max pairs.
  llvm-svn: 286315
* [InstCombine] reduce indentation; NFC | Sanjay Patel | 2016-11-08 | 1 | -23/+20
  llvm-svn: 286314
* [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732) | Sanjay Patel | 2016-11-07 | 1 | -14/+12
  This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732.
  llvm-svn: 286113
* Revert "[InstCombine] allow splat vector folds in adjustMinMax()" | Greg Bedwell | 2016-11-02 | 1 | -10/+14
  This reverts commit r285732.
  This change introduced a new assertion failure in the following testcase at -O2:
      typedef short __v8hi __attribute__((__vector_size__(16)));
      __v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
        __v8hi Result = V1;
        if (mask & 0x80)
          Result[0] = V2[0];
        return Result;
      }
  llvm-svn: 285866
* [InstCombine] allow splat vector folds in adjustMinMax() | Sanjay Patel | 2016-11-01 | 1 | -14/+10
  llvm-svn: 285732
* [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons. | Sanjay Patel | 2016-11-01 | 1 | -0/+17
  This transforms
      %a = shl nuw %x, c1
      %b = icmp {ugt|ule} %a, c0
  into
      %b = icmp {ugt|ule} %x, (c0 >> c1)
  z3:
      (declare-const x (_ BitVec 64))
      (declare-const c0 (_ BitVec 64))
      (declare-const c1 (_ BitVec 64))
      (push)
      (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw
      (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1)))))
      (check-sat)
      (get-model)
      (pop)
      (push)
      (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw
      (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1)))))
      (check-sat)
      (get-model)
      (pop)
  Patch by bryant!
  Differential Revision: https://reviews.llvm.org/D25913
  llvm-svn: 285729
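The z3 queries above check the fold symbolically at 64 bits. As a complementary sketch, the same property can be brute-forced at 8 bits in C++; the reduced width and the function names are choices made for this illustration, not part of the patch. The nuw condition is modeled by skipping inputs whose shift would lose set bits.

```cpp
#include <cassert>
#include <cstdint>

// Compare "icmp ugt (shl x, c1), c0" with "icmp ugt x, (c0 >> c1)" using a
// wrapping 8-bit shift and NO nuw guarantee; used to show why nuw is needed.
bool ugtFoldAgreesWithoutNuw(std::uint8_t x, unsigned c1, std::uint8_t c0) {
  auto shifted = static_cast<std::uint8_t>(x << c1);
  bool before = shifted > c0;
  bool after = x > (c0 >> c1);
  return before == after;
}

// Exhaustively check both predicates for every 8-bit x and c0 and every
// shift amount 0..7, restricted to shifts that satisfy nuw.
bool shlNuwFoldHolds() {
  for (unsigned c1 = 0; c1 < 8; ++c1) {
    for (unsigned x = 0; x < 256; ++x) {
      unsigned shifted = x << c1;
      if (shifted > 0xFFu)
        continue;  // set bits would be shifted out of 8 bits: nuw violated
      for (unsigned c0 = 0; c0 < 256; ++c0) {
        bool ugtBefore = shifted > c0;
        bool ugtAfter = x > (c0 >> c1);
        bool uleBefore = shifted <= c0;
        bool uleAfter = x <= (c0 >> c1);
        if (ugtBefore != ugtAfter || uleBefore != uleAfter)
          return false;
      }
    }
  }
  return true;
}
```

Without nuw the fold breaks: for x = 0x80, c1 = 1, c0 = 0 the wrapped shift gives 0, so the original compare is false while the folded one is true.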
* [InstCombine] clean up adjustMinMax(); NFCI | Sanjay Patel | 2016-11-01 | 1 | -92/+87
  1. Change param names for readability
  2. Change pointer param to ref
  3. Early exit to reduce indent
  4. Change switch to if/else
  llvm-svn: 285718
* [InstCombine] add helper function for adjustMinMax(); NFCI | Sanjay Patel | 2016-11-01 | 1 | -6/+19
  This is just a cut and paste; clean-up and enhancements to follow.
  llvm-svn: 285715
* [InstCombine] Folding of shifts by the sum of positive values | Simon Pilgrim | 2016-11-01 | 1 | -1/+10
  This patch introduces the combine:
      (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive
  If both A and C2 are known to be positive then we can safely split into 2 shifts, permitting the folding of the inner shift.
  Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs are positive).
  Differential Revision: https://reviews.llvm.org/D26000
  llvm-svn: 285696
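The split of one shift into two can be checked with a small C++ sketch. It assumes 32-bit unsigned values, non-negative shift amounts, and a total shift below the bit width; the function name is illustrative only. Once `C1 shift C2` is a shift of two constants, it can be folded at compile time, leaving a single shift by A.

```cpp
#include <cassert>
#include <cstdint>

// For a given constant c1, check that shifting by (a + c2) equals shifting
// by c2 and then by a, for both left and logical right shifts. Only
// combinations with a + c2 < 32 are tested, so every shift is well defined.
bool shiftBySumSplitHolds(std::uint32_t c1) {
  for (unsigned c2 = 0; c2 < 32; ++c2) {
    for (unsigned a = 0; a + c2 < 32; ++a) {
      if ((c1 << (a + c2)) != ((c1 << c2) << a))
        return false;
      if ((c1 >> (a + c2)) != ((c1 >> c2) >> a))
        return false;
    }
  }
  return true;
}
```

The sketch only covers non-negative amounts. With a negative A the sum could still be a valid shift amount while the separate shift by A would not be, which is one reading of why the patch requires both values to be positive.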
* [InstCombine] re-use bitcasted compare operands in selects (PR28001) | Sanjay Patel | 2016-10-29 | 1 | -0/+50
  These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.
  The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6
  Differential Revision: https://reviews.llvm.org/D25943
  llvm-svn: 285495
* [InstCombine] fix foldSPFofSPF() to handle vector splats | Sanjay Patel | 2016-10-27 | 1 | -22/+18
  llvm-svn: 285345
* [InstCombine] handle simple vector integer constants in IsFreeToInvert | Sanjay Patel | 2016-10-27 | 1 | -0/+18
  llvm-svn: 285318
* [InstCombine] clean up commonCastTransforms; NFC | Sanjay Patel | 2016-10-26 | 1 | -11/+9
  1. Use 'auto' with dyn_cast.
  2. Variables start with a capital letter.
  3. Use proper punctuation in comments.
  llvm-svn: 285200
* [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996 | Guozhi Wei | 2016-10-25 | 2 | -0/+128
  The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause an infinite loop inside the compiler: https://llvm.org/bugs/show_bug.cgi?id=27996.
  The problem is with the following code:
      xB = load (type B);
      xA = load (type A);
      +yA = (A)xB; B -> A
      +zAn = PHI[yA, xA]; PHI
      +zBn = (B)zAn; // A -> B
      store zAn;
      store zBn;
  optimizeBitCastFromPhi generates
      +zBn = (B)zAn; // A -> B
  and expects it will be combined with the following store instruction into another
      store zAn
  Unfortunately, before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.
  optimizeBitCastFromPhi only generates BitCasts for load/store instructions, only the BitCast before a store can cause the re-execution of optimizeBitCastFromPhi, and a BitCast before a store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution is: if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.
  Differential Revision: https://reviews.llvm.org/D23896
  llvm-svn: 285116
* [InstCombine] Ensure that truncated int types are legal. | Sanjay Patel | 2016-10-25 | 1 | -4/+2
  Fixes the FIXMEs in D25952 and rL285075.
  Patch by bryant!
  Differential Revision: https://reviews.llvm.org/D25955
  llvm-svn: 285108
* fix formatting; NFC | Sanjay Patel | 2016-10-25 | 1 | -13/+13
  llvm-svn: 285078