summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombine] narrowExtractedVectorBinOp - pull out repeated getOpcode(). NFCI.Simon Pilgrim2019-06-211-2/+2
| | | | llvm-svn: 364076
* [DAGCombine] narrowInsertExtractVectorBinOp - reuse "extract from insert" ↵Simon Pilgrim2019-06-211-11/+15
| | | | | | | | detection code. Move the "extract from insert detection code" into a lambda helper function. llvm-svn: 364059
* [DAGCombiner] Use getAPIntValue() instead of getZExtValue() where possible.Simon Pilgrim2019-06-201-21/+20
| | | | | | Better handling of out-of-i64-range values due to large integer types or from fuzz tests. llvm-svn: 363955
* [DAGCombiner][NFC] Remove unused varJordan Rupprecht2019-06-201-1/+0
| | | | llvm-svn: 363954
* [DAGCombiner] Support (shl (zext (srl x, C)), C) -> (zext (shl (srl x, C), ↵Simon Pilgrim2019-06-201-17/+19
| | | | | | | | C)) non-uniform folds. Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. llvm-svn: 363929
* [DAGCombine] Add TODOs for some combines that should support non-uniform vectorsSimon Pilgrim2019-06-201-0/+15
| | | | | | We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc. llvm-svn: 363924
* [DAGCombine] Reduce scope of ShAmtVal variable. NFCI.Simon Pilgrim2019-06-201-2/+1
| | | | | | | | Fixes cppcheck warning. Use the more capable getAPIntVal() instead of getZExtValue() as well since I'm here. llvm-svn: 363921
* [DAGCombine] Use ConstantSDNode::getAPIntValue() instead of getZExtValue().Simon Pilgrim2019-06-191-2/+2
| | | | | | Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code or someone decides to create i65536 or something..... llvm-svn: 363887
* [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, ↵Simon Pilgrim2019-06-191-16/+15
| | | | | | | | c2)) non-uniform folds. Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. llvm-svn: 363793
* [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> 0 non-uniform folds.Simon Pilgrim2019-06-191-7/+23
| | | | | | | | Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. This requires us to tweak matchBinaryPredicate to allow it to (optionally) handle constants with different type widths. llvm-svn: 363792
* [DAGCombiner] visitSHL - pull out repeated shift amount VT. NFCI.Simon Pilgrim2019-06-191-6/+6
| | | | llvm-svn: 363789
* [DAGCombine] Fix (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) ↵Simon Pilgrim2019-06-191-1/+2
| | | | | | | | comment. NFCI. We pre-extend, not post. llvm-svn: 363787
* [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splittingLuis Marques2019-06-171-3/+63
| | | | | | | | | | | | | | | Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can prevent important optimizations, such as preventing the element indices / member offsets from being (partially) folded into load/store instruction immediates. This patch: - Makes the splits also occur in the cases where the base address and the GEP are in the same BB. - Ensures that the DAGCombiner doesn't reassociate them back again. Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544
* adding more fmf propagation for selects plus updated testsMichael Berg2019-06-151-20/+34
| | | | llvm-svn: 363484
* Revert "adding more fmf propagation for selects plus tests"Fangrui Song2019-06-151-34/+20
| | | | | | | | | | | This reverts rL363474. -debug-only=isel was added to some tests that don't specify `REQUIRES: asserts`. This causes failures on -DLLVM_ENABLE_ASSERTIONS=off builds. I chose to revert instead of fixing the tests because I'm not sure whether we should add `REQUIRES: asserts` to more tests. llvm-svn: 363482
* adding more fmf propagation for selects plus testsMichael Berg2019-06-141-20/+34
| | | | llvm-svn: 363474
* [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests ↵Simon Pilgrim2019-06-121-1/+2
| | | | | | | | | | | | | | (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179
* [TargetLowering] Add allowsMemoryAccess(MachineMemOperand) helper wrapper. NFCI.Simon Pilgrim2019-06-111-32/+30
| | | | | | As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075. llvm-svn: 363048
* [DAGCombine] GetNegatedExpression - constant float vector support (PR42105)Simon Pilgrim2019-06-111-9/+40
| | | | | | | | Add support for negation of constant build vectors. Differential Revision: https://reviews.llvm.org/D62963 llvm-svn: 363040
* [DAGCombine] Match a pattern where a wide type scalar value is stored by ↵QingShan Zhang2019-06-101-0/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | several narrow stores This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c static void store64(u64 x, unsigned char* y) { for(int i = 0; i != 8; ++i) y[i] = (x >> ((7-i) * 8)) & 255; } static u64 load64(const unsigned char* y) { u64 res = 0; for(int i = 0; i != 8; ++i) res |= (u64)(y[i]) << ((7-i) * 8); return res; } The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the targets supports it. Assuming little endian target: i8 *p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; > *((i32)p) = val; i8 *p = ... i32 val = ... p[0] = (val >> 24) & 0xFF; p[1] = (val >> 16) & 0xFF; p[2] = (val >> 8) & 0xFF; p[3] = (val >> 0) & 0xFF; > *((i32)p) = BSWAP(val); Differential Revision: https://reviews.llvm.org/D62897 llvm-svn: 362921
* Use for-range loop. NFCI.Simon Pilgrim2019-06-091-3/+1
| | | | llvm-svn: 362897
* [DAGCombine] visitAND - merge (zext_inreg ((s)extload x)) -> (zextload x) ↵Simon Pilgrim2019-06-081-21/+4
| | | | | | | | combines. NFCI. Same codegen, only differ by the oneuse limit for the sextload case. llvm-svn: 362880
* [DAGCombine] visitAND - fix local shadow variable warnings. NFCI.Simon Pilgrim2019-06-071-24/+24
| | | | llvm-svn: 362825
* [DAGCombine] Use APInt::extractBits in "sub-splat" constant mask detection. ↵Simon Pilgrim2019-06-071-3/+3
| | | | | | NFCI. llvm-svn: 362820
* [DAGCombine] MergeConsecutiveStores - improve non-temporal load\store ↵Simon Pilgrim2019-06-061-7/+23
| | | | | | | | | | | | | | handling (PR42123) This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores: 1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another. 2 - The merged load\store node must retain the non-temporal flag. Differential Revision: https://reviews.llvm.org/D62910 llvm-svn: 362723
* [DAGCombine] Cleanup isNegatibleForFree/GetNegatedExpression. NFCI.Simon Pilgrim2019-06-061-20/+21
| | | | | | Prep work for PR42105 - clang-format, use auto for cast and merge nested if()s llvm-svn: 362695
* Fix shadow local variable warning. NFCI.Simon Pilgrim2019-06-051-6/+6
| | | | llvm-svn: 362622
* Revert r362472 as it is breaking PPC build botsNemanja Ivanovic2019-06-041-179/+0
| | | | | | | The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots. Reverting it to bring the bots back to green. llvm-svn: 362539
* [DAGCombiner][X86] Fold (not (neg X)) -> (add X, -1)Craig Topper2019-06-041-0/+10
| | | | | | | | | | This is a special case of a more general transform (not (sub Y, X)) -> (add X, ~Y). InstCombine knows the general form. I've restricted to the special case to fix the motivating case PR42118. I tried handling any case where Y was constant, but got some changes on some Mips tests that I couldn't quickly prove where beneficial. Fixes PR42118 Differential Revision: https://reviews.llvm.org/D62828 llvm-svn: 362533
* [SelectionDAG][x86] limit post-legalization store merging by typeSanjay Patel2019-06-041-1/+1
| | | | | | | | | | | The proposal in D62498 showed that x86 would benefit from vector store splitting, but that may conflict with the generic DAG combiner's store merging transforms. Add memory type to the existing TLI hook that enables the merging transforms, so we can limit those changes to scalars only for x86. llvm-svn: 362507
* [DAGCombine][X86][AArch64][MIPS][LANAI] (C - x) - y -> C - (x + y) fold ↵Roman Lebedev2019-06-041-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | (PR41952) Summary: This *might* be the last fold for `sink-addsub-of-const.ll`, but i'm not sure yet. As far as i can tell, there are no regressions here (ignoring x86-32), all changes are either good or neutral. This, almost surprisingly to me, fixes the motivational tests (in `shift-amount-mod.ll`) `@reg32_lshr_by_sub_from_negated` from [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]]. https://rise4fun.com/Alive/vMd3 Reviewers: RKSimon, t.p.northover, craig.topper, spatel, efriedma Reviewed By: RKSimon Subscribers: sdardis, javed.absar, arichardson, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62774 llvm-svn: 362488
* [DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C foldRoman Lebedev2019-06-041-0/+7
| | | | | | | | | | | | | | | | | | | | | Summary: All changes except ARM look **great**. https://rise4fun.com/Alive/R2M The regression `test/CodeGen/ARM/addsubcarry-promotion.ll` is recovered fully by D62392 + D62450. Reviewers: RKSimon, craig.topper, spatel, rogfer01, efriedma Reviewed By: efriedma Subscribers: dmgreen, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62266 llvm-svn: 362487
* [SelectionDAG] Add fpto[us]i(undef) --> undef constant foldSimon Pilgrim2019-06-041-0/+8
| | | | | | | | Follow up to D62807. Differential Revision: https://reviews.llvm.org/D62811 llvm-svn: 362483
* [DAGCombine] Match a pattern where a wide type scalar value is stored by ↵QingShan Zhang2019-06-041-0/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | several narrow stores This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c static void store64(u64 x, unsigned char* y) { for(int i = 0; i != 8; ++i) y[i] = (x >> ((7-i) * 8)) & 255; } static u64 load64(const unsigned char* y) { u64 res = 0; for(int i = 0; i != 8; ++i) res |= (u64)(y[i]) << ((7-i) * 8); return res; } The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the targets supports it. Assuming little endian target: i8 *p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; > *((i32)p) = val; i8 *p = ... i32 val = ... p[0] = (val >> 24) & 0xFF; p[1] = (val >> 16) & 0xFF; p[2] = (val >> 8) & 0xFF; p[3] = (val >> 0) & 0xFF; > *((i32)p) = BSWAP(val); Differential Revision: https://reviews.llvm.org/D61843 llvm-svn: 362472
* Propagate fmf for setcc/select foldsMichael Berg2019-06-031-3/+10
| | | | | | | | | | | | | | Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG. Reviewers: qcolombet, spatel Reviewed By: qcolombet Subscribers: nemanjai, jsji Differential Revision: https://reviews.llvm.org/D62552 llvm-svn: 362439
* [SelectionDAG] Add [us]itofp(undef) --> 0 constant fold (PR39205)Simon Pilgrim2019-06-031-0/+8
| | | | | | | | We were missing this fold in the DAG, which I've copied directly from llvm::ConstantFoldCastInstruction Differential Revision: https://reviews.llvm.org/D62807 llvm-svn: 362397
* Recommit r360171: [DAGCombiner] Avoid creating large tokenfactors in ↵Florian Hahn2019-06-031-3/+21
| | | | | | | | | | | | | | | | visitTokenFactor. If we hit the limit, we do expand the outstanding tokenfactors. Otherwise, we might drop nodes with users in the unexpanded tokenfactors. This fixes the crashes reported by Jordan Rupprecht. Reviewers: niravd, spatel, craig.topper, rupprecht Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D62633 llvm-svn: 362350
* [DAGCombiner][X86] Fold away masked store and scatter with all zeroes mask.Craig Topper2019-06-021-11/+18
| | | | | | Similar to what was done for masked load and gather. llvm-svn: 362342
* [DAGCombiner] Replace masked loads with a zero mask with the passthru valueCraig Topper2019-06-021-3/+7
| | | | | | Similar to what was recently done for gathers in r362015. llvm-svn: 362337
* [DAGCombine] Fold insert_subvector(bitcast(x),bitcast(y),c1) -> ↵Simon Pilgrim2019-06-021-0/+37
| | | | | | | | | | bitcast(insert_subvector(x,y),c2) Move this combine from x86 into generic DAGCombine, which currently only manages cases where the bitcast is between types of the same scalarsize. Differential Revision: https://reviews.llvm.org/D59188 llvm-svn: 362324
* [DAGCombiner] Replace two unchecked dyn_casts with casts.Craig Topper2019-06-021-2/+2
| | | | | | | | | | The results of the dyn_casts were immediately dereferenced on the next line so they had better not be null. I don't think there's any way for these dyn_casts to fail, so use a cast of adding null check. llvm-svn: 362315
* [DAGCombine] Limit 'hoist add/sub binop w/ constant op' to non-opaque constsRoman Lebedev2019-05-301-6/+8
| | | | | | | | | I don't have a test case for these, but there is a test case for D62266 where, even after all the constant-folding patches, we still end up with endless combine loop. Which makes sense, since we don't constant fold for opaque constants. llvm-svn: 362156
* [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold. Try 2Roman Lebedev2019-05-301-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Only vector tests are being affected here, since subtraction by scalar constant is rewritten as addition by negated constant. No surprising test changes. https://rise4fun.com/Alive/pbT This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62257 llvm-svn: 362146
* [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 3Roman Lebedev2019-05-301-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 362145
* [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x ↵Roman Lebedev2019-05-301-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fold. Try 3 Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 362144
* [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C ↵Roman Lebedev2019-05-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fold. Try 3 Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 362143
* [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 3Roman Lebedev2019-05-301-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 362142
* [DAGCombine] ((c1-A)-c2) -> ((c1-c2)-A) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | Summary: https://rise4fun.com/Alive/B0A Reviewers: t.p.northover, RKSimon, spatel, craig.topper Reviewed By: RKSimon Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62691 llvm-svn: 362135
* [DAGCombine] (A-C1)-C2 -> A-(C1+C2) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | Summary: https://rise4fun.com/Alive/Mb1M Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62689 llvm-svn: 362134
* [DAGCombine] (A+C1)-C2 -> A+(C1-C2) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | | | | Summary: Direct sibling of D62662, the root cause of the endless combine loop in D62257 https://rise4fun.com/Alive/d3W Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62664 llvm-svn: 362133
OpenPOWER on IntegriCloud