summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Correct behavior of f16 buffer loadsMatt Arsenault2019-08-051-0/+6
| | | | | | | Don't assume format loads for f16. Also fixes support for targets without i16. llvm-svn: 367879
* [DAGCombiner][x86] prevent infinite loop from truncate/extend transformsSanjay Patel2019-08-051-2/+7
| | | | | | | | | | | | | | | | | | | | | | | The test case is based on the example from the post-commit thread for: https://reviews.llvm.org/rGc9171bd0a955 This replaces the x86-specific simple-type check from: rL367766 with a check in the DAGCombiner. Adding the check isn't strictly necessary after the fix from: rL367768 ...but it seems likely that we're heading for trouble if we are creating weird types in this transform. I combined the earlier legality check into the initial clause to simplify the code. So we should only try the trunc/sext transform at the earliest combine stage, but we limit the transform to simple types anyway because the TLI hook is probably too lax about what it considers a free truncate. llvm-svn: 367834
* [LLVM][Alignment] Introduce Alignment TypeGuillaume Chatelet2019-08-051-4/+4
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828
* [LLVM][Alignment] Introduce Alignment Type in DataLayoutGuillaume Chatelet2019-08-051-1/+1
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65521 Make getFunctionPtrAlign() return MaybeAlign llvm-svn: 367817
* [TargetLowering][X86] Teach SimplifyDemandedVectorElts to replace the base ↵Craig Topper2019-08-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users. Summary: The SimplifyDemandedVectorElts function can replace with undef when no elements are demanded, but due to how it interacts with TargetLoweringOpts, it can only do this when the node has no other users. Remove a now unneeded DAG combine from the X86 backend. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65713 llvm-svn: 367788
* [SelectionDAG] Add node creation debug message to getMemIntrinsicNode.Craig Topper2019-08-041-1/+3
| | | | llvm-svn: 367771
* [DAGCombiner] Prevent the combine added in r367710 from creating illegal ↵Craig Topper2019-08-031-1/+1
| | | | | | | | | | | | types after type legalization. This is further fix for PR42880. Sanjay already disabled the X86 TLI hook for non-simple types, but we should really call isTypeLegal here if we're after type legalization. llvm-svn: 367768
* Emit diagnostic if an inline asm constraint requires an immediateBill Wendling2019-08-032-14/+39
| | | | | | | | | | | | | | | | | | Summary: An inline asm call can result in an immediate after inlining. Therefore emit a diagnostic here if constraint requires an immediate but one isn't supplied. Reviewers: joerg, mgorny, efriedma, rsmith Reviewed By: joerg Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60942 llvm-svn: 367750
* [TargetLowering] SimplifyMultipleUseDemandedBits - don't assume ↵Simon Pilgrim2019-08-021-1/+1
| | | | | | | | INSERT_VECTOR_ELT value type is simple. Noticed by inspection - this was copied from the X86 target equivalent where we can assume its legal/simple. llvm-svn: 367721
* [Statepoints] Fix overalignment of loads in no-realign-stack functionsPhilip Reames2019-08-021-8/+15
| | | | | | | | This really should have been part of 366765. For some reason, I forgot to handle the corresponding load side, and the readable test cases (using deopt vs statepoints) turned out to be overly reduced. Oops. As seen in the test change, the problem was that we were using a load with alignment expectations rather than the unaligned variant when the stack alignment was less than that prefered type alignment. llvm-svn: 367718
* [DAGCombiner] try to convert opposing shifts to castsSanjay Patel2019-08-021-0/+26
| | | | | | | | | | | | | | | | | | | | | This reverses a questionable IR canonicalization when a truncate is free: sra (add (shl X, N1C), AddC), N1C --> sext (add (trunc X to (width - N1C)), AddC') https://rise4fun.com/Alive/slRC More details in PR42644: https://bugs.llvm.org/show_bug.cgi?id=42644 I limited this to pre-legalization for code simplicity because that should be enough to reverse the IR patterns. I don't have any evidence (no regression test diffs) that we need to try this later. Differential Revision: https://reviews.llvm.org/D65607 llvm-svn: 367710
* Finish moving TargetRegisterInfo::isVirtualRegister() and friends to ↵Daniel Sanders2019-08-018-51/+47
| | | | | | llvm::Register as started by r367614. NFC llvm-svn: 367633
* [X86] In decomposeMulByConstant, legalize the VT before querying whether the ↵Craig Topper2019-08-011-1/+1
| | | | | | | | | | | | multiply is legal If a type is larger than a legal type and needs to be split, we would previously allow the multiply to be decomposed even if the split multiply is legal. Since the shift + add/sub code would also need to be split, its not any better to decompose it. This patch figures out what type the mul will eventually be legalized to and then uses that type for the query. I tried just returning false illegal types and letting them get handled after type legalization, but then we can't recognize and i64 constant splat on 32-bit targets since will be destroyed by type legalization. We could special case vectors of i64 to avoid that... Differential Revision: https://reviews.llvm.org/D65533 llvm-svn: 367601
* [TargetLowering] SimplifyMultipleUseDemandedBits - Add ↵Simon Pilgrim2019-08-011-0/+10
| | | | | | | | ISD::INSERT_VECTOR_ELT handling Allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367588
* [SelectionDAG] Use APInt::isSubsetOf/intersects to simplify some code.Craig Topper2019-08-011-2/+2
| | | | | | Also use KnownBits::isNegative/isNonNegative to further simplify. llvm-svn: 367518
* Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" andAmy Huang2019-07-311-10/+0
| | | | | | | | | | and partial fix. Causes windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and 53da7ca94343166ac68aef81db0398932fc258bb. llvm-svn: 367496
* Migrate some more fadd and fsub cases away from UnsafeFPMath control to ↵Michael Berg2019-07-312-7/+7
| | | | | | | | | | | | | | | | utilize NoSignedZerosFPMath options control Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context. Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper Reviewed By: spatel Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji Differential Revision: https://reviews.llvm.org/D65170 llvm-svn: 367486
* Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"Amy Huang2019-07-311-4/+8
| | | | | | | | | after windows buildbot failure. Added a check that the MachineInstr exists and is a call before trying to add symbols around it. llvm-svn: 367483
* SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned ↵Peter Collingbourne2019-07-311-15/+12
| | | | | | | | | | | | | char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474
* [DAGCombine] Limit the number of times for the same store and root nodesWei Mi2019-07-311-3/+42
| | | | | | | | | | | | | | to bail out in store merging dependence check. We run into a case where dependence check in store merging bail out many times for the same store and root nodes in a huge basicblock. That increases compile time by almost 100x. The patch add a map to track how many times the bailing out happen for the same store and root, and if it is over a limit, stop considering the store with the same root as a merging candidate. Differential Revision: https://reviews.llvm.org/D65174 llvm-svn: 367472
* [MS] Emit S_HEAPALLOCSITE debug info in SelectionDAGAmy Huang2019-07-311-0/+6
| | | | | | | | | | | | | | Summary: This emits labels around heapallocsite calls in SelectionDAG. Reviewers: rnk Subscribers: MatzeB, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61105 llvm-svn: 367374
* [DAGCombiner] Add an option to control whether or not to enable store merging.Wei Mi2019-07-301-1/+6
| | | | | | | | | Add an option to control whether or not to enable store merging in dag combiner so we can workaround some bugs more easily. Differential Revision: https://reviews.llvm.org/D65482 llvm-svn: 367365
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for binops that ↵Simon Pilgrim2019-07-291-0/+4
| | | | | | | | change value type. NFCI. This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious. llvm-svn: 367220
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for illegal op. NFCI.Simon Pilgrim2019-07-271-1/+4
| | | | | | If the subvector binop is illegal then early-out and avoid the subvector searches. llvm-svn: 367181
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵Simon Pilgrim2019-07-271-2/+59
| | | | | | | | | | | | support (Reapplied) This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value. Fixes PR42777 llvm-svn: 367174
* [SelectionDAG] Check for any recursion depth greater than or equal to limit ↵Simon Pilgrim2019-07-272-5/+5
| | | | | | | | | | | | instead of just equal the limit. If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit. This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it. This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others....) - I'll see what I can do in future patches. llvm-svn: 367171
* [TargetLowering] Add depth limit to SimplifyMultipleUseDemandedBitsSimon Pilgrim2019-07-271-0/+3
| | | | | | We're getting reports of massive compile time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and not earlying-out. No repro yet, but consider this a pre-emptive commit. llvm-svn: 367169
* Revert r367091, it caused PR42777.Nico Weber2019-07-261-59/+2
| | | | llvm-svn: 367118
* [SelectionDAG] GetDemandedBits - update SIGN_EXTEND_INREG op to just call ↵Simon Pilgrim2019-07-261-9/+1
| | | | | | SimplifyMultipleUseDemandedBits. llvm-svn: 367098
* [TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG ↵Simon Pilgrim2019-07-261-0/+7
| | | | | | support. llvm-svn: 367096
* [SelectionDAG] GetDemandedBits - update OR/XOR ops to just call ↵Simon Pilgrim2019-07-261-6/+2
| | | | | | | | SimplifyMultipleUseDemandedBits. Eventually all of these will be moved over, but we create nodes in GetDemandedBits recursion at the moment which causes regressions when we try to remove them all. llvm-svn: 367092
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵Simon Pilgrim2019-07-261-2/+59
| | | | | | | | support. This allows us to peek through BITCASTs and attempt simplify the source operand, and then bitcast back. llvm-svn: 367091
* [Codegen] (X & (C l>>/<< Y)) ==/!= 0 --> ((X <</l>> Y) & C) ==/!= 0 foldRoman Lebedev2019-07-241-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This was originally reported in D62818. https://rise4fun.com/Alive/oPH InstCombine does the opposite fold, in hope that `C l>>/<< Y` expression will be hoisted out of a loop if `Y` is invariant and `X` is not. But as it is seen from the diffs here, if it didn't get hoisted, the produced assembly is almost universally worse. Much like with my recent "hoist add/sub by/from const" patches, we should get almost universal win if we hoist constant, there is almost always an "and/test by imm" instruction, but "shift of imm" not so much, so we may avoid having to materialize the immediate, and thus need one less register. And since we now shift not by constant, but by something else, the live-range of that something else may reduce. Special care needs to be applied not to disturb x86 `BT` / hexagon `tstbit` instruction pattern. And to not get into endless combine loop. Reviewers: RKSimon, efriedma, t.p.northover, craig.topper, spatel, arsenm Reviewed By: spatel Subscribers: hiraditya, MaskRay, wuzish, xbolva00, nikic, nemanjai, jvesely, wdng, nhaehnle, javed.absar, tpr, kristof.beyls, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62871 llvm-svn: 366955
* Fix signed/unsigned comparison warning. NFCI.Simon Pilgrim2019-07-241-1/+1
| | | | llvm-svn: 366935
* [DAGCombine] matchBinOpReduction - add partial reduction matchingSimon Pilgrim2019-07-241-7/+32
| | | | | | | | | | | | | | | | | | | | This patch adds support for recognizing cases where a larger vector type is being used to reduce just the elements in the lower subvector: e.g. <8 x i32> reduction pattern in a <16 x i32> vector: <4,5,6,7,u,u,u,u,u,u,u,u,u,u,u,u> <2,3,u,u,u,u,u,u,u,u,u,u,u,u,u,u> <1,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u> matchBinOpReduction returns the lower extracted subvector in such cases, assuming isExtractSubvectorCheap accepts the extraction. I've only enabled it for X86 reduction sums so far. I intend to enable it for the bitop/minmax cases in future patches, and eventually I think its worth turning it on all the time. This is mainly just a case of ensuring calls to matchBinOpReduction don't make assumptions on the vector width based on the original vector extraction. Fixes the x86 partial reduction sum cases in PR33758 and PR42023. Differential Revision: https://reviews.llvm.org/D65047 llvm-svn: 366933
* [SelectionDAG] makeEquivalentMemoryOrdering - early out for equal chains ↵Simon Pilgrim2019-07-241-1/+1
| | | | | | | | | | (PR42727) If we are already using the same chain for the old/new memory ops then just return. Fixes PR42727 which had getLoad() reusing an existing node. llvm-svn: 366922
* [SDAG] convert (sub x, 1) to (add x, -1) in ctpop expansion; NFCSanjay Patel2019-07-241-3/+3
| | | | | | We canonicalize to the add form, so create that directly for efficiency. llvm-svn: 366914
* [TargetLowering] SimplifyMultipleUseDemandedBits - add VECTOR_SHUFFLE support.Simon Pilgrim2019-07-231-0/+23
| | | | | | | | If all the demanded elts are from one operand and are inline, then we can use the operand directly. The changes are mainly from SSE41 targets which has blendvpd but not cmpgtq, allowing the v2i64 comparison to be simplified as we only need the signbit from alternate v4i32 elements. llvm-svn: 366817
* [TargetLowering] Add SimplifyMultipleUseDemandedBitsSimon Pilgrim2019-07-231-1/+128
| | | | | | | | | | | | | | | | | | This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured. The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use. We do see a couple of regressions that need to be addressed: AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better). X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG. The code owners have confirmed its ok for these cases to fixed up in future patches. Differential Revision: https://reviews.llvm.org/D63281 llvm-svn: 366799
* [DAGCombiner] Make ShrinkLoadReplaceStoreWithStore return an SDValue instead ↵Craig Topper2019-07-231-9/+8
| | | | | | | | | | of an SDNode*. NFCI The function was calling getNode() on an SDValue to return and the caller turned the result back into a SDValue. So just return the original SDValue to avoid this. llvm-svn: 366779
* [DAGCombiner] Use SDNode::isOperandOf to simplify some code. NFCICraig Topper2019-07-231-7/+1
| | | | llvm-svn: 366778
* Move variable out from debug only section.Richard Trieu2019-07-231-2/+0
| | | | | | | MFI is no longer just needed for an assert. Move it out of the debug only section to allow non-assert builds to be able to find it. llvm-svn: 366773
* [Statepoints] Fix a bug in statepoint lowering for functions w/no-realign-stackPhilip Reames2019-07-221-1/+8
| | | | | | | | | | We were silently using the ABI alignment for all of the stores generated for deopt and gc values. We'd gotten the alignment of the stack slot itself properly reduced (via MachineFrameInfo's clamping), but having the MMO on the store incorrect was enough for us to generate an aligned store to a unaligned location. The simplest fix would have been to just pass the alignment to the helper function, but once we do that, the helper function doesn't really help. So, inline it and directly call the MMO version of DAG.getStore with a properly constructed MMO. Note that there's a separate performance possibility here. Even if we *can* realign stacks, we probably don't *want to* if all of the stores are in slowpaths. But that's a later patch, if at all. :) llvm-svn: 366765
* TableGen: Support physical register inputs > 255Matt Arsenault2019-07-221-1/+4
| | | | | | | This was truncating register value that didn't fit in unsigned char. Switch AMDGPU sendmsg intrinsics to using a tablegen pattern. llvm-svn: 366695
* Added address-space mangling for stack related intrinsicsChristudasan Devadasan2019-07-221-2/+2
| | | | | | | | | | | | Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64561 llvm-svn: 366679
* [IPRA][ARM] Make use of the "returned" parameter attributeOliver Stannard2019-07-221-0/+2
| | | | | | | | | | | | ARM has code to recognise uses of the "returned" function parameter attribute which guarantee that the value passed to the function in r0 will be returned in r0 unmodified. IPRA replaces the regmask on call instructions, so needs to be told about this to avoid reverting the optimisation. Differential revision: https://reviews.llvm.org/D64986 llvm-svn: 366669
* [Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvementsRoman Lebedev2019-07-201-35/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Four things here: 1. Generalize the fold to handle non-splat divisors. Reasonably trivial. 2. Unban power-of-two divisors. I don't see any reason why they should be illegal. * There is no ban in Hacker's Delight * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we **were** ensuring that `D0` is not `1`, which made sense. 3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations: * We know that * `(X u% 1) == 0` can be constant-folded to `1`, * `(X u% 1) != 0` can be constant-folded to `0`, * Also, we know that * `X u<= -1` can be constant-folded to `1`, * `X u> -1` can be constant-folded to `0`, * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p * We know will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)` * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is a all-ones constant; and both `P` and `K` are *anything* other than `undef`. * The fold will indeed produce `Q = all-ones`. 4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where divisor was `1`. Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00 Reviewed By: RKSimon Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63963 llvm-svn: 366637
* DAG: Handle dbg_value for arguments split into multiple subregsMatt Arsenault2019-07-191-23/+52
| | | | | | | | This was handled previously for arguments split due to not fitting in an MVT. This was dropping the register for argument registers split due to TLI::getRegisterTypeForCallingConv. llvm-svn: 366574
* [COFF] Change a variable type to be const in the HeapAllocSite map.Amy Huang2019-07-181-1/+1
| | | | llvm-svn: 366479
* [DAGCombine] Pull getSubVectorSrc helper out of ↵Simon Pilgrim2019-07-181-22/+22
| | | | | | | | narrowInsertExtractVectorBinOp. NFCI. NFC step towards reusing this in other EXTRACT_SUBVECTOR combines. llvm-svn: 366435
OpenPOWER on IntegriCloud