summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAG] Fix Big Endian in Load-Store forwardingNirav Dave2018-10-111-0/+5
| | | | | | | | | | | | | | Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272
* [DAGCombiner] move comment closer to the corresponding code; NFC Sanjay Patel2018-10-111-2/+1
| | | | llvm-svn: 344255
* [DAGCombine] Improve Load-Store ForwardingNirav Dave2018-10-101-11/+134
| | | | | | | | | | | | | | | | | | Summary: Extend analysis forwarding loads from preceeding stores to work with extended loads and truncated stores to the same address so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are deleted as they've no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142
* [TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI.Simon Pilgrim2018-10-101-80/+89
| | | | | | Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used. llvm-svn: 344138
* [TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI.Simon Pilgrim2018-10-101-119/+119
| | | | | | Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support. llvm-svn: 344136
* [TargetLowering] Add root node back to work list after successful ↵Simon Pilgrim2018-10-101-2/+6
| | | | | | | | | | SimplifyDemandedBits/SimplifyDemandedVectorElts Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful. Differential Revision: https://reviews.llvm.org/D53026 llvm-svn: 344132
* [DAGCombiner] Expand combining of FP logical ops to sign-setting FP opsNemanja Ivanovic2018-10-091-3/+14
| | | | | | | | | | | | | | | | | We already do the following combines: (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X When the target has "bit preserving fp logic". This patch just extends it to also combine: (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X) As some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093
* [SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to ↵Simon Pilgrim2018-10-091-0/+30
| | | | | | | | SimplifyDemandedBits Fix for AVX1 masked load/store regression on D52964 llvm-svn: 344043
* [DAGCombiner] simplify code for fmul with constant fold; NFCISanjay Patel2018-10-081-18/+8
| | | | llvm-svn: 343997
* [SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTyAlex Bradbury2018-10-081-1/+1
| | | | | | | | | | r126518 introduced a a type parameter to the getShiftAmountTy target hook. It produces the type of the shift (RHSTy), parameterised by the type of the value being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather than LHSTy and this patch corrects this. The change is a no-op because in LLVM IR the LHS and RHS types for a shift must be equal anyway. llvm-svn: 343955
* Revert r343948 "[LegalizeDAG] Make one of the ReplaceNode signatures take an ↵Craig Topper2018-10-081-8/+6
| | | | | | | | ArrayRef instead a pointer to an array. Add assert on size of array. NFC" The assert is failing some asan tests on the bots. llvm-svn: 343950
* [LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef ↵Craig Topper2018-10-081-6/+8
| | | | | | instead a pointer to an array. Add assert on size of array. NFC llvm-svn: 343948
* [LegalizeDAG] Move legalization of scatter and masked store from ↵Craig Topper2018-10-082-11/+10
| | | | | | | | | | LegalizeVectorOps to LegalizeDAG. This is where we legalize gather and masked load so this is consistent. Since these ops are always on vectors I've chosen to go with LegalizeDAG since that's what we do for other vector only ops like BUILD_VECTOR, VECTOR_SHUFFLE, etc. The ScalarizeMaskedMemIntrinsic pass should take care of scalarizing these before SelectionDAG so hopefully we don't need to worry about illegally typed scalar ops being emitted in the legalizing. If we did we would need to do this in LegalizeVectorOps so we could get the second type legalization that runs between LegalizeVectorOps and LegalizeDAG. llvm-svn: 343947
* [DAGCombiner] allow undef elts in vector fadd matchingSanjay Patel2018-10-071-1/+1
| | | | llvm-svn: 343945
* [DAGCombiner] allow undefs when matching vector splats for fmul foldsSanjay Patel2018-10-071-2/+2
| | | | llvm-svn: 343942
* [DAGCombiner] allow undef elts in vector fabs/fneg matchingSanjay Patel2018-10-071-1/+1
| | | | | | | | This change is proposed as a part of D44548, but we need this independently to avoid regressions from improved undef propagation in SimplifyDemandedVectorElts(). llvm-svn: 343940
* [DAGCombiner] shorten code for bitcast+fabs fold; NFCSanjay Patel2018-10-071-5/+2
| | | | llvm-svn: 343939
* [SelectionDAG] Respect multiple uses in SimplifyDemandedBits to ↵Simon Pilgrim2018-10-071-1/+1
| | | | | | | | | | | | SimplifyDemandedVectorElts simplification rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....). Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ. Thanks to Jan Vesely (@jvesely) for bisecting the bug. llvm-svn: 343935
* [LegalizeVectorOps] Make ExpandStrictFPOp return the result corresponding to ↵Craig Topper2018-10-071-1/+1
| | | | | | | | the result number of the SDValue passed in. It was always returning the chain which seems to be the result number of the SDValue in the lit tests we have. But I don't know if that's guaranteed. llvm-svn: 343933
* [SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts ↵Simon Pilgrim2018-10-061-10/+46
| | | | | | | | | | | | | | | | simplification This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector. There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify. As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.). Fixes PR39178 Differential Revision: https://reviews.llvm.org/D52935 llvm-svn: 343913
* [SelectionDAG] allow undefs when matching splat constantsSanjay Patel2018-10-052-11/+9
| | | | | | | | | And use that to transform fsub with zero constant operands. The integer part isn't used yet, but it is proposed for use in D44548, so adding both enhancements here makes that patch simpler. llvm-svn: 343865
* [X86][LegalizeVectorOps] Use MERGE_VALUES to return two results from ↵Craig Topper2018-10-041-11/+3
| | | | | | | | | | LowerLoad. Remove special case code in LegalizeVectorOps that allowed us to only return one result. Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled. This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine. llvm-svn: 343817
* [LegalizeIntegerTypes] Fix typo in comment. NFCCraig Topper2018-10-041-1/+1
| | | | llvm-svn: 343750
* DAGCombiner: StoreMerging: Fix bad index calculating when adjusting ↵Matthias Braun2018-10-011-17/+8
| | | | | | | | | | | | | | | | mismatching vector types This fixes a case of bad index calculation when merging mismatching vector types. This changes the existing code to just use the existing extract_{subvector|element} and a bitcast (instead of bitcast first and then newly created extract_xxx) so we don't need to adjust any indices in the first place. rdar://44584718 Differential Revision: https://reviews.llvm.org/D52681 llvm-svn: 343493
* [DAG] Don't perform SINT_TO_FP<->UINT_TO_FP custom conversion after legalizationSimon Pilgrim2018-09-301-4/+4
| | | | | | | | The SINT_TO_FP<->UINT_TO_FP combines for non-negative integers should only occur for legal ops once LegalOperations = true No test case to hand, noticed when investigating PR38226 + PR38970 llvm-svn: 343405
* [DAGCombiner] [NFC] Improve X div/rem 1 foldDavid Bolvansky2018-09-281-8/+5
| | | | | | | | | | | | Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52661 llvm-svn: 343349
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)Fangrui Song2018-09-274-12/+9
| | | | | | | | | | | | Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163
* [DAG] SelectionDAGLegalize::ExpandLegalINT_TO_FP - use getFPExtendOrRound ↵Simon Pilgrim2018-09-261-11/+1
| | | | | | | | helper. NFCI. Handles SrcVT == DstVT as well. llvm-svn: 343121
* [DAG] ExpandLegalINT_TO_FP - pull out repeated getValueType() call. NFCI.Simon Pilgrim2018-09-261-9/+9
| | | | llvm-svn: 343101
* [CodeGen] Enable tail calls for functions with NonNull attributes.David Green2018-09-261-1/+3
| | | | | | | | | | | Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no affect on the call sequence. Differential Revision: https://reviews.llvm.org/D52238 llvm-svn: 343091
* Run VerifyDAGDiverence in debug onlyMikael Nilsson2018-09-262-0/+16
| | | | | | | | | VerifyDAGDiverence costs compilation time, avoid running it in non-debug builds. Differential Revision: https://reviews.llvm.org/D52454 llvm-svn: 343086
* [DAGCombiner] Remove unnecessary check for visitSDIVLike/visitUDIVLike ↵Craig Topper2018-09-251-2/+1
| | | | | | | | returning a UDIVREM or SDIVREM node. This shouldn't be possible and is a leftover from when we used to recursively call combine here. llvm-svn: 343049
* Unify landing pad information adding routines (NFC)Heejin Ahn2018-09-251-3/+0
| | | | | | | | | | | | | | | | | Summary: We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`, both of which add landing pad information to populate `LandingPadInfo` but are called from different locations, which was confusing. This patch unifies them with one `MachineFunction::addLandingPad` function, which now has functionlities of both functions. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52428 llvm-svn: 343018
* [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449)Sanjay Patel2018-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749 We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op. The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence: t29: v8i32 = concat_vectors t27, t28 t30: v4i64 = bitcast t29 t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ... t31: v4i64 = bitcast t18 t32: v4i64 = xor t30, t31 t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ... t34: v4i64 = bitcast t9 t35: v4i64 = and t32, t34 t36: v8i32 = bitcast t35 t37: v4i32 = extract_subvector t36, Constant:i64<0> t38: v4i32 = extract_subvector t36, Constant:i64<4> Differential Revision: https://reviews.llvm.org/D52318 llvm-svn: 343008
* [LegalizeDAG] Prune Predecessor check in ↵Nirav Dave2018-09-251-0/+1
| | | | | | ExpandExtractFromVectorThroughStack. NFCI. llvm-svn: 342985
* [DAGCombine] Improve Predecessor check in SimplifySelectOps. NFCI.Nirav Dave2018-09-251-4/+36
| | | | | | | Reuse search space bookkeeping across multiple predecessor checks qdone to avoid redundancy. This should cut search cost by ~4x. llvm-svn: 342984
* [DAGCombine] Share predecessor bookkeeping in CombineToPostIndexedLoadStore. ↵Nirav Dave2018-09-251-2/+9
| | | | | | NFCI. llvm-svn: 342983
* [DAGCombine] Don't fold dependent loads across SELECT_CC.Nirav Dave2018-09-251-4/+5
| | | | | | | | | | | | | | | | | | DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC after the select, resulting in a select of an address and a single load after. If either of the loads depend on the other, this is not legal as it could introduce cycles. However, it only checked this if the opcode was a SELECT, and not for a SELECT_CC. Unfortunately, the only reproducer I have for this is for our downstream target. I've tried getting it to trigger on an upstream one but haven't been successful. Patch thanks to Bevin Hansson. llvm-svn: 342980
* [DAGCombiner] use UADDO to optimize saturated unsigned addSanjay Patel2018-09-241-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select. As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities. This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86 which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests. Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place. Differential Revision: https://reviews.llvm.org/D51929 llvm-svn: 342886
* Remove debug printf leftover from r342397Hans Wennborg2018-09-241-2/+0
| | | | llvm-svn: 342863
* [DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTORCraig Topper2018-09-241-9/+2
| | | | | | This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015. llvm-svn: 342856
* [DAGCombiner] Clarify a comment. NFCCraig Topper2018-09-231-2/+4
| | | | | | This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed. llvm-svn: 342851
* [LegalizeTypes] Fix bad indentation. NFCCraig Topper2018-09-231-1/+1
| | | | llvm-svn: 342850
* [DAGCombiner][x86] extend decompose of integer multiply into shift/add with ↵Sanjay Patel2018-09-231-6/+13
| | | | | | | | | | | | | | | | | negation This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some existing code that overlaps with this transform. This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 (still an open question about transforming legal vector multiplies, but we could open another bug report for those) llvm-svn: 342844
* [DAGCombiner] Simplify some code in visitBITCAST. NFCICraig Topper2018-09-221-9/+3
| | | | llvm-svn: 342826
* [DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCICraig Topper2018-09-221-11/+14
| | | | llvm-svn: 342809
* [SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCISanjay Patel2018-09-202-38/+31
| | | | | | | | | | | | | | x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant. Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended. I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps. Differential Revision: https://reviews.llvm.org/D52285 llvm-svn: 342669
* [SelectionDAG] allow vector types with isBitwiseNot()Sanjay Patel2018-09-192-4/+5
| | | | | | | The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits, and the test diff in add.ll is from a DAGCombiner transform. llvm-svn: 342594
* Copy utilities updated and added for MI flagsMichael Berg2018-09-191-0/+9
| | | | | | | | | | | | | | Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path. Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw. Reviewers: spatel, wristow, arsenm Reviewed By: arsenm Subscribers: wdng Differential Revision: https://reviews.llvm.org/D52006 llvm-svn: 342576
* [DAGCombiner][x86] add transform/hook to decompose integer multiply into ↵Sanjay Patel2018-09-191-0/+26
| | | | | | | | | | | | | | | | | | | | | shift/add This is an alternative to D37896. I don't see a way to decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some duplicate code that overlaps with this transform. As a first step, we're only getting the most clear wins on the vector examples requested in PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 As noted in the code comment, it's likely that the x86 constraints are tighter than necessary, but it may not always be a win to replace a pmullw/pmulld. Differential Revision: https://reviews.llvm.org/D52195 llvm-svn: 342554
OpenPOWER on IntegriCloud