summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C foldRoman Lebedev2019-05-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 361853
* [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C foldRoman Lebedev2019-05-281-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 361852
* [DAG] LegalizeVectorTypes - reduce scope of local variables. NFCI.Simon Pilgrim2019-05-281-4/+2
| | | | | | Move the element index/count variables into the block where they are actually used - appeases cppcheck and helps avoid shadow variable warnings. llvm-svn: 361821
* [X86] Custom lower CONCAT_VECTORS of v2i1Benjamin Kramer2019-05-281-0/+1
| | | | | | | The generic legalizer cannot handle this. Add an assert instead of silently miscompiling vectors with elements smaller than 8 bits. llvm-svn: 361814
* [SelectionDAG] fold concat of extract subvectorsSanjay Patel2019-05-271-0/+25
| | | | | | | | | | | | | | This is derived from the related fold for build vectors. We also have a version of this in DAGCombiner. The benefit of having this fold at node creation time is (1) efficiency and (2) preventing infinite looping from creating patterns that should not exist in the first place. Currently, the inf-loop could happen with MergeConsecutiveStores() because it naively creates concat of extracts when forming a wider vector store. That could fight with target-specific store narrowing. llvm-svn: 361780
* [SelectionDAG] fix formatting and redundant comments; NFCSanjay Patel2019-05-271-7/+6
| | | | | | | | | There's a possible missing fold here for extracting from the same source vector. It's similar to a check that we use to squash a build vector with all extracted elements from the same source vector. llvm-svn: 361778
* [SelectionDAG] Enhance the simplification of `copyto` from `implicit-def`.Michael Liao2019-05-272-31/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - The current implementation simplifies the case where the source of `copyto` is `implicit-def`ed. However, it only works when that `implicit-def` is single-used since it detects that from `implicit-def` and cannot determine which destination vreg should be used if there are multiple uses. - This patch changes that detection when `copyto` is being emitted. If that `copyto`'s source is defined from `implicit-def`, it simplifies it. Hence, it works even that `implicit-def` is multi-used. - Except it simplifies the internal IR, it won't improve the quality of code generation. However, it helps to detect 'implicit-def` in a straight-forward manner in some passes, such as `si-i1-copies`. A test case is added. Reviewers: sunfish, nhaehnle Subscribers: jvesely, hiraditya, asbirlea, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62342 llvm-svn: 361777
* [SelectionDAG] GetDemandedBits - add demanded elements wrapper implementationSimon Pilgrim2019-05-271-1/+15
| | | | | | The DemandedElts variable is pretty much inert at the moment - the original GetDemandedBits implementation calls it with an 'all ones' DemandedElts value so the function is active and behaves exactly as it used to. llvm-svn: 361773
* [AMDGPU] Divergence driven ISel. Assign register class for cross block ↵Alexander Timofeev2019-05-266-24/+37
| | | | | | | | | | | | | | | | | | values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed. llvm-svn: 361741
* [SelectionDAG] GetDemandedBits - cleanup to more closely match ↵Simon Pilgrim2019-05-261-16/+21
| | | | | | | | SimplifyDemandedBits. NFCI. Prep work before adding demanded elts support. llvm-svn: 361739
* [SelectionDAG] MaskedValueIsZero - add demanded elements implementationSimon Pilgrim2019-05-261-2/+15
| | | | | | Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage. llvm-svn: 361738
* [SelectionDAG] soften assertion when legalizing narrow vector FP opsSanjay Patel2019-05-251-6/+4
| | | | | | | | | | The test based on PR42010: https://bugs.llvm.org/show_bug.cgi?id=42010 ...may show an inaccuracy for PPC's target defs, but we should not be so aggressive with an assert here. There's no telling what out-of-tree targets look like. llvm-svn: 361696
* Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for ↵Peter Collingbourne2019-05-256-37/+24
| | | | | | | | | | cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688
* [AMDGPU] Divergence driven ISel. Assign register class for cross block ↵Alexander Timofeev2019-05-246-24/+37
| | | | | | | | | | | | | | values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644
* [SelectionDAG] computeKnownBits - support constant pool values from targetSimon Pilgrim2019-05-243-3/+66
| | | | | | | | | | | | | | | | This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets. computeKnownBits then uses this function to improve codegen, notably vector code after legalization. A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit. This required a couple of fixes: * SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR). * Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets. Differential Revision: https://reviews.llvm.org/D61887 llvm-svn: 361620
* CodeGen: factor out swifterror value tracking.Tim Northover2019-05-244-341/+33
| | | | llvm-svn: 361607
* [DAGCombiner] make folds of binops safe for opcodes that produce >1 valueSanjay Patel2019-05-231-5/+7
| | | | | | | | | | This is no-functional-change-intended currently because the definition of isBinOp() only includes opcodes that produce 1 value. But if we share that implementation with isCommutativeBinOp() as proposed in D62191, then we need to make sure that the callers bail out for opcodes that they are not prepared to handle correctly. llvm-svn: 361547
* [TargetLowering] Extend bool args to inline-asm according to getBooleanTypeKees Cook2019-05-221-1/+10
| | | | | | | | | | | | | | | | | Summary: This extends Krzysztof Parzyszek's X86-specific solution (https://reviews.llvm.org/D60208) to the generic code pointed out by James Y Knight. Reviewers: kparzysz, craig.topper, nickdesaulniers Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight Tags: #llvm Differential Revision: https://reviews.llvm.org/D60224 llvm-svn: 361404
* [TargetLowering] Add blank line (test commit)Kees Cook2019-05-221-0/+1
| | | | llvm-svn: 361403
* [Intrinsic] Signed Fixed Point Saturation Multiplication IntrinsicLeonard Chan2019-05-217-19/+194
| | | | | | | | | | | | | | Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D55720 llvm-svn: 361289
* [SelectionDAG] fold insert subvector of undef into undefSanjay Patel2019-05-211-0/+3
| | | | | | | | | | | | DAGCombiner simplifies this more liberally as: // If inserting an UNDEF, just return the original vector. if (N1.isUndef()) return N0; So there's no way to make this visible in output AFAIK, but doing this at node creation time should be slightly more efficient. llvm-svn: 361287
* [SelectionDAG] remove redundant code; NFCISanjay Patel2019-05-211-6/+2
| | | | | | | | | getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS(): // Concat of UNDEFs is UNDEF. if (llvm::all_of(Ops, [](SDValue Op) { return Op.isUndef(); })) return DAG.getUNDEF(VT); llvm-svn: 361284
* [DAGCombiner] prevent unsafe reassociation of FP opsSanjay Patel2019-05-211-1/+8
| | | | | | | | | | | There are no FP callers of DAGCombiner::reassociateOps() currently, but we can add a fast-math check to make sure this API is not being misused. This was noted as a potential risk (and that risk might increase) with: D62191 llvm-svn: 361268
* Add TargetLoweringInfo hook for explicitly setting the ABI calling ↵Dylan McKay2019-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | convention endianess Summary: The endianess used in the calling convention does not always match the endianess of the target on all architectures, namely AVR. When an argument is too large to be legalised by the architecture and is split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian is queried to find the endianess that function arguments must be laid out in. This approach was recommended by Eli Friedman. Originally reported in https://github.com/avr-rust/rust/issues/129. Patch by Carl Peto. Reviewers: bogner, t.p.northover, RKSimon, niravd, efriedma Reviewed By: efriedma Subscribers: JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62003 llvm-svn: 361222
* [SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node ↵Craig Topper2019-05-201-3/+11
| | | | | | | | | | | | | | | for asm goto. Since INLINEASM_BR is a terminator we need to flush the pending exports before emitting it. If we don't do this, a TokenFactor can be inserted between it and the BR instruction emitted to finish the callbr lowering. It looks like nodes are glued to the INLINEASM_BR so I had to make sure we emit the TokenFactor before that. Differential Revision: https://reviews.llvm.org/D59981 llvm-svn: 361177
* [Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with ↵Craig Topper2019-05-201-6/+4
| | | | | | | | | | overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64 We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well. Differential Revision: https://reviews.llvm.org/D62026 llvm-svn: 361169
* [DAGCombiner] Refactor code in visitShiftByConstant slightly to make it more ↵Craig Topper2019-05-201-11/+12
| | | | | | | | | | | | readable. NFC This changes the isShift variable to include the constant operand check that was previously in the if statement. While there fix an 80 column violation and an unnecessary use of getNode. Also fix variable name capitalization. llvm-svn: 361168
* [SDAG] Vector op legalization for overflow opsNikita Popov2019-05-203-66/+120
| | | | | | | | | | | | | | | | | | Fixes issue reported by aemerson on D57348. Vector op legalization support is added for uaddo, usubo, saddo and ssubo (umulo and smulo were already supported). As usual, by extracting TargetLowering methods and calling them from vector op legalization. Vector op legalization doesn't really deal with multiple result nodes, so I'm explicitly performing a recursive legalization call on the result value that is not being legalized. There are some existing test changes because expansion happens earlier, so we don't get a DAG combiner run in between anymore. Differential Revision: https://reviews.llvm.org/D61692 llvm-svn: 361166
* [NFC] Refactor visitIntrinsicCall so it doesn't return a const char*Guillaume Chatelet2019-05-202-141/+148
| | | | | | | | | | | | | | Summary: API simplification Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61306 llvm-svn: 361140
* [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC)Petar Jovanovic2019-05-202-4/+3
| | | | | | | | | | | Refactor DIExpression::With* into a flag enum in order to be less error-prone to use (as discussed on D60866). Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D61943 llvm-svn: 361137
* Revert "[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*"Guillaume Chatelet2019-05-202-145/+138
| | | | | | This reverts commit 706d3cd6388cc3446aab282f3af879862b10cbed. llvm-svn: 361130
* [NFC] Refactor visitIntrinsicCall so it doesn't return a const char*Guillaume Chatelet2019-05-202-138/+145
| | | | | | | | | | | | | | Summary: API simplification Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61306 llvm-svn: 361129
* [DAGCombiner] visitShiftByConstant(): drop bogus signbit checkRoman Lebedev2019-05-171-18/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: That check claims that the transform is illegal otherwise. That isn't true: 1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter https://rise4fun.com/Alive/K4A 2. For `ISD::AND`, there is no restriction on constants: https://rise4fun.com/Alive/Wy3 3. For `ISD::OR`, there is no restriction on constants: https://rise4fun.com/Alive/GOH 3. For `ISD::XOR`, there is no restriction on constants: https://rise4fun.com/Alive/ml6 So, why is it there then? This changes the testcase that was touched by @spatel in rL347478, but i'm not sure that test tests anything particular? Reviewers: RKSimon, spatel, craig.topper, jojo, rengolin Reviewed By: spatel Subscribers: javed.absar, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61918 llvm-svn: 361044
* [CodeGen] Fixed de-optimization of legalize subvector extractTim Renouf2019-05-161-0/+18
| | | | | | | | | | | | | | | The recent introduction of v3i32 etc as an MVT, and its use in AMDGPU 3-dword memory instructions, caused a de-optimization problem for code with such a load that then bitcasts via vector of i8, because v12i8 is not an MVT so it legalizes the bitcast by widening it. This commit adds the ability to widen a bitcast using extract_subvector on the result, so the value does not need to go via memory. Differential Revision: https://reviews.llvm.org/D60457 Change-Id: Ie4abb7760547e54a2445961992eafc78e80d4b64 llvm-svn: 360942
* [CodeGen] Add lround/llround builtinsAdhemerval Zanella2019-05-166-0/+144
| | | | | | | | | | | | | This patch add the ISD::LROUND and ISD::LLROUND along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger. The idea is to optimize lround/llround generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol. llvm-svn: 360889
* [codeview] Fix SDNode representation of annotation labelsReid Kleckner2019-05-152-1/+3
| | | | | | | | | | | Before this change, they were erroneously constructed with the EH_LABEL SDNode opcode, which caused other passes to interact with them in incorrect ways. See the FIXME about fastisel that this addresses in the existing test case. Fixes PR41890 llvm-svn: 360818
* [[DAGCombiner][NFC] Add a comment.Clement Courbet2019-05-151-0/+2
| | | | | | As suggested in D61846. llvm-svn: 360755
* [SDAG] fix unused variable warning and unneeded indirection; NFCSanjay Patel2019-05-142-3/+3
| | | | llvm-svn: 360640
* [SDAG, x86] allow targets to override test for binop opcodesSanjay Patel2019-05-142-9/+10
| | | | | | | | This follows the pattern of the existing isCommutativeBinOp(). x86 shows improvements from vector narrowing for the min/max opcodes. llvm-svn: 360639
* [TargetLowering] Handle multi depth GEPs w/ inline asm constraintsNick Desaulniers2019-05-131-38/+33
| | | | | | | | | | | | | | | | | | | | | | | Summary: X86TargetLowering::LowerAsmOperandForConstraint had better support than TargetLowering::LowerAsmOperandForConstraint for arbitrary depth getelementpointers for "i", "n", and "s" extended inline assembly constraints. Hoist its support from the derived class into the base class. Link: https://github.com/ClangBuiltLinux/linux/issues/469 Reviewers: echristo, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D61560 llvm-svn: 360604
* [TargetLowering] Add SimplifyDemandedBits support for ZERO_EXTEND_VECTOR_INREGSimon Pilgrim2019-05-131-0/+24
| | | | | | More work for PR39709. llvm-svn: 360592
* [DAGCombiner] narrow vector binop with inserts/extractSanjay Patel2019-05-131-1/+33
| | | | | | | | | | | | We catch most of these patterns (on x86 at least) by matching a concat vectors opcode early in combining, but the pattern may emerge later using insert subvector instead. The AVX1 diffs for add/sub overflow show another missed narrowing pattern. That one may be falling though the cracks because of combine ordering and multiple uses. llvm-svn: 360585
* Add constrained fptrunc and fpext intrinsics.Kevin P. Neal2019-05-137-29/+245
| | | | | | | | | | | The new fptrunc and fpext intrinsics are constrained versions of the regular fptrunc and fpext instructions. Reviewed by: Andrew Kaylor, Craig Topper, Cameron McInally, Conner Abbot Approved by: Craig Topper Differential Revision: https://reviews.llvm.org/D55897 llvm-svn: 360581
* TargetLowering::SimplifyDemandedBits - early-out for UNDEF ops. NFCI.Simon Pilgrim2019-05-131-3/+5
| | | | llvm-svn: 360579
* [DAGCombiner] Fix invalid alias analysis.Clement Courbet2019-05-131-3/+2
| | | | | | | | | | | | | | | | | | | | | Summary: When we know for sure whether two addresses do or do not alias, we should immediately return from DAGCombiner::isAlias(). I think this comes from a bad copy/paste, Sorry for not catching that during the code review. Fixes PR41855. Reviewers: niravd, gchatelet, EricWF Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61846 llvm-svn: 360566
* Recommit r358887 "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits ↵Craig Topper2019-05-131-1/+25
| | | | | | | | | | | | | | | | | | | | bitcast handling" I've included a new fix in X86RegisterInfo to prevent PR41619 without reintroducing r359392. We might be able to improve that in the base class implementation of shouldRewriteCopySrc somehow. But this hopefully enables forward progress on SimplifyDemandedBits improvements for now. Original commit message: This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGComb but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. llvm-svn: 360552
* [DAGCombiner] try to move bitcast after extract_subvectorSanjay Patel2019-05-121-0/+24
| | | | | | | | | | | | | | | | | I noticed that we were failing to narrow an x86 ymm math op in a case similar to the 'madd' test diff. That is because a bitcast is sitting between the math and the extract subvector and thwarting our pattern matching for narrowing: t56: v8i32 = add t59, t58 t68: v4i64 = bitcast t56 t73: v2i64 = extract_subvector t68, Constant:i64<2> t96: v4i32 = bitcast t73 There are a few wins and neutral diffs in the other tests. Differential Revision: https://reviews.llvm.org/D61806 llvm-svn: 360541
* [DAG] Add SimplifyDemandedBits support for BITREVERSESimon Pilgrim2019-05-111-0/+10
| | | | | | Pulled out of D58017 while I continue to investigate the BSWAP regression on PPC llvm-svn: 360534
* SelectionDAGISel::CodeGenAndEmitDAG - remove unused variable. NFCI.Simon Pilgrim2019-05-111-3/+0
| | | | llvm-svn: 360514
* Revert [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactorJordan Rupprecht2019-05-101-3/+2
| | | | | | This reverts r360171 (git commit a9d6c32eafc645c55b07eb50698c428e14c0bffd). A repro showing the asan/msan failures is forthcoming. llvm-svn: 360481
OpenPOWER on IntegriCloud