path: root/llvm/lib/CodeGen/SelectionDAG
Commit message | Author | Age | Files | Lines
...
* [DAGCombiner] Add missing one use check on the shuffle in the ↵Craig Topper2018-12-311-1/+1
| | | | | | | | bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform. Found while trying out some other changes so I don't really have a test case. llvm-svn: 350172
* [PowerPC] Fix ADDE, SUBE do not know how to promote operatorKang Zhang2018-12-301-0/+5
| | | | | | | | | | | | | | Summary: This patch is created to fix the Bugzilla bug 39815: https://bugs.llvm.org/show_bug.cgi?id=39815 This patch is to support promotion integer result for the instruction ADDE, SUBE. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D56119 llvm-svn: 350161
* Add vtable anchor to classes.Richard Trieu2018-12-291-0/+2
| | | | llvm-svn: 350142
* [NVPTX] Allow libcalls that are defined in the current module.Justin Lebar2018-12-261-0/+26
    This patch makes it possible to emit library calls on NVPTX. Library functions must be defined within the current module, which should guarantee that we produce valid PTX assembly (without calls to undefined functions). Anyone who wants to use the libcalls will probably have to link against compiler-rt or another implementation.

    Currently, it's completely impossible to make library calls because of the error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify that the function definition is available.

    There was also an issue with the DAG during legalisation. When we expand an instruction into a libcall, the inner call-chain isn't "integrated" into the outer chain. Since the last "data-flow" (call retval load) node is located earlier in the call-chain than the CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is removed quite quickly). The solution proposed here relies on additional data-flow pseudo nodes (ProxyReg) whose only purpose is to keep CALLSEQ_END alive during the legalisation and instruction selection phases - we remove the pseudo instructions before the register scheduling phase.

    Patch by Denys Zariaiev!

    Differential Revision: https://reviews.llvm.org/D34708
    llvm-svn: 350069
* [X86] Use GetDemandedBits to simplify the operands of PMULDQ/PMULUDQ.Craig Topper2018-12-241-0/+9
| | | | | | | | | | | | | | This is an alternative to what I attempted in D56057. GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users. I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits. Based on a patch that Simon Pilgrim sent me in email. Fixes PR40142. llvm-svn: 350059
* [SelectionDAGBuilder] Use ::precise LocationSizes; NFCGeorge Burgess IV2018-12-241-11/+23
| | | | | | | | | | | | | | | More migration so we can disable the implicit int -> LocationSize conversion. All of these are either scatter/gather'ed vector instructions, or direct loads. Hence, they're all precise. Perhaps if we see way more getTypeStoreSize calls, we can make a getTypeStoreLocationSize (or similar) as a wrapper that applies this ::precise. Doesn't appear that it's a good idea to make getTypeStoreSize return a LocationSize itself, however. llvm-svn: 350042
* [DAGCombiner] limit shuffle to extend transform (PR40146)Sanjay Patel2018-12-231-4/+5
| | | | | | | | | | It's dangerous to knowingly create an illegal vector type no matter what stage of combining we're in. This prevents the missed folding/scalarization seen in: https://bugs.llvm.org/show_bug.cgi?id=40146 llvm-svn: 350034
* [DAGCombiner] allow hoisting vector bitwise logic ahead of extendsSanjay Patel2018-12-231-6/+5
| | | | llvm-svn: 350032
* [DAGCombiner] allow narrowing of add followed by truncateSanjay Patel2018-12-221-2/+1
| | | | | | | | | | | | | | | trunc (add X, C ) --> add (trunc X), C' If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type. This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine). This change used to show regressions for x86, but those are gone after D55494. This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) that does almost the same thing. Differential Revision: https://reviews.llvm.org/D55866 llvm-svn: 350006
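The narrowing above is sound because addition modulo 2^n only depends on the low n bits of its operands. A minimal Python sketch of that argument, modelling fixed-width integers with masks (the helper names `trunc`, `add`, `wide_add_then_trunc` and `narrow_add` are illustrative only and are not LLVM APIs):

```python
def trunc(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, like an integer truncate."""
    return value & ((1 << bits) - 1)


def add(a: int, b: int, bits: int) -> int:
    """Wrapping addition in a `bits`-wide integer type."""
    return trunc(a + b, bits)


def wide_add_then_trunc(x: int, c: int, wide: int = 32, narrow: int = 8) -> int:
    """trunc (add X, C): add in the wide type, then discard the top bits."""
    return trunc(add(x, c, wide), narrow)


def narrow_add(x: int, c: int, narrow: int = 8) -> int:
    """add (trunc X), C': truncate both operands first, then add narrowly."""
    return add(trunc(x, narrow), trunc(c, narrow), narrow)
```

For example, `wide_add_then_trunc(0x12345678, 0x1FF)` and `narrow_add(0x12345678, 0x1FF)` both give `0x77`; here C' is simply C truncated to the narrow type.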
* [DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFCSanjay Patel2018-12-211-6/+5
| | | | llvm-svn: 349958
* [SelectionDAG] Always use the version of computeKnownBits that returns a ↵Simon Pilgrim2018-12-215-27/+16
| | | | | | | | value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349907
* [ARM] Complete the Thumb1 shift+and->shift+shift transforms.Eli Friedman2018-12-201-1/+2
| | | | | | | | | | | | | | This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857
* [SelectionDAGBuilder] Enable funnel shift building to custom rotatesSimon Pilgrim2018-12-201-4/+2
| | | | | | | | | | This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations. AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK. Differential Revision: https://reviews.llvm.org/D55747 llvm-svn: 349765
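The funnel shift to rotate mapping rests on the fact that a funnel shift whose two data operands are the same value is a rotate, with the shift amount taken modulo the bit width. A small Python model of that relationship, assuming the usual definition of a left funnel shift (concatenate the two operands, shift left, keep the high half); `fshl` and `rotl` here are plain illustrative functions, not SelectionDAG code:

```python
def fshl(hi: int, lo: int, amount: int, bits: int = 8) -> int:
    """Left funnel shift: shift the 2*bits-wide value hi:lo left, keep the top half."""
    mask = (1 << bits) - 1
    amount %= bits  # the shift amount is interpreted modulo the bit width
    concat = ((hi & mask) << bits) | (lo & mask)
    return ((concat << amount) >> bits) & mask


def rotl(x: int, amount: int, bits: int = 8) -> int:
    """Rotate left, also with the amount taken modulo the bit width."""
    mask = (1 << bits) - 1
    amount %= bits
    x &= mask
    return ((x << amount) | (x >> (bits - amount))) & mask
```

With both data operands equal, `fshl(x, x, n)` matches `rotl(x, n)` for every amount, including amounts at or above the bit width, which is why modulo handling on the target side matters.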
* [DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand.Craig Topper2018-12-201-1/+1
| | | | llvm-svn: 349726
* [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵Simon Pilgrim2018-12-191-4/+4
| | | | | | | | | | | | | | (part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629
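The fold mentioned above, (or (and X, c1), c2) -> (and (or X, c2), c1|c2), follows from OR distributing over AND. A brute-force Python check over a 4-bit domain, written as a sketch with made-up helper names rather than anything from DAGCombiner:

```python
def before_fold(x: int, c1: int, c2: int) -> int:
    """(or (and X, c1), c2)"""
    return (x & c1) | c2


def after_fold(x: int, c1: int, c2: int) -> int:
    """(and (or X, c2), c1|c2)"""
    return (x | c2) & (c1 | c2)


def fold_is_sound(bits: int = 4) -> bool:
    """Exhaustively compare both forms for every X, c1, c2 of the given width."""
    values = range(1 << bits)
    return all(
        before_fold(x, c1, c2) == after_fold(x, c1, c2)
        for x in values
        for c1 in values
        for c2 in values
    )
```

This only checks fully defined constants; the UNDEF-element reasoning in the commit message is a separate argument that this sketch does not model.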
* [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵Simon Pilgrim2018-12-191-6/+13
| | | | | | | | | | | | (part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628
* [TargetLowering] Fix propagation of undefs in zero extension ops (PR40091)Simon Pilgrim2018-12-192-4/+23
| | | | | | | | | | | | As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect, as the output is guaranteed to at least have the new upper bits set to zero. SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625
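To see why zext(undef) cannot simply become undef: whatever narrow value the undef input takes, the extended result always has zero upper bits, whereas an undef of the wide type could have anything there. A Python sketch of that reasoning, treating undef as "any value of the type" (an assumption of this model; the function names are illustrative):

```python
def zext(value: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend: the low from_bits are kept, the new upper bits are zero."""
    assert from_bits <= to_bits
    return value & ((1 << from_bits) - 1)


def zext_results_of_undef(from_bits: int, to_bits: int) -> set:
    """Every result zext can produce when its input may be any narrow value."""
    return {zext(v, from_bits, to_bits) for v in range(1 << from_bits)}


def is_valid_zext_result(result: int, from_bits: int) -> bool:
    """A value is a possible zext result only if its upper bits are all zero."""
    return result >> from_bits == 0
```

Extending from 4 to 8 bits, every member of `zext_results_of_undef(4, 8)` passes `is_valid_zext_result`, but a wide value such as `0xF0` does not, so it must not be allowed to appear. A constant zero is one result that is always valid.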
* [SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicateSimon Pilgrim2018-12-191-4/+13
| | | | | | | | | | | | Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616
* Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering ↵Pete Cooper2018-12-181-50/+0
| | | | | | | | | | instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552
* [SelectionDAG][X86] Fix [US](ADD|SUB)SAT vector legalization, add testsNikita Popov2018-12-182-2/+6
| | | | | | | | | Integer result promotion needs to use the scalar size, and we need support for result widening. This is in preparation for D55787. llvm-svn: 349480
* [TargetLowering] Fallback from SimplifyDemandedVectorElts to ↵Simon Pilgrim2018-12-181-1/+8
| | | | | | | | SimplifyDemandedBits For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well. llvm-svn: 349466
* [SDAG] Clarify the origin of chain in REG_SEQUENCE in comment, NFCKrzysztof Parzyszek2018-12-171-1/+3
| | | | llvm-svn: 349391
* [SelectionDAG] Fix noop detection for vectors in AssertZext/AssertSext in ↵Craig Topper2018-12-171-2/+2
| | | | | | | | | | | | getNode The assertion type is always supposed to be a scalar type. So if the result VT of the assertion is a vector, we need to get the scalar VT before we can compare them. Similarly for the assert above it. I don't have a test case because I don't know of any place we violate this today. A coworker found this while trying to use r347287 on the 6.0 branch without also having r336868 llvm-svn: 349390
* NFC: remove unused variableJF Bastien2018-12-171-1/+0
| | | | | | D55768 removed its use. llvm-svn: 349377
* [TargetLowering] Add DemandedElts mask to SimplifyDemandedBits (PR40000)Simon Pilgrim2018-12-171-42/+120
| | | | | | | | | | This is an initial patch to add the necessary support for a DemandedElts argument to SimplifyDemandedBits, more closely matching computeKnownBits and to help improve vector codegen. I've added only a small amount of the changes necessary to get at least one test to update - a lot more can be done but I'd like to add these methodically with proper test coverage, at the same time the hope is to slowly move some/all of SimplifyDemandedVectorElts into SimplifyDemandedBits as well. Differential Revision: https://reviews.llvm.org/D55768 llvm-svn: 349374
* FastIsel: take care to update iterators when removing instructions.Tim Northover2018-12-171-0/+9
| | | | | | | | | | We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointing at dangling memory. llvm-svn: 349365
* [DAGCombiner] allow hoisting vector bitwise logic ahead of truncatesSanjay Patel2018-12-161-5/+2
| | | | | | | | | | | | | | | | | | The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y) There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch. Differential Revision: https://reviews.llvm.org/D55448 llvm-svn: 349303
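The hoist is value-preserving because AND, OR and XOR act on each bit independently, so the low bits of the wide result equal the result of operating on the low bits. A short Python sketch of the equivalence (the names `narrow_logic` and `wide_logic_then_trunc` are hypothetical and exist only for this illustration):

```python
import operator

LOGIC_OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def trunc(value: int, bits: int) -> int:
    """Keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)


def narrow_logic(op: str, x: int, y: int, bits: int) -> int:
    """logic_op (truncate x), (truncate y)"""
    return LOGIC_OPS[op](trunc(x, bits), trunc(y, bits))


def wide_logic_then_trunc(op: str, x: int, y: int, bits: int) -> int:
    """truncate (logic_op x, y)"""
    return trunc(LOGIC_OPS[op](x, y), bits)
```

The sketch says nothing about profitability; the "other checks" the commit message refers to decide when the wider operation is actually worth doing.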
* [SelectionDAG] Add FSHL/FSHR support to computeKnownBitsSimon Pilgrim2018-12-162-2/+37
| | | | | | Also exposes an issue in DAGCombiner::visitFunnelShift where we were assuming the shift amount had the result type (after legalization it'll have the target's shift amount type). llvm-svn: 349298
* [TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorEltsSimon Pilgrim2018-12-151-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D55600 llvm-svn: 349264
* [SDAG] Ignore chain operand in REG_SEQUENCE when emitting instructionsKrzysztof Parzyszek2018-12-141-0/+4
| | | | llvm-svn: 349186
* [DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext ↵Craig Topper2018-12-141-15/+18
| | | | | | | | | | | | | | | (setcc)) already has the target desired type for the setcc Summary: If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes VsetCC to be CSEd to N0 and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node. To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55459 llvm-svn: 349137
* [DAGCombiner] clean up visitEXTRACT_VECTOR_ELTSanjay Patel2018-12-141-138/+129
| | | | | | | | | | | | | | | | | | This isn't quite NFC, but I don't know how to expose any outward diffs from these changes. Mostly, this was confusing because it used 'VT' to refer to the operand type rather than the usual type of the input node. There's also a large block at the end that is dedicated solely to matching loads, but that wasn't obvious. This could probably be split up into separate functions to make it easier to see. It's still not clear to me when we make certain transforms because the legality and constant conditions are intertwined in a way that might be improved. llvm-svn: 349095
* [DAGCombiner] after simplifying demanded elements of vector operand of ↵Sanjay Patel2018-12-131-1/+6
| | | | | | | | | | | extract, revisit the extract; 2nd try This is a retry of rL349051 (reverted at rL349056). I changed the check for dead-ness from number of uses to an opcode test for DELETED_NODE based on existing similar code. Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349058
* revert rL349051: [DAGCombiner] after simplifying demanded elements of vector ↵Sanjay Patel2018-12-131-6/+1
| | | | | | | | | operand of extract, revisit the extract This causes an address sanitizer bot failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/27187/steps/check-llvm%20asan/logs/stdio llvm-svn: 349056
* [DAGCombiner] after simplifying demanded elements of vector operand of ↵Sanjay Patel2018-12-131-1/+6
| | | | | | | | extract, revisit the extract Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349051
* [DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombinerSimon Pilgrim2018-12-131-0/+7
| | | | | | Remove common code from custom lowering (code is still safe if somehow a zero value gets used). llvm-svn: 349028
* [TargetLowering] Add ISD::ROTL/ROTR vector expansionSimon Pilgrim2018-12-133-40/+62
| | | | | | | | | | Move existing rotation expansion code into TargetLowering and set it up for vectors as well. Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment. Begin removing x86 vector rotate custom lowering to use the expansion. llvm-svn: 349025
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-131-5/+3
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 349016
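As an illustration of what "small overlapping stores" means: a 7-byte copy can be done with two 4-byte stores at offsets 0 and 3 instead of a 4-byte, a 2-byte and a 1-byte store. The Python sketch below plans stores both ways; it is a toy greedy planner written for this note, not the algorithm LLVM uses, and `plan_stores` and `copy_with` are hypothetical names:

```python
def plan_stores(size: int, max_width: int = 8, allow_overlap: bool = False) -> list:
    """Return (offset, width) stores that together cover `size` bytes."""
    stores = []
    offset = 0
    width = max_width
    while offset < size:
        remaining = size - offset
        while width > remaining:
            if allow_overlap and size >= width:
                # Back up so one full-width store ends exactly at the last byte.
                offset = size - width
                break
            width //= 2  # otherwise fall back to a narrower store
        stores.append((offset, width))
        offset += width
    return stores


def copy_with(stores: list, src: bytes) -> bytes:
    """Apply the planned stores to copy `src` into a fresh buffer."""
    dst = bytearray(len(src))
    for offset, width in stores:
        dst[offset:offset + width] = src[offset:offset + width]
    return bytes(dst)
```

Overlap is harmless here because the overlapping bytes are written twice with the same data. Whether the misaligned wide store is actually cheap is what the target's `Fast` answer, mentioned in the summary, is meant to model.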
* [SelectionDAG] Add a generic isSplatValue functionSimon Pilgrim2018-12-121-0/+96
| | | | | | | | | | | | | | This patch introduces a generic function to determine whether a given vector type is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns. It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller. A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS). I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection. Differential Revision: https://reviews.llvm.org/D55426 llvm-svn: 348953
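The contract described above (ignore non-demanded elements, track UNDEF elements, and report a splat when every demanded element is UNDEF) can be sketched for the simplest case, a constant BUILD_VECTOR. This Python model uses `None` for UNDEF and a list of booleans for the demanded mask; it is a simplified illustration under those assumptions, not the recursive SelectionDAG implementation:

```python
def is_splat_value(elements: list, demanded: list):
    """Return (is_splat, undef_mask) for a vector given as a list of values.

    `None` stands for an UNDEF element. Only demanded, defined elements
    have to agree; UNDEF elements are reported back through undef_mask.
    """
    undef_mask = [element is None for element in elements]
    splat = None
    seen_defined = False
    for element, wanted in zip(elements, demanded):
        if not wanted or element is None:
            continue  # non-demanded and UNDEF lanes never break a splat
        if not seen_defined:
            splat, seen_defined = element, True
        elif element != splat:
            return False, undef_mask
    # Also True when every demanded element is UNDEF; the caller must check.
    return True, undef_mask
```

For instance, `[5, None, 5, 7]` is a splat when only the first three lanes are demanded, and the returned mask tells the caller that lane 1 was UNDEF.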
* [TargetLowering] Add ISD::AND handling to SimplifyDemandedVectorEltsSimon Pilgrim2018-12-121-0/+16
| | | | | | | | If either of the operand elements are zero then we know the result element is going to be zero (even if the other element is undef). Differential Revision: https://reviews.llvm.org/D55558 llvm-svn: 348926
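The rule is that a known-zero lane on either side forces a zero result lane, regardless of what the other side holds. A lane-by-lane Python sketch, again using `None` for an UNDEF element (the helper `and_elements` is hypothetical and only mirrors the statement in the commit message):

```python
def and_elements(lhs: list, rhs: list) -> list:
    """Element-wise AND where None models an UNDEF lane."""
    result = []
    for a, b in zip(lhs, rhs):
        if a == 0 or b == 0:
            result.append(0)  # zero wins even against UNDEF
        elif a is None or b is None:
            result.append(None)  # nothing is known about this lane
        else:
            result.append(a & b)
    return result
```

So `and_elements([0, None, 3, None], [None, 0, 5, 7])` yields `[0, 0, 1, None]`: the first two lanes are zero despite an UNDEF operand, and only the last lane stays UNDEF.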
* [Intrinsic] Signed Fixed Point Multiplication IntrinsicLeonard Chan2018-12-128-5/+239
| | | | | | | | | | | | Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D54719 llvm-svn: 348912
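Conceptually, multiplying two fixed point numbers with the same scale means forming the full-width product and shifting it right by the scale. A minimal Python sketch of that arithmetic; the function name `smul_fix`, the floor rounding of the shift, and the wrap-around on overflow are assumptions of this sketch rather than statements about the intrinsic's exact semantics:

```python
def smul_fix(a: int, b: int, scale: int, bits: int = 32) -> int:
    """Signed fixed point multiply of two values with `scale` fractional bits."""
    product = a * b            # Python ints are unbounded, so this is the full product
    result = product >> scale  # drop the extra fractional bits (rounds toward -inf)
    # Wrap into a signed `bits`-wide integer (assumed overflow behaviour).
    result &= (1 << bits) - 1
    if result >= 1 << (bits - 1):
        result -= 1 << bits
    return result


def to_fixed(value: float, scale: int) -> int:
    """Convert an exactly representable float to fixed point."""
    return int(value * (1 << scale))
```

With 16 fractional bits, `smul_fix(to_fixed(1.5, 16), to_fixed(2.25, 16), 16)` equals `to_fixed(3.375, 16)`, that is 221184.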
* Revert r348843 "[CodeGen] Allow mempcy/memset to generate small overlapping ↵Clement Courbet2018-12-111-3/+5
| | | | | | | | stores." Breaks ARM/memcpy-inline.ll llvm-svn: 348844
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-111-5/+3
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 348843
* [TargetLowering] Add ISD::EXTRACT_VECTOR_ELT support to SimplifyDemandedBitsSimon Pilgrim2018-12-111-0/+19
| | | | | | | | Let SimplifyDemandedBits attempt to simplify all elements of a vector extraction. Part of PR39689. llvm-svn: 348839
* [TargetLowering] Add UNDEF folding to SimplifyDemandedVectorEltsSimon Pilgrim2018-12-101-1/+6
| | | | | | | | | | If all the demanded elements of the SimplifyDemandedVectorElts are known to be UNDEF, we can simplify to an ISD::UNDEF node. Zero constant folding will be handled in a future patch - it's a little trickier as we often have bitcasted zero values. Differential Revision: https://reviews.llvm.org/D55511 llvm-svn: 348784
* [DAGCombiner] Remove unnecessary recursive ↵Simon Pilgrim2018-12-101-6/+0
| | | | | | | | DAGCombiner::visitINSERT_SUBVECTOR call. As discussed on D55511, this caused an issue if the inner node deletes a node that the outer node depends upon. As it doesn't affect any lit-tests and I've only been able to expose this with the D55511 change I'm committing this now. llvm-svn: 348781
* [DAGCombiner] Use the result value type in visitCONCAT_VECTORSFrancis Visoiu Mistrih2018-12-101-1/+1
    This triggers an assert when combining concat_vectors of a bitcast of merge_values.

    With asserts disabled, it fails to select:

    fatal error: error in backend: Cannot select: 0x7ff19d000e90: i32 = any_extend 0x7ff19d000ae8
      0x7ff19d000ae8: f64,ch = CopyFromReg 0x7ff19d000c20:1, Register:f64 %1
        0x7ff19d000b50: f64 = Register %1
    In function: d

    Differential Revision: https://reviews.llvm.org/D55507
    llvm-svn: 348759
* [DebugInfo] Don't drop dbg.value's of nullptrJeremy Morse2018-12-102-1/+5
| | | | | | | | | | | | | | | | | | | Currently, dbg.value's of "nullptr" are dropped when entering a SelectionDAG -- apparently just because of an oversight when recognising Values that are constant (see PR39787). This patch adds ConstantPointerNull to the list of constants that can be turned into DBG_VALUEs. The matter of what bit-value a null pointer constant in LLVM has was raised in this mailing list thread: http://lists.llvm.org/pipermail/llvm-dev/2018-December/128234.html Where it transpires LLVM relies on (IR) null pointers being zero valued, thus I've baked this assumption into the patch. Differential Revision: https://reviews.llvm.org/D55227 llvm-svn: 348753
* [DebugInfo] Emit undef DBG_VALUEs when SDNodes are optimised outJeremy Morse2018-12-105-11/+40
    This is a fix for PR39896, where dbg.value's of SDNodes that have been optimised out do not lead to "DBG_VALUE undef" instructions being created. Such undef instructions are necessary to terminate earlier variable ranges, otherwise variable values leak past the point where they're valid.

    The "invalidated" flag of SDDbgValue is currently being abused to mean two things:
     * The corresponding SDNode is now invalid
     * This SDDbgValue should not be emitted

    Of which there are several legitimate combinations of meaning:
     * The SDNode has been invalidated and we should emit "DBG_VALUE undef"
     * The SDNode has been invalidated but the debug data was salvaged, don't emit anything for this SDDbgValue
     * This SDDbgValue has been emitted

    This patch introduces distinct "Emitted" and "Invalidated" fields to the SDDbgValue class, updates users accordingly, and generates "undef" DBG_VALUEs for invalidated records. Awkwardly, there are circumstances where we emit SDDbgValue's twice, specifically DebugInfo/X86/dbg-addr-dse.ll which I've preserved.

    Differential Revision: https://reviews.llvm.org/D55372
    llvm-svn: 348751
* [DAGCombiner] re-enable truncation of binopsSanjay Patel2018-12-081-12/+7
    This is effectively re-committing the changes from:
    rL347917 (D54640)
    rL348195 (D55126)
    ...which were effectively reverted here:
    rL348604
    ...because the code had a bug that could induce infinite looping or eventual out-of-memory compilation.

    The bug was that this code did not guard against transforming opaque constants. More details are in the post-commit mailing list thread for r347917. A reduced test for that is included in the x86 bool-math.ll file. (I wasn't able to reduce a PPC backend test for this, but it was almost the same pattern.)

    Original commit message for r347917:

    The motivating case for this is shown in:
    https://bugs.llvm.org/show_bug.cgi?id=32023
    and the corresponding rot16.ll regression tests.

    Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR.

    As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch.

    llvm-svn: 348706