summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert r348843 "[CodeGen] Allow mempcy/memset to generate small overlapping ↵Clement Courbet2018-12-111-3/+5
| | | | | | | | stores." Breaks ARM/memcpy-inline.ll llvm-svn: 348844
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-111-5/+3
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 348843
* [DebugInfo] Emit undef DBG_VALUEs when SDNodes are optimised outJeremy Morse2018-12-101-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR39896, where dbg.value's of SDNodes that have been optimised out do not lead to "DBG_VALUE undef" instructions being created. Such undef instructions are necessary to terminate earlier variable ranges, otherwise variable values leak past the point where they're valid. The "invalidated" flag of SDDbgValue is currently being abused to mean two things: * The corresponding SDNode is now invalid * This SDDbgValue should not be emitted Of which there are several legitimate combinations of meaning: * The SDNode has been invalidated and we should emit "DBG_VALUE undef" * The SDNode has been invalidated but the debug data was salvaged, don't emit anything for this SDDbgValue * This SDDbgValue has been emitted This patch introduces distinct "Emitted" and "Invalidated" fields to the SDDbgValue class, updates users accordingly, and generates "undef" DBG_VALUEs for invalidated records. Awkwardly, there are circumstances where we emit SDDbgValue's twice, specifically DebugInfo/X86/dbg-addr-dse.ll which I've preserved. Differential Revision: https://reviews.llvm.org/D55372 llvm-svn: 348751
* [SelectionDAG] Remove ISD::ADDC/ADDE from some undef handling code in ↵Craig Topper2018-12-081-2/+0
| | | | | | | | getNode. NFCI These nodes should have two results. A real VT and a Glue. But this code would have returned Undef which would only be a single result. But we're in the single result version of getNode so these opcodes should never be seen by this function anyway. llvm-svn: 348670
* [SelectionDAG] Don't pass on DemandedElts when handling SCALAR_TO_VECTORSimon Pilgrim2018-12-071-1/+1
| | | | | | | | | | | | Fixes an assertion: llc: lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2200: llvm::KnownBits llvm::SelectionDAG::computeKnownBits(llvm::SDValue, const llvm::APInt&, unsigned int) const: Assertion `(!Op.getValueType().isVector() || NumElts == Op.getValueType().getVectorNumElements()) && "Unexpected vector size"' failed. Committed on behalf of: @pendingchaos (Rhys Perry) Differential Revision: https://reviews.llvm.org/D55223 llvm-svn: 348574
* [SelectionDAG] Split very large token factors for loads into 64k chunks.Amara Emerson2018-12-051-1/+4
| | | | | | | | | | | | | | | There's a 64k limit on the number of SDNode operands, and some very large functions with 64k or more loads can cause crashes due to this limit being hit when a TokenFactor with this many operands is created. To fix this, create sub-tokenfactors if we've exceeded the limit. No test case as it requires a very large function. rdar://45196621 Differential Revision: https://reviews.llvm.org/D55073 llvm-svn: 348324
* [SelectionDAG] fold constant with undef vector per elementSanjay Patel2018-12-021-10/+16
| | | | | | | | | | | | This makes the SDAG behavior consistent with the way we do this in IR. It's possible that we were getting the wrong answer before. For example, 'xor undef, undef --> 0' but 'xor undef, C' --> undef. But the most practical improvement is likely as shown in the tests here - for FP, we were overconstraining undef lanes to NaN, and that can prevent vector simplifications/narrowing (see D51553). llvm-svn: 348090
* [SelectionDAG] fold FP binops with 2 undef operands to undefSanjay Patel2018-11-301-2/+4
| | | | llvm-svn: 348016
* [SelectionDAG] move constant or splat functions to common locationSanjay Patel2018-11-251-2/+16
| | | | | | | | rL347502 moved the null sibling, so we should group all of these together. I'm not sure why these aren't methods of the SDValue class itself, but that's another patch if that's possible. llvm-svn: 347523
* [DAG] consolidate shift simplificationsSanjay Patel2018-11-231-7/+34
| | | | | | | | | | ...and use them to avoid creating obviously undef values as discussed in the post-commit thread for r347478. The diffs in vector div/rem show that we were missing real optimizations by creating bogus shift nodes. llvm-svn: 347502
* Implement computeKnownBits for scalar_to_vectorStanislav Mekhanoshin2018-11-191-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D54728 llvm-svn: 347274
* [SelectionDAG] simplify vector select with undef operand(s)Sanjay Patel2018-11-191-2/+10
| | | | llvm-svn: 347227
* [SelectionDAG] simplify select FP with undef conditionSanjay Patel2018-11-191-1/+1
| | | | llvm-svn: 347212
* [SelectionDAG] add simplifySelect() to reduce code duplication; NFCSanjay Patel2018-11-191-18/+25
| | | | | | This should be extended to handle FP and vectors in follow-up patches. llvm-svn: 347210
* [DAG] add undef simplifications for select nodesSanjay Patel2018-11-181-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sadly, this duplicates (twice) the logic from InstSimplify. There might be some way to at least share the DAG versions of the code, but copying the folds seems to be the standard method to ensure that we don't miss these folds. Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no way to ensure that we do these kinds of simplifications unless the code is repeated at node creation time and during combines. There were other tests that would become worthless with this improvement that I changed as pre-commits: rL347161 rL347164 rL347165 rL347166 rL347167 I'm not sure how to salvage the remaining tests (diffs in this patch). So the x86 tests verify that the new code is working as intended. The AMDGPU test is actually similar to my motivating case: we have some undef value that has survived to machine IR in an x86 test, and then it gets folded in some weird way, or we crash if we don't transfer the undef flag. But we would have been better off never getting to that point by doing these simplifications. This will lead back to PR32023 someday... https://bugs.llvm.org/show_bug.cgi?id=32023 llvm-svn: 347170
* [SelectionDAG] simplify code; NFCSanjay Patel2018-11-181-6/+5
| | | | llvm-svn: 347160
* Use llvm::copy. NFCFangrui Song2018-11-171-3/+3
| | | | llvm-svn: 347126
* [SelectionDAG][X86] Relax restriction on the width of an input to ↵Craig Topper2018-11-131-3/+2
| | | | | | | | | | | | | | | | | | *_EXTEND_VECTOR_INREG. Use them and regular *_EXTEND to replace the X86 specific VSEXT/VZEXT opcodes Previously, the extend_vector_inreg opcode required their input register to be the same total width as their output. But this doesn't match up with how the X86 instructions are defined. For X86 the input just needs to be a legal type with at least enough elements to cover the output. This patch weakens the check on these nodes and allows them to be used as long as they have more input elements than output elements. I haven't changed type legalization behavior so it will still create them with matching input and output sizes. X86 will custom legalize these nodes by shrinking the input to be a 128 bit vector and once we've done that we treat them as legal operations. We still have one case during type legalization where we must custom handle v64i8 on avx512f targets without avx512bw where v64i8 isn't a legal type. In this case we will custom type legalize to a *extend_vector_inreg with a v16i8 input. After that the input is a legal type so type legalization should ignore the node and doesn't need to know about the relaxed restriction. We are no longer allowed to use the default expansion for these nodes during vector op legalization since the default expansion uses a shuffle which required the widths to match. Custom legalization for all types will prevent us from reaching the default expansion code. I believe DAG combine works correctly with the released restriction because it doesn't check the number of input elements. The rest of the patch is changing X86 to use either the vector_inreg nodes or the regular zero_extend/sign_extend nodes. I had to add additional isel patterns to handle any_extend during isel since simplifydemandedbits can create them at any time so we can't legalize to zero_extend before isel. We don't yet create any_extend_vector_inreg in simplifydemandedbits. Differential Revision: https://reviews.llvm.org/D54346 llvm-svn: 346784
* [SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFCCraig Topper2018-11-091-2/+2
| | | | | | gcc wants parentheses around the logical OR since there is a logical AND for the string. llvm-svn: 346564
* [SelectionDAG] Assert on the width of DemandedElts argument to ↵Craig Topper2018-11-081-2/+3
| | | | | | | | computeKnownBits for all vector typed operations not just build_vector. Fix AArch64 unit test that fails with the assertion added. llvm-svn: 346437
* [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsicsCameron McInally2018-11-051-0/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D53411 llvm-svn: 346141
* [SelectionDAG] Remove special methods for creating *_EXTEND_VECTOR_INREG ↵Craig Topper2018-11-041-33/+11
| | | | | | | | | | nodes. Move asserts into getNode. These methods were just wrappers around getNode with additional asserts (identical and repeated 3 times). But getNode already has a switch that can be used to hold these asserts that allows them to be shared for all 3 opcodes. This also enables checking on the places that create these nodes without using the wrappers. The rest of the patch is just changing all callers to use getNode directly. llvm-svn: 346087
* [FPEnv] [FPEnv] Add constrained intrinsics for MAXNUM and MINNUMCameron McInally2018-10-301-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D53216 llvm-svn: 345650
* [SelectionDAG] fix build warning for mismatched signs in compare; NFCSanjay Patel2018-10-301-1/+1
| | | | llvm-svn: 345598
* [SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodesSimon Pilgrim2018-10-301-0/+58
| | | | | | | | | | Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy. This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle. Differential Revision: https://reviews.llvm.org/D53760 llvm-svn: 345578
* [SelectionDAG] Fix bad indentation. NFCCraig Topper2018-10-281-4/+4
| | | | llvm-svn: 345481
* [LegalizeTypes] Stop DAGTypeLegalizer::getSETCCWidenedResultTy from creating ↵Craig Topper2018-10-261-0/+8
| | | | | | | | | | | | illegal setccs. Add checks for valid setccs The DAGTypeLegalizer::getSETCCWidenedResultTy was widening the MaskVT, but the code in convertMask called after getSETCCWidenedResultTy had no idea this widening had occurred. So none of the operands were widened when convertMask created new setccs with the widened VT. This patch removes the widening and adds some asserts to getNode to validate the types of setccs to prevent issues like this in the future. Differential Revision: https://reviews.llvm.org/D53743 llvm-svn: 345428
* [NFC] Rename minnan and maxnan to minimum and maximumThomas Lively2018-10-241-3/+2
| | | | | | | | | | | | | | | Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218
* SelectionDAG: Reuse bigger sized constants in memset expansion.Matthias Braun2018-10-231-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | When implementing memset's today we often see this pattern: $x0 = MOV 0xXYXYXYXYXYXYXYXY store $x0, ... $w1 = MOV 0xXYXYXYXY store $w1, ... We first create a 64bit constant in a 64bit register with all bytes the same and then create a 32bit constant with all bytes the same in a 32bit register. In many targets we could just access the lower byte of the 64bit register instead. - Ideally this would be handled by the ConstantHoist pass but it runs too early when memset isn't expanded yet. - The memset expansion code already had this optimization implemented, however SelectionDAG constantfolding would constantfold the "trunc(bigconstnat)" pattern to "smallconstant". - This patch makes the memset expansion mark the constant as Opaque and stop DAGCombiner from constant folding in this situation. (Similar to how ConstantHoisting marks things as Opaque to avoid folding ADD/SUB/etc.) Differential Revision: https://reviews.llvm.org/D53181 llvm-svn: 345102
* DAG: Change behavior of fminnum/fmaxnum nodesMatt Arsenault2018-10-221-2/+24
| | | | | | | | | | | Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914
* [SelectionDAG] allow undefs when matching splat constantsSanjay Patel2018-10-051-7/+4
| | | | | | | | | And use that to transform fsub with zero constant operands. The integer part isn't used yet, but it is proposed for use in D44548, so adding both enhancements here makes that patch simpler. llvm-svn: 343865
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)Fangrui Song2018-09-271-1/+1
| | | | | | | | | | | | Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163
* Run VerifyDAGDiverence in debug onlyMikael Nilsson2018-09-261-0/+2
| | | | | | | | | VerifyDAGDiverence costs compilation time, avoid running it in non-debug builds. Differential Revision: https://reviews.llvm.org/D52454 llvm-svn: 343086
* [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449)Sanjay Patel2018-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749 We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op. The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence: t29: v8i32 = concat_vectors t27, t28 t30: v4i64 = bitcast t29 t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ... t31: v4i64 = bitcast t18 t32: v4i64 = xor t30, t31 t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ... t34: v4i64 = bitcast t9 t35: v4i64 = and t32, t34 t36: v8i32 = bitcast t35 t37: v4i32 = extract_subvector t36, Constant:i64<0> t38: v4i32 = extract_subvector t36, Constant:i64<4> Differential Revision: https://reviews.llvm.org/D52318 llvm-svn: 343008
* [SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCISanjay Patel2018-09-201-0/+12
| | | | | | | | | | | | | | x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant. Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended. I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps. Differential Revision: https://reviews.llvm.org/D52285 llvm-svn: 342669
* [SelectionDAG] allow vector types with isBitwiseNot()Sanjay Patel2018-09-191-1/+4
| | | | | | | The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits, and the test diff in add.ll is from a DAGCombiner transform. llvm-svn: 342594
* Fix debug info for SelectionDAG legalization of DAG nodes with two results.Adrian Prantl2018-09-141-1/+1
| | | | | | | | | | | | | | | | This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When an replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching them to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious (cf. the call site code in ReplaceNode()). rdar://problem/44162227 Differential Revision: https://reviews.llvm.org/D52112 llvm-svn: 342264
* [CodeGen] Fix remaining zext() assertions in SelectionDAGScott Linder2018-09-041-2/+2
| | | | | | | | Fix remaining cases not committed in https://reviews.llvm.org/D49574 Differential Revision: https://reviews.llvm.org/D50659 llvm-svn: 341380
* DAG: Handle extract_vector_elt in isKnownNeverNaNMatt Arsenault2018-09-031-0/+3
| | | | llvm-svn: 341317
* [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysisNicolai Haehnle2018-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
* [X86] Support v2i32 gather/scatter indices with ↵Craig Topper2018-08-291-2/+2
| | | | | | | | | | | | | | | | -x86-experimental-vector-widening-legalization Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp Reviewers: RKSimon, spatel, efriedma Reviewed By: efriedma Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D51337 llvm-svn: 340891
* [DAG] Avoid recomputing Divergence checks. NFCI.Nirav Dave2018-08-281-6/+10
| | | | | | | When making multiple updates to the same SDNode, recompute node divergence only once after all changes have been made. llvm-svn: 340852
* [DAG] Fix updateDivergence calculationNirav Dave2018-08-281-1/+1
| | | | | | | Check correct SDNode when deciding if we should update the divergence property. llvm-svn: 340851
* [SelectionDAG][X86] Reorder the operands the MaskedStoreSDNode to put the ↵Craig Topper2018-08-251-3/+3
| | | | | | | | | | | | | | | | | | | | | value first. Summary: Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions. This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly. X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree. Reviewers: RKSimon, delena, hfinkel, eli.friedman Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50402 llvm-svn: 340689
* [SDAG] Add versions of computeKnownBits that return a valueJustin Bogner2018-08-241-93/+81
| | | | | | | | | | | Having the KnownBits as an output parameter is kind of awkward to use and a holdover from when it was two separate APInts. Instead, just return a KnownBits object. I'm leaving the existing interface in place for now, since updating the callers all at once would be thousands of lines of diff. llvm-svn: 340594
* [SelectionDAG] Reuse the Op's VT. NFCI.Simon Pilgrim2018-08-201-2/+2
| | | | llvm-svn: 340173
* [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-201-2/+16
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. Handle the case where the sign bit extends to only part of the small elements. llvm-svn: 340169
* [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-191-1/+8
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements. llvm-svn: 340143
* DAG: Fix isKnownNeverNaN for basic non-sNaN casesMatt Arsenault2018-08-171-12/+7
| | | | | | | fadd/fsub/fmul need to worry about infinities as well as fdiv. llvm-svn: 340085
* [SDAG] Remove the reliance on MI's allocation strategy forChandler Carruth2018-08-141-1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be *surprised* at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
OpenPOWER on IntegriCloud