summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* DAG: fp->int conversion for non-splat constants.Jim Grosbach2014-07-231-12/+11
| | | | | | | | | | Constant fold the lanes of the input constant build_vector individually so we correctly handle when the vector elements are not all the same constant value. PR20394 llvm-svn: 213798
* [SDAG] Refactor the code for inserting a newly allocated SDNode into theChandler Carruth2014-07-221-96/+86
| | | | | | | | | | | DAG into a helper function. This adds a trip through the (very minimal) verification logic in a bunch of places that were missing it, but shouldn't have any other impact outside of refactoring. I'm hoping to use this to do more clever things when DAG nodes are inserted into the graph. llvm-svn: 213612
* [SDAG] Remove a giant pile of asserts that may have helped track downChandler Carruth2014-07-221-40/+3
| | | | | | | | | | | a bug in 2010 when they were added but are adding no value today. In fact, they are utter lies. NodeAllocator is used to allocate almost all of these node types. I don't know what we were trying to assert here, and the docs don't give any answer. Until we once again stumble upon a bug needing help, let's clear the path for improvements. llvm-svn: 213610
* [DAG] Refactor some logic. No functional change.Andrea Di Biagio2014-07-211-0/+21
| | | | | | | | This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'. This refactoring is in preperation of an upcoming change to the DAGCombiner. llvm-svn: 213503
* AArch64: Constant fold converting vector setcc results to float.Jim Grosbach2014-07-181-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can move unary operations, in this case [su]int_to_fp through the mask operation and constant fold the operation away. Generally speaking: UNARYOP(AND(VECTOR_CMP(x,y), constant)) --> AND(VECTOR_CMP(x,y), constant2) where constant2 is UNARYOP(constant). This implements the transform where UNARYOP is [su]int_to_fp. For example, consider the simple function: define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind { %cmp = fcmp oeq <4 x float> %val, %test %ext = zext <4 x i1> %cmp to <4 x i32> %result = sitofp <4 x i32> %ext to <4 x float> ret <4 x float> %result } Before this change, the code is generated as: fcmeq.4s v0, v0, v1 movi.4s v1, #0x1 // Integer splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. scvtf.4s v0, v0 // Convert each lane to f32. ret After, the code is improved to: fcmeq.4s v0, v0, v1 fmov.4s v1, #1.00000000 // f32 splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. ret The svvtf.4s has been constant folded away and the floating point 1.0f vector lanes are materialized directly via fmov.4s. Rather than do the folding manually in the target code, teach getNode() in the generic SelectionDAG to handle folding constant operands of vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do additional constant folding there as well, but I don't have test cases for those operations, so leaving them for another time when it becomes appropriate. rdar://17693791 llvm-svn: 213341
* [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogousChandler Carruth2014-07-101-0/+20
| | | | | | | | | | | | | | | | | | | | to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the the shuffle legalization is scalarizing. Funny thing, I've been working on that. Some initial experiments with this and SSE2 scenarios is showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward. llvm-svn: 212714
* Make it possible for ints/floats to return different values from ↵Daniel Sanders2014-07-101-8/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697
* [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we wereChandler Carruth2014-07-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614
* [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when wideningChandler Carruth2014-07-091-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610
* [SDAG] At the suggestion of Hal, switch to an output parameter thatChandler Carruth2014-07-091-13/+18
| | | | | | | | | | tracks which elements of the build vector are in fact undef. This should make actually inpsecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that *most* users are happy with undefs in their splats. llvm-svn: 212581
* [x86,SDAG] Sink the logic for folding shuffles of splats moreChandler Carruth2014-07-081-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could *make* N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517
* [SDAG] Build up a more rich set of APIs for querying build-vector SDAGChandler Carruth2014-07-081-9/+33
| | | | | | | | | | | | | | | | | | | | nodes about whether they are splats. This is factored out and improved from r212324 which got reverted as it was far too aggressive. The new API should help more conservatively handle buildvectors that are a mixture of splatted and undef values. No functionality change at this point. The hope is to slowly re-introduce the undef-tolerant optimization of splats, but each time being forced to make a concious decision about how to handle the undefs in a way that doesn't lead to contradicting assumptions about the collapsed value. Hal has pointed out in discussions that this may not end up being the desired API and instead it may be more convenient to get a mask of the undef elements or something similar. I'm starting simple and will expand the API as I adapt actual callers and see exactly what they need. llvm-svn: 212514
* [x86] Revert r212324 which was too aggressive w.r.t. allowing undefChandler Carruth2014-07-071-20/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lanes in vector splats. The core problem here is that undef lanes can't *unilaterally* be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is *dramatically* improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475
* [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work forChandler Carruth2014-07-041-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is *dramatically* improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324
* [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.Juergen Ributzka2014-07-011-3/+3
| | | | | | | | The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135
* Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to ↵Craig Topper2014-06-291-1/+1
| | | | | | simplify some code. llvm-svn: 211993
* [AArch64] Fix memset ICE when memset value is f128.Chad Rosier2014-06-271-1/+1
| | | | llvm-svn: 211960
* [AArch64] Fix a build_vector pattern match failKevin Qin2014-06-241-24/+25
| | | | | | caused by defect in isBuildVectorAllZeros(). llvm-svn: 211567
* [ValueTracking] Extend range metadata to call/invokeJingyue Wu2014-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-28/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
* [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG.Andrea Di Biagio2014-06-091-9/+62
| | | | | | | | | | | | | This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions. Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen. Patch by Marcello Maggioni. llvm-svn: 210467
* [SelectionDAG] Force cycle detection in AssignTopologicalOrder before abortingAdam Nemet2014-05-311-9/+18
| | | | | | | | | | | | | | | DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS. However we can run it just before we would crash in order to provide more informative diagnostics. Now in addition to the "Overran sorted position" message we also get the Node printed if a cycle was detected. Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release. Also tried that the AssignTopologicalOrder assert produces the expected results. llvm-svn: 209977
* [SelectionDAG] Pass DAG to checkForCyclesAdam Nemet2014-05-311-10/+12
| | | | | | | | | Pass the DAG down to checkForCycles from all callers where we have it. This allows target-specific nodes to be printed properly. Also print some missing newlines. llvm-svn: 209976
* [pr19636] Fix known bit computation in urem instruction with power of two.Rafael Espindola2014-05-301-2/+5
| | | | | | Patch by Andrey Kuharev. llvm-svn: 209902
* [pr19844] Add thread local mode to aliases.Rafael Espindola2014-05-281-8/+1
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Target: remove old constructors for CallLoweringInfoSaleem Abdulrasool2014-05-171-28/+23
| | | | | | | | | | This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082
* Delete getAliasedGlobal.Rafael Espindola2014-05-161-1/+1
| | | | llvm-svn: 209040
* Instead of littering asserts throughout the code after every call toJay Foad2014-05-151-53/+32
| | | | | | | computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself. llvm-svn: 208876
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has beenJay Foad2014-05-141-44/+43
| | | | | | inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811
* Update the comments for ComputeMaskedBits, which lost its Mask parameterJay Foad2014-05-141-3/+2
| | | | | | in r154011. llvm-svn: 208757
* Use a logical not when inverting SetCC. This unfortunately doesn't fire on ↵Pete Cooper2014-05-121-0/+16
| | | | | | | | | | | | any targets so I couldn't find a test case to trigger it. The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne. This patch introduces getLogicalNOT and updates SetCC legalisation to use it. Reviewed by Hal Finkel. llvm-svn: 208641
* Fix using wrong result type for setcc.Matt Arsenault2014-05-071-0/+8
| | | | | | | | | | | When reducing the bitwidth of a comparison against a constant, the original setcc's result type was used, which was incorrect. No test since I don't think any other in tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. llvm-svn: 208236
* Satisfy GCC's urgent need for parentheses around ‘&&’ within ‘||’.Benjamin Kramer2014-05-021-2/+2
| | | | llvm-svn: 207871
* Allow SelectionDAG::FoldConstantArithmetic to work when it's called with a ↵Benjamin Kramer2014-05-021-2/+8
| | | | | | vector VT but scalar values. llvm-svn: 207835
* Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I ↵Craig Topper2014-04-301-1/+1
| | | | | | introduced most of these recently. llvm-svn: 207616
* Convert more SelectionDAG functions to use ArrayRef.Craig Topper2014-04-281-28/+29
| | | | llvm-svn: 207397
* Convert AddNodeIDNode and SelectionDAG::getNodeIfExiists to use ↵Craig Topper2014-04-271-41/+41
| | | | | | ArrayRef<SDValue> llvm-svn: 207383
* Convert SelectionDAG::MorphNodeTo to use ArrayRef.Craig Topper2014-04-271-9/+9
| | | | llvm-svn: 207378
* Convert SelectionDAG::SelectNodeTo to use ArrayRef.Craig Topper2014-04-271-22/+19
| | | | llvm-svn: 207377
* Convert one last signature of getNode to take an ArrayRef of SDUse.Craig Topper2014-04-271-4/+4
| | | | llvm-svn: 207376
* Convert SDNode constructor to use ArrayRef.Craig Topper2014-04-271-9/+8
| | | | llvm-svn: 207375
* Convert SelectionDAG::getMergeValues to use ArrayRef.Craig Topper2014-04-271-7/+5
| | | | llvm-svn: 207374
* Const-correct SelectionDAG::getAtomic.Craig Topper2014-04-271-2/+2
| | | | llvm-svn: 207373
* SelectionDAG: Aggressively fold shuffles of constant splats.Benjamin Kramer2014-04-271-0/+5
| | | | llvm-svn: 207352
* Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer ↵Craig Topper2014-04-261-7/+7
| | | | | | and size. llvm-svn: 207329
* Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.Craig Topper2014-04-261-47/+38
| | | | llvm-svn: 207327
* Remove an unused version of getMemIntrinsicNode and getNode. Additionally, ↵Craig Topper2014-04-261-20/+0
| | | | | | these were calling makeVTList with the pointers passed in which would were unlikely to belong to SelectionDAG and likely would have just been stack pointers. llvm-svn: 207326
* This reapplies r207235 with an additional bugfixes caught by the msanAdrian Prantl2014-04-251-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269
* Revert "This reapplies r207130 with an additional testcase+and a missing ↵Adrian Prantl2014-04-251-12/+6
| | | | | | | | check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-251-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235
OpenPOWER on IntegriCloud