summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* Move function dependent resetting of a subtarget variable out of theEric Christopher2014-07-041-0/+1
| | | | | | | | | | subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309
* Fix ppcf128 component access on little-endian systemsUlrich Weigand2014-07-033-10/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274
* [x86] Fix the completely broken vector widening legalization of bswap.Chandler Carruth2014-07-031-1/+1
| | | | | | | | | | | | This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257
* [cleanup] Hoist an if-else chain on ISD opcodes (really designed forChandler Carruth2014-07-021-17/+28
| | | | | | | switches) into a switch, and sink them into a dispatch function that can return the result rather than awkward variable setting with breaks. llvm-svn: 212166
* [cleanup] Remove dead 'break;' statements that I meant to nuke inChandler Carruth2014-07-021-2/+0
| | | | | | | | r212158 but missed. Thanks to Craig for spotting the goof! llvm-svn: 212159
* [cleanup] Hoist the promotion dispatch logic into the promote functionChandler Carruth2014-07-021-23/+22
| | | | | | | so that we can use return to express it more cleanly and avoid so many nested switch statements. llvm-svn: 212158
* [cleanup] Nuke the 'VectorOp' bit of the promote method names.Chandler Carruth2014-07-021-9/+9
| | | | | | | This doesn't add any information for methods in the VectorLegalizer class that clearly take SDAG operations to legalize. llvm-svn: 212157
* [x86] Clean up and modernize the doxygen and API comments for the vectorChandler Carruth2014-07-021-24/+38
| | | | | | operation legalization code. llvm-svn: 212155
* [FastISel] Factor out stackmap intrinsic selection code into a dedicated ↵Juergen Ributzka2014-07-011-73/+76
| | | | | | helper method. NFCI. llvm-svn: 212140
* [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.Juergen Ributzka2014-07-016-17/+18
| | | | | | | | The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135
* Fix 'platform-specific' hyphenationsAlp Toker2014-06-302-3/+3
| | | | llvm-svn: 212056
* Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to ↵Craig Topper2014-06-291-1/+1
| | | | | | simplify some code. llvm-svn: 211993
* [AArch64] Fix memset ICE when memset value is f128.Chad Rosier2014-06-271-1/+1
| | | | llvm-svn: 211960
* Revert "Introduce a string_ostream string builder facilty"Alp Toker2014-06-262-2/+4
| | | | | | Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814
* Introduce a string_ostream string builder faciltyAlp Toker2014-06-262-4/+2
| | | | | | | | | | | | | | | | | | | | string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749
* [AArch64] Fix a build_vector pattern match failKevin Qin2014-06-241-24/+25
| | | | | | caused by defect in isBuildVectorAllZeros(). llvm-svn: 211567
* Legalizer: Add support for splitting insert_subvectors.Benjamin Kramer2014-06-212-0/+39
| | | | | | | | | We handle this by spilling the whole thing to the stack and doing the insertion as a store. PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled. llvm-svn: 211435
* [ValueTracking] Extend range metadata to call/invokeJingyue Wu2014-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281
* DAG: move sret demotion into most basic LowerCallTo implementation.Tim Northover2014-06-181-119/+124
| | | | | | | | | | | | | | | | It looks like there are two versions of LowerCallTo here: the SelectionDAGBuilder one is designed to operate on LLVM IR, and the TargetLowering one in the case where everything is at DAG level. Previously, only the SelectionDAGBuilder variant could handle demoting an impossible return to sret semantics (before delegating to the TargetLowering version), but this functionality is also useful for certain libcalls (e.g. 128-bit operations on 32-bit x86). So this commit moves the sret handling down a level. rdar://problem/17242889 llvm-svn: 211155
* SelectionDAG: Expand i64 = FP_TO_SINT i32Tom Stellard2014-06-171-0/+59
| | | | llvm-svn: 211108
* LegalizeDAG: make sure cast is unsigned before using FP_TO_UINT.Tim Northover2014-06-151-1/+4
| | | | | | | | | | | | It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all "unsigned int16" values fit into a "signed int32"), but the reverse isn't true. Unfortunately, I'm not actually aware of any architecture with asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to work in the symmetric case, so I can't actually write a test for this. llvm-svn: 210986
* The hazard recognizer only needs a subtarget, not a target machineEric Christopher2014-06-132-2/+4
| | | | | | so make it take one. Fix up all users accordingly. llvm-svn: 210948
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-136-65/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
* [FastISel][X86] - Add branch weightsJuergen Ributzka2014-06-131-2/+6
| | | | | | | Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). llvm-svn: 210863
* [FastISel][X86] Add MachineMemOperand to load/store instructions.Juergen Ributzka2014-06-121-0/+44
| | | | | | | | This commit adds MachineMemOperands to load and store instructions. This allows the peephole optimizer to fold load instructions. Unfortunatelly the peephole optimizer currently doesn't run at -O0. llvm-svn: 210858
* Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, ↵Tom Stellard2014-06-121-4/+4
| | | | | | | | | y)) for vectors" This reverts commit r210540, adds a testcase for the regression it caused, and marks the R600 test it was supposed to fix as XFAIL. llvm-svn: 210792
* [FastISel] Add support for the stackmap intrinsic.Juergen Ributzka2014-06-121-0/+102
| | | | | | This implements target-independent FastISel lowering for the stackmap intrinsic. llvm-svn: 210742
* Have isInTailCallPosition take the DAG so that we can use theEric Christopher2014-06-101-1/+1
| | | | | | | version of TargetLowering/Machine from there on the way to avoiding TargetMachine in TargetLowering. llvm-svn: 210579
* [FastISel] Collect statistics about failing intrinsic calls.Juergen Ributzka2014-06-101-1/+50
| | | | | | | Add more instruction-specific statistics about failing intrinsic calls during FastISel. llvm-svn: 210556
* SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CCTom Stellard2014-06-101-16/+5
| | | | | | | | | | | | | The SelectionDAG bad a special case for ISD::SELECT_CC, where it would allow targets to specify: setOperationAction(ISD::SELECT_CC, MVT::Other, Expand); to indicate that they wanted to expand ISD::SELECT_CC for all types. This wasn't applied correctly everywhere, and it makes writing new DAG patterns with ISD::SELECT_CC difficult. llvm-svn: 210541
* SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for ↵Tom Stellard2014-06-101-4/+4
| | | | | | | | | | vectors This prevents a future commit from regressing: test/CodeGen/R600/setcc-equivalent.ll llvm-svn: 210540
* SelectionDAG: Expand SELECT_CC to SELECT + SETCCTom Stellard2014-06-101-1/+17
| | | | | | | | This consolidates code from the Hexagon, R600, and XCore targets. No functionality change intended. llvm-svn: 210539
* [X86] Add target combine rules for horizontal add/sub.Andrea Di Biagio2014-06-091-0/+21
| | | | | | | | | | | | | | | | | | | | This patch adds new target specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes. This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule: (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vecto_elt (insert_vector_elt A, I1), I0) This new canonicalization rule only triggers if the inner insert_vector dag node has exactly one use; also, both indices must be known constants, and I1 < I0. This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes. llvm-svn: 210477
* [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG.Andrea Di Biagio2014-06-094-17/+109
| | | | | | | | | | | | | This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions. Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen. Patch by Marcello Maggioni. llvm-svn: 210467
* Have TargetSelectionDAGInfo take a DataLayout initializer rather thanEric Christopher2014-06-061-2/+2
| | | | | | a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366
* [SelectionDAG] Force cycle detection in AssignTopologicalOrder before abortingAdam Nemet2014-05-311-9/+18
| | | | | | | | | | | | | | | DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS. However we can run it just before we would crash in order to provide more informative diagnostics. Now in addition to the "Overran sorted position" message we also get the Node printed if a cycle was detected. Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release. Also tried that the AssignTopologicalOrder assert produces the expected results. llvm-svn: 209977
* [SelectionDAG] Pass DAG to checkForCyclesAdam Nemet2014-05-311-10/+12
| | | | | | | | | Pass the DAG down to checkForCycles from all callers where we have it. This allows target-specific nodes to be printed properly. Also print some missing newlines. llvm-svn: 209976
* [X86] Add two combine rules to simplify dag nodes introduced during type ↵Andrea Di Biagio2014-05-301-0/+21
| | | | | | | | | | | | | | | | | | | | | | | legalization when promoting nodes with illegal vector type. This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type. This patch adds two new combine rules: 1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>) 2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>). Both rules are only triggered on the type-legalized DAG. In particular, rule 1. is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2. is a target independet rule that attempts to move a shuffle immediately after a binary operation. llvm-svn: 209930
* Convert a vselect into a concat_vector if possibleFilipe Cabecinhas2014-05-301-0/+61
| | | | | | | | | | | | | | | | | | | | | Summary: If both vector args to vselect are concat_vectors and the condition is constant and picks half a vector from each argument, convert the vselect into a concat_vectors. Added a test. The ConvertSelectToConcatVector is assuming it doesn't get vselects with arguments of, for example, <undef, undef, true, true>. Those get taken care of in the checks above its call. Reviewers: nadav, delena, grosbach, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3916 llvm-svn: 209929
* [pr19636] Fix known bit computation in urem instruction with power of two.Rafael Espindola2014-05-301-2/+5
| | | | | | Patch by Andrey Kuharev. llvm-svn: 209902
* SelectionDAG: skip barriers for unordered atomic operationsTim Northover2014-05-301-2/+2
| | | | | | | | | Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't. rdar://problem/16548260 llvm-svn: 209901
* Fix an assertion failure caused by v1i64 in DAGCombiner Shrink.Hao Liu2014-05-291-0/+4
| | | | llvm-svn: 209798
* [x86] Fold extract_vector_elt of a load into the Load's address computation.Michael J. Spencer2014-05-291-90/+124
| | | | | | | | An address only use of an extract element of a load can be simplified to a load. Without this the result of the extract element is spilled to the stack so that an address is available. llvm-svn: 209788
* Fix wrong setcc result type when legalizing uaddo/usuboMatt Arsenault2014-05-281-5/+11
| | | | | | | | | No test because no in-tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. Patch by Ke Bai llvm-svn: 209771
* [pr19844] Add thread local mode to aliases.Rafael Espindola2014-05-281-8/+1
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Revert "[DAGCombiner] Split up an indexed load if only the base pointer ↵Hal Finkel2014-05-281-30/+7
| | | | | | | | | | value is live" This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux self-hosting. Because nearly every regression test triggers a segfault, I hope this will be easy to fix. llvm-svn: 209747
* ARM: teach AAPCS-VFP to deal with Cortex-M4.Tim Northover2014-05-271-8/+11
| | | | | | | | | | | Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly! rdar://problem/17012966 llvm-svn: 209650
* Legalizer: Make bswap promotion safe for vectors.Benjamin Kramer2014-05-201-2/+2
| | | | llvm-svn: 209202
* SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the ↵Benjamin Kramer2014-05-191-0/+27
| | | | | | | | | | bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123
* Target: remove old constructors for CallLoweringInfoSaleem Abdulrasool2014-05-176-105/+93
| | | | | | | | | | This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082
OpenPOWER on IntegriCloud