summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Update DebugLoc while merging nodes at -O0.Devang Patel2011-12-151-5/+20
| | | | | | Patch by Kyriakos Georgiou! llvm-svn: 146670
* Add missing cases to SDNode::getOperationName(). Patch by Micah Villmow.Eli Friedman2011-12-141-0/+5
| | | | llvm-svn: 146548
* Initial CodeGen support for CTTZ/CTLZ where a zero input produces anChandler Carruth2011-12-131-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is *conservatively correct*, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466
* Move global variables in TargetMachine into new TargetOptions class. As an APINick Lewycky2011-12-021-4/+4
| | | | | | | | | | | | change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714
* Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of ↵Eli Friedman2011-11-281-18/+9
| | | | | | duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do. llvm-svn: 145304
* Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.Evan Cheng2011-11-281-1/+5
| | | | | | | | | | | Conservatively returns zero when the GV does not specify an alignment nor is it initialized. Previously it returns ABI alignment for type of the GV. However, if the type is a "packed" type, then the under-specified alignments is attached to the load / store instructions. In that case, the alignment of the type cannot be trusted. rdar://10464621 llvm-svn: 145300
* Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.Bill Wendling2011-11-281-1/+0
| | | | llvm-svn: 145263
* Remove some unnecessary includes of PseudoSourceValue.h.Jay Foad2011-11-151-1/+0
| | | | llvm-svn: 144634
* Added invariant field to the DAG.getLoad method and changed all calls.Pete Cooper2011-11-081-14/+20
| | | | | | When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses llvm-svn: 144100
* Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization ↵Eli Friedman2011-11-081-0/+6
| | | | | | | | doesn't get confused by CSE later on. Fixes PR11318. Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up. llvm-svn: 144055
* Reapply r143206, with fixes. Disallow physical register lifetimesDan Gohman2011-11-031-0/+16
| | | | | | | across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660
* Revert r143206, as there are still some failing tests.Dan Gohman2011-10-291-16/+0
| | | | llvm-svn: 143262
* Reapply r143177 and r143179 (reverting r143188), with schedulerDan Gohman2011-10-281-0/+16
| | | | | | | | | fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206
* Speculatively disable Dan's commits 143177 and 143179 to see ifDuncan Sands2011-10-281-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188
* Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUWDan Gohman2011-10-281-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177
* Rename NonScalarIntSafe to something more appropriate.Lang Hames2011-10-261-4/+4
| | | | llvm-svn: 143080
* Fix a bunch of unused variable warnings when doing a releaseDuncan Sands2011-10-181-0/+1
| | | | | | build with gcc-4.6. llvm-svn: 142350
* Let printf do the formatting instead aligning strings ourselves.Benjamin Kramer2011-10-161-1/+1
| | | | | | While at it, merge some format strings. llvm-svn: 142140
* Formatting.Eric Christopher2011-10-111-2/+1
| | | | llvm-svn: 141728
* When inferring the pointer alignment, if the global doesn't have an initializerBill Wendling2011-09-301-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | and the alignment is 0 (i.e., it's defined globally in one file and declared in another file) it could get an alignment which is larger than the ABI allows for that type, resulting in aligned moves being used for unaligned loads. For instance, in file A.c: struct S s; In file B.c: struct { // something long }; extern S s; void foo() { struct S p = s; // ... } this copy is a 'memcpy' which is turned into a series of 'movaps' instructions on X86. But this is wrong, because 'struct S' has alignment of 4, not 16. llvm-svn: 140902
* Rename AddSelectionDAGCSEId() to addSelectionDAGCSEId().Jim Grosbach2011-09-271-2/+2
| | | | | | Naming conventions consistency. No functional change. llvm-svn: 140636
* Cleanup PromoteIntOp_EXTRACT_VECTOR_ELT and PromoteIntRes_SETCC.Nadav Rotem2011-09-271-0/+6
| | | | | | Add a new method: getAnyExtOrTrunc and use it to replace the manual check. llvm-svn: 140603
* Add vselect target support for targets that do not support blend but do supportNadav Rotem2011-09-131-0/+4
| | | | | | xor/and/or (For example SSE2). llvm-svn: 139623
* Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the ↵Eli Friedman2011-09-091-1/+4
| | | | | | same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407
* Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs ↵Eli Friedman2011-09-071-2/+17
| | | | | | | | failures for atomic laod/store on ARM. (The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.) llvm-svn: 139221
* Add codegen support for vector select (in the IR this means a selectDuncan Sands2011-09-061-8/+20
| | | | | | | | | | | | with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159
* Split the init.trampoline intrinsic, which currently combines GCC'sDuncan Sands2011-09-061-1/+2
| | | | | | | | | | | | | | | | | | | | init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140
* Basic x86 code generation for atomic load and store instructions.Eli Friedman2011-08-241-3/+58
| | | | llvm-svn: 138478
* Revert r137562 because it caused PR10674Nadav Rotem2011-08-161-7/+0
| | | | llvm-svn: 137719
* Fix PR 10635. When generating integer constants, the constant element type mayNadav Rotem2011-08-131-0/+7
| | | | | | | be illegal, even if the requested vector type is legal. Testcase is one of the disabled ARM tests in the vector-select patch. llvm-svn: 137562
* Don't create a ridiculous EXTRACT_ELEMENT. PR10563.Eli Friedman2011-08-021-0/+1
| | | | | | The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this. llvm-svn: 136711
* Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to beEli Friedman2011-07-291-8/+20
| | | | | | | | | working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457
* Code generation for 'fence' instruction.Eli Friedman2011-07-271-0/+1
| | | | llvm-svn: 136283
* Add APInt(numBits, ArrayRef<uint64_t> bigVal) constructor to prevent future ↵Jeffrey Yasskin2011-07-181-1/+1
| | | | | | | | | ambiguity errors like the one corrected by r135261. Migrate all LLVM callers of the old constructor to the new one. llvm-svn: 135431
* land David Blaikie's patch to de-constify Type, with a few tweaks.Chris Lattner2011-07-181-9/+9
| | | | llvm-svn: 135375
* Add assertion for the chain value typeNadav Rotem2011-07-141-0/+10
| | | | llvm-svn: 135143
* Add an intrinsic and codegen support for fused multiply-accumulate. The intentCameron Zwarich2011-07-081-0/+1
| | | | | | is to use this for architectures that have a native FMA instruction. llvm-svn: 134742
* Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. TheLang Hames2011-07-071-15/+30
| | | | | | | | | | | | | | | | hasPredecessorHelper function allows predecessors to be cached to speed up repeated invocations. This fixes PR10186. X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X) Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with empty Visited and Worklist sets (i.e. no caching over invocations). Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited and Worklist to speed up repeated calls. The Visited set is searched for X before going to the worklist to further search the DAG if necessary. llvm-svn: 134592
* Grammar and 80-col.Eric Christopher2011-07-061-7/+7
| | | | llvm-svn: 134555
* Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.Benjamin Kramer2011-06-181-3/+3
| | | | llvm-svn: 133348
* Remove dead code.Devang Patel2011-05-241-10/+0
| | | | llvm-svn: 131974
* - Teach SelectionDAG::isKnownNeverZero to return true (op x, c) when c isEvan Cheng2011-05-241-5/+13
| | | | | | | | | | non-zero. - Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero. rdar://9490949 llvm-svn: 131948
* Revert 121907 (it causes llc crash) and apply original patch from PR9817.Devang Patel2011-05-231-3/+0
| | | | llvm-svn: 131926
* While replacing all uses of a SDValue with another value, do not forget to ↵Devang Patel2011-05-231-0/+3
| | | | | | transfer SDDbgValue. llvm-svn: 131907
* Other parts of the SelectionDAG framework assume that targets use their ↵Owen Anderson2011-05-021-1/+1
| | | | | | pointer type for vector indices. Make the vector unrolling code respect that. llvm-svn: 130733
* sext(undef) = 0, because the top bits will all be the same.Evan Cheng2011-03-151-1/+5
| | | | | | zext(undef) = 0, because the top bits will be zero. llvm-svn: 127649
* BIT_CONVERT has been renamed to BITCAST.Evan Cheng2011-03-141-1/+1
| | | | llvm-svn: 127600
* Minor optimization. sign-ext/anyext of undef is still undef.Evan Cheng2011-03-141-0/+4
| | | | llvm-svn: 127598
* Use the correct LHS type when determining the legalization of a shift's RHS ↵Owen Anderson2011-03-071-3/+4
| | | | | | type. llvm-svn: 127163
* Avoid exponential blow-up when printing DAGs.Bob Wilson2011-03-021-2/+5
| | | | | | | | | | | | | David Greene changed CannotYetSelect() to print the full DAG including multiple copies of operands reached through different paths in the DAG. Unfortunately this blows up exponentially in some cases. The depth limit of 100 is way too high to prevent this -- I'm seeing a message string of 150MB with a depth of only 40 in one particularly bad case, even though the DAG has less than 200 nodes. Part of the problem is that the printing code is following chain operands, so if you fail to select an operation with a chain, the printer will follow all the chained operations back to the entry node. llvm-svn: 126899
OpenPOWER on IntegriCloud