path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Commit message | Author | Age | Files | Lines
...
* Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660
  Dan Gohman | 2011-11-03 | 1 | -0/+16
* Revert r143206, as there are still some failing tests. llvm-svn: 143262
  Dan Gohman | 2011-10-29 | 1 | -16/+0
* Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206
  Dan Gohman | 2011-10-28 | 1 | -0/+16
* Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages:
  Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier.
  Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively.
  Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler.
  Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future.
  This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management.
  Delete #if 0 code accidentally left in. llvm-svn: 143188
  Duncan Sands | 2011-10-28 | 1 | -16/+0
* Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier.
  Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively.
  Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler.
  Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future.
  This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177
  Dan Gohman | 2011-10-28 | 1 | -0/+16
* Rename NonScalarIntSafe to something more appropriate. llvm-svn: 143080
  Lang Hames | 2011-10-26 | 1 | -4/+4
* Fix a bunch of unused variable warnings when doing a release build with gcc-4.6. llvm-svn: 142350
  Duncan Sands | 2011-10-18 | 1 | -0/+1
* Let printf do the formatting instead of aligning strings ourselves. While at it, merge some format strings. llvm-svn: 142140
  Benjamin Kramer | 2011-10-16 | 1 | -1/+1
* Formatting. llvm-svn: 141728
  Eric Christopher | 2011-10-11 | 1 | -2/+1
* When inferring the pointer alignment, if the global doesn't have an initializer and the alignment is 0 (i.e., it's defined globally in one file and declared in another file) it could get an alignment which is larger than the ABI allows for that type, resulting in aligned moves being used for unaligned loads. For instance, in file A.c:
      struct S s;
  In file B.c:
      struct {
        // something long
      };
      extern S s;
      void foo() {
        struct S p = s;
        // ...
      }
  This copy is a 'memcpy' which is turned into a series of 'movaps' instructions on X86. But this is wrong, because 'struct S' has alignment of 4, not 16. llvm-svn: 140902
  Bill Wendling | 2011-09-30 | 1 | -0/+2
* Rename AddSelectionDAGCSEId() to addSelectionDAGCSEId(). Naming conventions consistency. No functional change. llvm-svn: 140636
  Jim Grosbach | 2011-09-27 | 1 | -2/+2
* Cleanup PromoteIntOp_EXTRACT_VECTOR_ELT and PromoteIntRes_SETCC. Add a new method: getAnyExtOrTrunc and use it to replace the manual check. llvm-svn: 140603
  Nadav Rotem | 2011-09-27 | 1 | -0/+6
* Add vselect target support for targets that do not support blend but do support xor/and/or (for example, SSE2). llvm-svn: 139623
  Nadav Rotem | 2011-09-13 | 1 | -0/+4
* Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407
  Eli Friedman | 2011-09-09 | 1 | -1/+4
* Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic load/store on ARM. (The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.) llvm-svn: 139221
  Eli Friedman | 2011-09-07 | 1 | -2/+17
* Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159
  Duncan Sands | 2011-09-06 | 1 | -8/+20
* Split the init.trampoline intrinsic, which currently combines GCC's init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140
  Duncan Sands | 2011-09-06 | 1 | -1/+2
* Basic x86 code generation for atomic load and store instructions. llvm-svn: 138478
  Eli Friedman | 2011-08-24 | 1 | -3/+58
* Revert r137562 because it caused PR10674. llvm-svn: 137719
  Nadav Rotem | 2011-08-16 | 1 | -7/+0
* Fix PR 10635. When generating integer constants, the constant element type may be illegal, even if the requested vector type is legal. Testcase is one of the disabled ARM tests in the vector-select patch. llvm-svn: 137562
  Nadav Rotem | 2011-08-13 | 1 | -0/+7
* Don't create a ridiculous EXTRACT_ELEMENT. PR10563. The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this. llvm-svn: 136711
  Eli Friedman | 2011-08-02 | 1 | -0/+1
* Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457
  Eli Friedman | 2011-07-29 | 1 | -8/+20
* Code generation for 'fence' instruction. llvm-svn: 136283
  Eli Friedman | 2011-07-27 | 1 | -0/+1
* Add APInt(numBits, ArrayRef<uint64_t> bigVal) constructor to prevent future ambiguity errors like the one corrected by r135261. Migrate all LLVM callers of the old constructor to the new one. llvm-svn: 135431
  Jeffrey Yasskin | 2011-07-18 | 1 | -1/+1
* Land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375
  Chris Lattner | 2011-07-18 | 1 | -9/+9
* Add assertion for the chain value type. llvm-svn: 135143
  Nadav Rotem | 2011-07-14 | 1 | -0/+10
* Add an intrinsic and codegen support for fused multiply-accumulate. The intent is to use this for architectures that have a native FMA instruction. llvm-svn: 134742
  Cameron Zwarich | 2011-07-08 | 1 | -0/+1
* Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The hasPredecessorHelper function allows predecessors to be cached to speed up repeated invocations. This fixes PR10186.
  X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X).
  Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with empty Visited and Worklist sets (i.e. no caching over invocations).
  Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited and Worklist to speed up repeated calls. The Visited set is searched for X before going to the worklist to further search the DAG if necessary. llvm-svn: 134592
  Lang Hames | 2011-07-07 | 1 | -15/+30
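The caching scheme described in this commit message can be sketched roughly as follows. This is a minimal illustration on a toy graph, not the LLVM implementation: the `Node` struct and the free-function signatures are assumptions made for the example (in LLVM these are SDNode member functions using LLVM's own container types). The caller is assumed to reuse one Visited/Worklist pair only for queries against the same node Y.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SDNode: only operand edges are modeled.
struct Node {
  std::vector<const Node *> Operands;
};

// Returns true if X is reachable from Y through operand edges.
// Visited and Worklist persist between calls, so repeated queries
// against the same Y resume the search instead of restarting it.
bool hasPredecessorHelper(const Node *Y, const Node *X,
                          std::unordered_set<const Node *> &Visited,
                          std::vector<const Node *> &Worklist) {
  assert(Y && X && "both nodes must be non-null");
  if (Visited.empty()) {
    // First query with this state: seed the search at Y.
    Worklist.push_back(Y);
  } else if (Visited.count(X)) {
    // X was already reached by an earlier query: answer from the cache.
    return true;
  }
  // X has not been seen yet: continue the search where it left off.
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    bool Found = false;
    for (const Node *Op : N->Operands) {
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);
      if (Op == X)
        Found = true;
    }
    // Return only after all operands of N are queued, so the cached
    // state stays complete for later queries.
    if (Found)
      return true;
  }
  return false;
}

// Uncached query: fresh, empty search state on every call.
bool hasPredecessor(const Node *Y, const Node *X) {
  std::unordered_set<const Node *> Visited;
  std::vector<const Node *> Worklist;
  return hasPredecessorHelper(Y, X, Visited, Worklist);
}

// X.isPredecessorOf(Y) simply forwards to Y.hasPredecessor(X).
bool isPredecessorOf(const Node *X, const Node *Y) {
  return hasPredecessor(Y, X);
}
```

A caller that needs to test many candidate predecessors of one node would keep a single Visited/Worklist pair alive across the calls, so each DAG node is expanded at most once in total.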
* Grammar and 80-col. llvm-svn: 134555
  Eric Christopher | 2011-07-06 | 1 | -7/+7
* Don't allocate empty read-only SmallVectors during SelectionDAG deallocation. llvm-svn: 133348
  Benjamin Kramer | 2011-06-18 | 1 | -3/+3
* Remove dead code. llvm-svn: 131974
  Devang Patel | 2011-05-24 | 1 | -10/+0
* - Teach SelectionDAG::isKnownNeverZero to return true for (op x, c) when c is non-zero.
  - Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero.
  rdar://9490949 llvm-svn: 131948
  Evan Cheng | 2011-05-24 | 1 | -5/+13
* Revert 121907 (it causes llc crash) and apply original patch from PR9817. llvm-svn: 131926
  Devang Patel | 2011-05-23 | 1 | -3/+0
* While replacing all uses of a SDValue with another value, do not forget to transfer SDDbgValue. llvm-svn: 131907
  Devang Patel | 2011-05-23 | 1 | -0/+3
* Other parts of the SelectionDAG framework assume that targets use their pointer type for vector indices. Make the vector unrolling code respect that. llvm-svn: 130733
  Owen Anderson | 2011-05-02 | 1 | -1/+1
* sext(undef) = 0, because the top bits will all be the same. zext(undef) = 0, because the top bits will be zero. llvm-svn: 127649
  Evan Cheng | 2011-03-15 | 1 | -1/+5
* BIT_CONVERT has been renamed to BITCAST. llvm-svn: 127600
  Evan Cheng | 2011-03-14 | 1 | -1/+1
* Minor optimization. sign-ext/anyext of undef is still undef. llvm-svn: 127598
  Evan Cheng | 2011-03-14 | 1 | -0/+4
* Use the correct LHS type when determining the legalization of a shift's RHS type. llvm-svn: 127163
  Owen Anderson | 2011-03-07 | 1 | -3/+4
* Avoid exponential blow-up when printing DAGs. David Greene changed CannotYetSelect() to print the full DAG including multiple copies of operands reached through different paths in the DAG. Unfortunately this blows up exponentially in some cases. The depth limit of 100 is way too high to prevent this -- I'm seeing a message string of 150MB with a depth of only 40 in one particularly bad case, even though the DAG has less than 200 nodes. Part of the problem is that the printing code is following chain operands, so if you fail to select an operation with a chain, the printer will follow all the chained operations back to the entry node. llvm-svn: 126899
  Bob Wilson | 2011-03-02 | 1 | -2/+5
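The blow-up described here comes from printing a DAG as if it were a tree: a shared operand is printed once per path that reaches it. The sketch below counts how many node visits such a printer would make on a toy graph. The `Node` struct, `treePrintVisits`, and `buildLadder` are hypothetical names for this illustration only and do not correspond to LLVM's printing code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a DAG node: only operand edges are modeled.
struct Node {
  std::vector<const Node *> Operands;
};

// Counts the nodes a tree-style printer would emit when it re-prints
// shared operands once per path, stopping after MaxDepth levels.
uint64_t treePrintVisits(const Node *N, unsigned MaxDepth) {
  assert(N && "node must be non-null");
  if (MaxDepth == 0)
    return 0;
  uint64_t Visits = 1;
  for (const Node *Op : N->Operands)
    Visits += treePrintVisits(Op, MaxDepth - 1);
  return Visits;
}

// Builds a chain of Levels nodes in which every node uses the next
// node twice (as in "add x, x"), so the number of paths doubles at
// each level while the node count grows only linearly.
std::vector<Node> buildLadder(unsigned Levels) {
  std::vector<Node> Nodes(Levels);
  for (unsigned I = 0; I + 1 < Levels; ++I)
    Nodes[I].Operands = {&Nodes[I + 1], &Nodes[I + 1]};
  return Nodes;
}
```

On a 10-node ladder a full-depth tree print makes 2^10 - 1 visits, which is why the commit message notes that a depth limit of 100 is far too high to bound the output size.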
* Allow targets to specify the type of the RHS of a shift parameterized on the type of the LHS. llvm-svn: 126518
  Owen Anderson | 2011-02-25 | 1 | -6/+6
* Add a getNumSignBits() method to APInt. llvm-svn: 126379
  Cameron Zwarich | 2011-02-24 | 1 | -6/+1
* Do not lose debug info of an inlined function argument even if the argument is only used through GEPs. This time with a fix that avoids using invalidated DenseMap iterator. llvm-svn: 125984
  Devang Patel | 2011-02-18 | 1 | -5/+11
* Roll out r125794 to help diagnose the llvm-gcc-i386-linux-selfhost failure. llvm-svn: 125830
  Cameron Zwarich | 2011-02-18 | 1 | -6/+4
* Do not lose debug info of an inlined function argument even if the argument is only used through GEPs. llvm-svn: 125794
  Devang Patel | 2011-02-17 | 1 | -4/+6
* Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665
  Stuart Hastings | 2011-02-16 | 1 | -2/+2
* Fix two comment thinkos. llvm-svn: 125481
  Chris Lattner | 2011-02-14 | 1 | -1/+1
* Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470
  Chris Lattner | 2011-02-13 | 1 | -8/+35
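The reason FI+cst can be rewritten as FI|cst is that adding a constant to a value produces no carries when every bit set in the constant is known to be zero in the value, so OR and ADD give the same result. The following is a minimal sketch of that bit-level condition on plain integers. The helper names `knownZeroForAlignment` and `orActsAsAdd` are hypothetical and are not the LLVM API; the real predicate operates on SelectionDAG nodes.

```cpp
#include <cassert>
#include <cstdint>

// Mask of low bits known to be zero in any address aligned to Align
// bytes. Align is assumed to be a nonzero power of two.
uint64_t knownZeroForAlignment(uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  return Align - 1;
}

// True if OR(X, C) equals ADD(X, C) for every X whose KnownZero bits
// are clear, i.e. C only sets bits that are known to be zero in X.
bool orActsAsAdd(uint64_t KnownZero, uint64_t C) {
  return (C & ~KnownZero) == 0;
}
```

For a 16-byte-aligned base, any offset below 16 satisfies the condition, so code that looks for a base plus a constant offset has to accept the OR form as well as the ADD form.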
* Revisit my fix for PR9028: the issue is that DAGCombine was generating i8 shift amounts for things like i1024 types. Add an assert in getNode to prevent this from occurring in the future, fix the buggy transformation, revert my previous patch, and document this gotcha in ISDOpcodes.h. llvm-svn: 125465
  Chris Lattner | 2011-02-13 | 1 | -0/+7
* Remove comment about an argument that was removed couple of years ago. llvm-svn: 125054
  Devang Patel | 2011-02-07 | 1 | -1/+0