path: root/llvm/lib/CodeGen
Commit message  Author  Age  Files  Lines
...
* Hook into PassManager's analysis verification.Jakob Stoklund Olesen2012-07-303-7/+4
| | | | | | | By overriding Pass::verifyAnalysis(), the pass contents will be verified by the pass manager. llvm-svn: 160994
* Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd. Pete Cooper 2012-07-30 1 -0/+22
| | | | llvm-svn: 160987
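The idea in this commit can be illustrated with a small standalone sketch. This is not the SelectionDAG code: `LoadKey`, `LoadKeyHash`, and `CSEMap` are hypothetical names, and the address space numbers used below are arbitrary. The point is only that the address space takes part in both the hash and the equality test, so two loads of the same address in different address spaces never collapse into one node.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for the identity of a load node (not llvm::SDNode).
struct LoadKey {
  std::uint64_t Address;
  unsigned AddrSpace;
  bool operator==(const LoadKey &O) const {
    return Address == O.Address && AddrSpace == O.AddrSpace;
  }
};

struct LoadKeyHash {
  std::size_t operator()(const LoadKey &K) const {
    std::size_t H = std::hash<std::uint64_t>()(K.Address);
    // Mix the address space into the hash so it contributes to node identity.
    return H ^ (std::hash<unsigned>()(K.AddrSpace) + 0x9e3779b9u + (H << 6) +
                (H >> 2));
  }
};

// Toy CSE table: returns the id of an existing identical node, or a new id.
class CSEMap {
  std::unordered_map<LoadKey, int, LoadKeyHash> Nodes;
  int NextId = 0;

public:
  int getOrCreate(const LoadKey &K) {
    auto It = Nodes.find(K);
    if (It != Nodes.end())
      return It->second;
    int Id = NextId++;
    Nodes.emplace(K, Id);
    return Id;
  }
};
```

If `AddrSpace` were dropped from `operator==` and the hash, the second lookup of the same address would return the first node, which is the miscompile the commit describes.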
* Add MachineInstr::isTransient().Jakob Stoklund Olesen2012-07-301-23/+1
| | | | | | | | | | | This is a cleaned up version of the isFree() function in MachineTraceMetrics.cpp. Transient instructions are very unlikely to produce any code in the final output. Either because they get eliminated by RegisterCoalescing, or because they are pseudo-instructions like labels and debug values. llvm-svn: 160977
* Add MachineTraceMetrics::verify().Jakob Stoklund Olesen2012-07-303-11/+55
| | | | | | | This function verifies the consistency of cached data in the MachineTraceMetrics analysis. llvm-svn: 160976
* Verify that the CFG hasn't changed during invalidate().Jakob Stoklund Olesen2012-07-301-2/+12
| | | | | | | The MachineTraceMetrics analysis must be invalidated before modifying the CFG. This will catch some of the violations of that rule. llvm-svn: 160969
* Add MachineBasicBlock::isPredecessor().Jakob Stoklund Olesen2012-07-301-2/+5
| | | | | | | | A->isPredecessor(B) is the same as B->isSuccessor(A), but it can tolerate a B that is null or dangling. This shouldn't happen normally, but it is useful for verification code. llvm-svn: 160968
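A minimal sketch of why the predecessor form tolerates a null argument, assuming a toy `Block` type with explicit edge lists (hypothetical, not `MachineBasicBlock`): the query only compares the pointer against A's own predecessor list and never dereferences B.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy CFG node; the names are illustrative only.
struct Block {
  std::vector<const Block *> Preds, Succs;

  // Searches this block's own successor list.
  bool isSuccessor(const Block *B) const {
    return std::find(Succs.begin(), Succs.end(), B) != Succs.end();
  }

  // Searches this block's own predecessor list. B is compared by address
  // only, so a null B is handled without being dereferenced.
  bool isPredecessor(const Block *B) const {
    return std::find(Preds.begin(), Preds.end(), B) != Preds.end();
  }
};

// Records the edge From -> To in both blocks.
void addEdge(Block &From, Block &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}
```

With an edge B -> A, both `A.isPredecessor(&B)` and `B.isSuccessor(&A)` hold, while `A.isPredecessor(nullptr)` simply returns false.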
* Revert r160920 and r160919 due to dragonegg and clang selfhost failureManman Ren2012-07-291-22/+0
| | | | llvm-svn: 160927
* X86 Peephole: fold loads to the source register operand if possible.Manman Ren2012-07-281-0/+22
| | | | | | | | | Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919
* Reenable a basic SSA DAG builder optimization.Andrew Trick2012-07-281-5/+4
| | | | | | Jakob fixed ProcessImplicitDefs in r159149. llvm-svn: 160910
* Add more debug output to MachineTraceMetrics.Jakob Stoklund Olesen2012-07-273-3/+48
| | | | llvm-svn: 160905
* Keep track of the head and tail of the trace through each block.Jakob Stoklund Olesen2012-07-272-4/+18
| | | | | | | This makes it possible to quickly detect blocks that are outside the trace. llvm-svn: 160904
* Add a DW_AT_high_pc for CUs that are a single address range. UpdateEric Christopher2012-07-272-6/+24
| | | | | | | | | | all tests accordingly. Fixes PR13351. Patch by shinichiro hamaji! llvm-svn: 160899
* Also compute register mask lists under -new-live-intervals.Jakob Stoklund Olesen2012-07-271-8/+34
| | | | llvm-svn: 160898
* Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().Jakob Stoklund Olesen2012-07-274-8/+2
| | | | | | | | A value number is a PHI def if and only if it begins at a block boundary. This can be derived from the def slot, a separate flag is not necessary. llvm-svn: 160893
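The reasoning here, that a stored flag is redundant when the property follows from the def slot, can be sketched as below. The types are simplified stand-ins and the slot names are assumptions chosen for illustration; this is not the real `SlotIndex` or `VNInfo` interface.

```cpp
#include <cassert>

// Simplified slot kinds within one index position (illustrative names).
enum class Slot { Block, EarlyClobber, Register, Dead };

// Toy slot index: a position number plus the slot kind at that position.
struct SlotIndex {
  unsigned Position;
  Slot Kind;
  bool isBlock() const { return Kind == Slot::Block; }
};

// Toy value number: only the def slot is stored, no separate PHI flag.
struct ValueNumber {
  SlotIndex Def;
  // A value is a PHI def exactly when it begins at a block boundary.
  bool isPHIDef() const { return Def.isBlock(); }
};
```

Deriving the answer on demand removes the risk of the flag and the def slot disagreeing after an update.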
* Add a -new-live-intervals experimental option.Jakob Stoklund Olesen2012-07-271-1/+36
| | | | | | | | | This option replaces the existing live interval computation with one based on LiveRangeCalc.cpp. The new algorithm does not depend on LiveVariables, and it can be run at any time, before or after leaving SSA form. llvm-svn: 160892
* Add <imp-def> of super-register when lowering SUBREG_TO_REG.Jakob Stoklund Olesen2012-07-271-4/+7
| | | | | | Patch by Tyler Nowicki! llvm-svn: 160888
* Use an otherwise unused variable.Jakob Stoklund Olesen2012-07-261-1/+1
| | | | llvm-svn: 160798
* Start scaffolding for a MachineTraceMetrics analysis pass.Jakob Stoklund Olesen2012-07-264-1/+726
| | | | This is still a work in progress.
| | | |
| | | | Out-of-order CPUs usually execute instructions from multiple basic blocks simultaneously, so it is necessary to look at longer traces when estimating the performance effects of code transformations.
| | | |
| | | | The MachineTraceMetrics analysis will pick a typical trace through a given basic block and provide performance metrics for the trace. Metrics will include:
| | | |
| | | | - Instruction count through the trace.
| | | | - Issue count per functional unit.
| | | | - Critical path length, and per-instruction 'slack'.
| | | |
| | | | These metrics can be used to determine the performance limiting factor when executing the trace, and how it will be affected by a code transformation.
| | | |
| | | | Initially, this will be used by the early if-conversion pass.
| | | |
| | | | llvm-svn: 160796
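The critical path and slack metrics named in this commit can be illustrated with a small standalone sketch. It is not the MachineTraceMetrics implementation: `Instr`, `TraceInfo`, and `analyzeTrace` are hypothetical names, and the model assumes a straight-line trace where each instruction has a fixed latency and lists the earlier instructions it depends on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction model: a latency in cycles plus the indices of
// earlier instructions in the trace whose results it consumes.
struct Instr {
  unsigned Latency;
  std::vector<std::size_t> Deps;
};

struct TraceInfo {
  unsigned CriticalPath = 0;   // Length of the longest dependency chain.
  std::vector<unsigned> Slack; // Cycles each instruction can slip for free.
};

TraceInfo analyzeTrace(const std::vector<Instr> &Trace) {
  const std::size_t N = Trace.size();
  std::vector<unsigned> Depth(N, 0), Height(N, 0);

  // Depth: earliest cycle at which all operands of an instruction are ready.
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t D : Trace[I].Deps)
      Depth[I] = std::max(Depth[I], Depth[D] + Trace[D].Latency);

  // Height: cycles from issuing an instruction until every chain that
  // depends on it has finished, including its own latency.
  for (std::size_t I = N; I-- > 0;) {
    Height[I] = std::max(Height[I], Trace[I].Latency);
    for (std::size_t D : Trace[I].Deps)
      Height[D] = std::max(Height[D], Trace[D].Latency + Height[I]);
  }

  TraceInfo Info;
  Info.Slack.resize(N);
  for (std::size_t I = 0; I < N; ++I)
    Info.CriticalPath = std::max(Info.CriticalPath, Depth[I] + Height[I]);
  for (std::size_t I = 0; I < N; ++I)
    Info.Slack[I] = Info.CriticalPath - (Depth[I] + Height[I]);
  return Info;
}
```

Instructions with zero slack lie on the critical path; a transformation that lengthens only instructions with positive slack would not, under this simple model, lengthen the trace.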
* Add a floor intrinsic.Dan Gohman2012-07-261-0/+5
| | | | llvm-svn: 160791
* Disable rematerialization in TwoAddressInstructionPass.Manman Ren2012-07-251-78/+6
| | | | | | | | | | | It is redundant; RegisterCoalescer will do the remat if it can't eliminate the copy. Collected instruction counts before and after this. A few extra instructions are generated due to spilling but it is normal to see these kinds of changes with almost any small codegen change, according to Jakob. This also fixed rdar://11830760 where xor is expected instead of movi0. llvm-svn: 160749
* Preserve 2-addr constraints in ConnectedVNInfoEqClasses.Jakob Stoklund Olesen2012-07-251-7/+4
| | | | When a live range splits into multiple connected components, we would arbitrarily assign <undef> uses to component 0. This is wrong when the use is tied to a def that gets assigned to a different component:
| | | |
| | | |   %vreg69<def> = ADD8ri %vreg68<undef>, 1
| | | |
| | | | The use and def must get the same virtual register. Fix this by assigning <undef> uses to the same component as the value defined by the instruction, if any:
| | | |
| | | |   %vreg69<def> = ADD8ri %vreg69<undef>, 1
| | | |
| | | | This fixes PR13402. The PR has a test case which I am not including because it is unlikely to keep exposing this behavior in the future.
| | | |
| | | | llvm-svn: 160739
* Verify two-address constraints more carefully.Jakob Stoklund Olesen2012-07-251-14/+7
| | | | | | Include <undef> operands and virtual registers after leaving SSA form. llvm-svn: 160734
* Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. Craig Topper 2012-07-24 1 -1/+3
| | | | llvm-svn: 160661
* Fix a typo (the the => the)Sylvestre Ledru2012-07-232-2/+2
| | | | llvm-svn: 160621
* Fixed DAGCombine optimizations which generate select_cc for targetsNadav Rotem2012-07-231-33/+47
| | | | | | | | | | that do not support it (X86 does not lower select_cc). PR: 13428 Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160619
* Tidy up. Fix indentation and remove trailing whitespace.Craig Topper2012-07-231-16/+14
| | | | llvm-svn: 160617
* Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. Craig Topper 2012-07-23 1 -1/+2
| | | | For instance 256-bit AVX intrinsics without having AVX enabled. llvm-svn: 160616
* Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field. Benjamin Kramer 2012-07-20 3 -16/+2
| | | | llvm-svn: 160583
* Avoid folding loads that are unsafe to move.Jakob Stoklund Olesen2012-07-201-0/+13
| | | | | | | | | | LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load into its only use. Only do that when the load is safe to move, and it won't extend any live ranges. This fixes PR13414. llvm-svn: 160575
* Split loop exiting edges more aggressively.Jakob Stoklund Olesen2012-07-201-13/+56
| | | | PHIElimination splits critical edges when it predicts it can resolve interference and eliminate copies. It doesn't split the edge if the interference wouldn't be resolved because the phi-use register is live in the critical edge anyway.
| | | |
| | | | Teach PHIElimination to split loop exiting edges with interference, even if it wouldn't resolve the interference. This moves the necessary copies out of the loop, which is still an improvement over injecting the copies into the loop.
| | | |
| | | | The test case demonstrates the improvement. Before:
| | | |
| | | |   LBB0_1:
| | | |     cmpb $0, (%rdx)
| | | |     leaq 1(%rdx), %rdx
| | | |     movl %esi, %eax
| | | |     je LBB0_1
| | | |
| | | | After:
| | | |
| | | |   LBB0_1:
| | | |     cmpb $0, (%rdx)
| | | |     leaq 1(%rdx), %rdx
| | | |     je LBB0_1
| | | |     movl %esi, %eax
| | | |
| | | | llvm-svn: 160571
* Fix crash in machine verifier when trying to print the def of a register which has no def. Pete Cooper 2012-07-19 1 -0/+2
| | | | llvm-svn: 160531
* Replace some explicit compare loops with std::equal.Benjamin Kramer2012-07-191-7/+1
| | | | | | No functionality change. llvm-svn: 160501
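As an illustration of this kind of cleanup (the commit's actual loops are not shown here, so both functions below are hypothetical), an explicit element-by-element comparison and its `std::equal` replacement look like this. The three-iterator overload is used, so the sizes are checked first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Explicit compare loop, in the style the commit replaces.
bool sameWithLoop(const std::vector<unsigned> &A,
                  const std::vector<unsigned> &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0, E = A.size(); I != E; ++I)
    if (A[I] != B[I])
      return false;
  return true;
}

// Same result with std::equal. The size check is required because the
// three-iterator overload assumes the second range is long enough.
bool sameWithStdEqual(const std::vector<unsigned> &A,
                      const std::vector<unsigned> &B) {
  return A.size() == B.size() && std::equal(A.begin(), A.end(), B.begin());
}
```

Both functions return the same answer for every pair of inputs, which matches the "no functionality change" note in the commit.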
* Fixed a few warnings. Galina Kistanova 2012-07-19 1 -1/+1
| | | | llvm-svn: 160493
* Remove tabs.Bill Wendling2012-07-198-28/+31
| | | | llvm-svn: 160475
* Fix a somewhat nasty crasher in PR13378. This crashes inside ofChandler Carruth2012-07-181-22/+32
| | | | LiveIntervals due to the two-addr pass generating bogus MI code.
| | | |
| | | | The crux of the issue was a loop nesting problem. The intent of the code which attempts to transform instructions before converting them to two-addr form is to defer and reprocess any transformed instructions as the second processing is likely to have more opportunities to coalesce copies, etc. Unfortunately, there was one section of processing that was not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this rewriting proceeded, not only did it occur early, it removed the bits of information needed for the deferred processing to correctly generate the necessary two address form (specifically inserting a copy), but didn't trigger any immediate assertions and produced what appeared to be already valid two-address form code. Thus, the assertion only fired much later in the pipeline.
| | | |
| | | | The fix is to hoist the transformation logic up a layer to where it can more firmly defer all further processing, and to teach the normal processing to handle an edge case previously handled as part of the transformation logic. This edge case (already matched tied register operands) needs to *not* defer any steps.
| | | |
| | | | As has been brought up repeatedly in the process: wow does this code need refactoring. I *may* squeeze in some time to at least bring sanity to this loop... but wow... =]
| | | |
| | | | Thanks to Jakob for helpful hints on the way here, and the review.
| | | |
| | | | llvm-svn: 160443
* ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BBNuno Lopes2012-07-181-1/+1
| | | | llvm-svn: 160411
* Back out r160101 and instead implement a dag combine to recover from instcombine transformation. Evan Cheng 2012-07-17 1 -0/+28
| | | | llvm-svn: 160387
* Add some trace output to TwoAddressInstructionPass.Jakob Stoklund Olesen2012-07-171-1/+4
| | | | llvm-svn: 160380
* Remove unused variable.Benjamin Kramer2012-07-171-1/+0
| | | | llvm-svn: 160372
* Fix a crash in the legalization of large vectors.Nadav Rotem2012-07-171-6/+3
| | | | | | | When truncating a result of a vector that is split we need to use the result of the split vector, and not re-split the dead node. llvm-svn: 160357
* Implement r160312 as target independent dag combine. Evan Cheng 2012-07-17 1 -0/+27
| | | | llvm-svn: 160354
* Make sure constant bitwidth is <= 64 bit before calling getSExtValue().Evan Cheng2012-07-171-1/+2
| | | | llvm-svn: 160350
* This is another case where instcombine demanded bits optimization createdEvan Cheng2012-07-171-0/+21
| | | | large immediates. Add dag combine logic to recover in case the large immediate doesn't fit in the cmp immediate operand field.
| | | |
| | | |   int foo(unsigned long l) {
| | | |     return (l >> 47) == 1;
| | | |   }
| | | |
| | | | we produce
| | | |
| | | |   %shr.mask = and i64 %l, -140737488355328
| | | |   %cmp = icmp eq i64 %shr.mask, 140737488355328
| | | |   %conv = zext i1 %cmp to i32
| | | |   ret i32 %conv
| | | |
| | | | which codegens to
| | | |
| | | |   movq $0xffff800000000000,%rax
| | | |   andq %rdi,%rax
| | | |   movq $0x0000800000000000,%rcx
| | | |   cmpq %rcx,%rax
| | | |   sete %al
| | | |   movzbl %al,%eax
| | | |   ret
| | | |
| | | | TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate.
| | | |
| | | | Based on a patch by Eli Friedman.
| | | | PR10328
| | | | rdar://9758774
| | | |
| | | | llvm-svn: 160346
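The equivalence behind this combine can be checked with a short standalone sketch. The helper names are hypothetical and this is not the `SimplifySetCC` code. It assumes 64-bit unsigned values, a shift amount below 64, and a constant small enough that `C << Shift` does not overflow. `fitsInSigned32` models the x86 condition quoted in the message.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Masked form produced upstream: (X & -(1 << Shift)) == (C << Shift).
bool maskedCompare(std::uint64_t X, unsigned Shift, std::uint64_t C) {
  // All ones shifted left has the same bit pattern as -(1 << Shift).
  std::uint64_t Mask = ~std::uint64_t(0) << Shift;
  return (X & Mask) == (C << Shift);
}

// Recovered form: (X >> Shift) == C, which needs only a small immediate.
bool shiftedCompare(std::uint64_t X, unsigned Shift, std::uint64_t C) {
  return (X >> Shift) == C;
}

// Models the x86 rule from the message: a compare immediate must be
// representable as a signed 32-bit value.
bool fitsInSigned32(std::int64_t Imm) {
  return Imm >= std::numeric_limits<std::int32_t>::min() &&
         Imm <= std::numeric_limits<std::int32_t>::max();
}
```

For the example in the message, the masked form needs the immediate 140737488355328 (2^47), which does not fit in 32 bits, while the shifted form compares against 1.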
* Minor cleanup and docs.Nadav Rotem2012-07-161-1/+3
| | | | llvm-svn: 160311
* Make ComputeDemandedBits return a deterministic result when computing an AssertZext value. Nadav Rotem 2012-07-16 1 -0/+1
| | | | In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits reported that some of the bits were both known to be one and known to be zero. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160305
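The inconsistency described here can be reproduced with a toy known-bits model. Everything below is hypothetical (`KnownBits`, `knownBitsOfConstant`, `applyAssertZext`), and the way the conflict is resolved, letting the zero-extension assertion win, is an assumption made for illustration rather than a statement of what the one-line fix does.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits pair: a set bit in Zero (One) means that bit of the value
// is known to be 0 (1). A consistent result never sets a bit in both.
struct KnownBits {
  std::uint64_t Zero;
  std::uint64_t One;
  bool hasConflict() const { return (Zero & One) != 0; }
};

// Every bit of a constant is known.
KnownBits knownBitsOfConstant(std::uint64_t C) {
  KnownBits K = {~C, C};
  return K;
}

// Models an assertion that the value was zero-extended from FromBits bits:
// all higher bits are known zero. Known-one bits that contradict this are
// cleared so the result stays consistent (assumed policy).
KnownBits applyAssertZext(KnownBits In, unsigned FromBits) {
  std::uint64_t LowMask = FromBits >= 64
                              ? ~std::uint64_t(0)
                              : (std::uint64_t(1) << FromBits) - 1;
  In.Zero |= ~LowMask;
  In.One &= ~In.Zero;
  return In;
}
```

For the constant 55 behind a 1-bit assertion, skipping the final masking step would leave bits 1, 2, 4, and 5 marked as both known one and known zero.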
* Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Nadav Rotem 2012-07-15 2 -1/+10
| | | | Make sure to trunc them if needed. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160235
* Refactor the code that checks that all operands of a node are UNDEFs.Nadav Rotem2012-07-152-13/+28
| | | | | | | | | Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs. Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160229
* Reapply r160194, switching to use LV information for finding local kills.Chandler Carruth2012-07-151-56/+32
| | | | The notable fix is to look at any dependencies attached to the kill instruction (or other instructions between MI and the kill) where the dependencies are specific to the register in question.
| | | |
| | | | The old code implicitly handled this by rejecting the transform if *any* other uses were found within the block, but after the start point. The new code directly finds the kill, and has to re-use the existing dependency scan to check for non-kill uses.
| | | |
| | | | This was caught by self-host, but I found the bug via inspection and use of absurd assert scaffolding to compute the kills in two ways and compare them. So I have no useful testcase for this other than "bootstrap". I'd work harder to reduce a test case if this particular code were likely to live for a long time.
| | | |
| | | | Thanks to Benjamin Kramer for reviewing the fix itself.
| | | |
| | | | llvm-svn: 160228
* Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. Nadav Rotem 2012-07-14 1 -0/+11
| | | | The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221
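A minimal sketch of the folding rule, using a toy node type rather than `SDNode` (`Opcode`, `Node`, `allOperandsUndef`, and `combineConcatVectors` are hypothetical names): a concatenation whose operands are all undef is replaced by one undef of the full result width.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class Opcode { Undef, ConcatVectors, Other };

// Toy DAG node: an opcode, a result element count, and operand nodes.
struct Node {
  Opcode Op;
  unsigned NumElts;
  std::vector<Node> Ops;
};

// True when the node has operands and every one of them is undef.
bool allOperandsUndef(const Node &N) {
  return !N.Ops.empty() &&
         std::all_of(N.Ops.begin(), N.Ops.end(),
                     [](const Node &Op) { return Op.Op == Opcode::Undef; });
}

// Folds concat_vectors(undef, undef, ...) into a single undef with the same
// result width; any other node is returned unchanged.
Node combineConcatVectors(const Node &N) {
  if (N.Op == Opcode::ConcatVectors && allOperandsUndef(N))
    return Node{Opcode::Undef, N.NumElts, {}};
  return N;
}
```

Once the concatenation is a plain undef, later combines that match on undef operands can see it directly.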
* Account for early-clobber reload instructions.Jakob Stoklund Olesen2012-07-141-0/+4
| | | | | | No test case, there are no in-tree targets that require this. llvm-svn: 160219