path: root/llvm
Commit message | Author | Age | Files | Lines
* Allow machine-cse to look across MBB boundary when cse'ing instructions that define physical registers. | Evan Cheng | 2012-01-10 | 4 | -18/+107
  It's currently very restrictive, only catching cases where the CE is in an immediate (and only) predecessor. But it catches a surprisingly large number of cases.
  rdar://10660865
  llvm-svn: 147827
* Enable LSR IV Chains with sufficient heuristics. | Andrew Trick | 2012-01-10 | 6 | -7/+819
  These heuristics are sufficient for enabling IV chains by default. Performance analysis has been done for i386, x86_64, and thumbv7. The optimization is rarely important, but can significantly speed up certain cases by eliminating spill code within the loop. Unrolled loops are prime candidates for IV chains. In many cases, the final code could still be improved with more target specific optimization following LSR. The goal of this feature is for LSR to make the best choice of induction variables.
  Instruction selection may not completely take advantage of this feature yet. As a result, there could be cases of slight code size increase.
  Code size can be worse on x86 because it doesn't support postincrement addressing. In fact, when chains are formed, you may see redundant address plus stride addition in the addressing mode. GenerateIVChains tries to compensate for the common cases.
  On ARM, code size increase can be mitigated by using postincrement addressing, but downstream codegen currently misses some opportunities.
  llvm-svn: 147826
* Accurately model hardware alignment rounding. | Jakob Stoklund Olesen | 2012-01-10 | 1 | -21/+56
  On Thumb, the displacement computation hardware uses the address of the current instruction rounded down to a multiple of 4. Include this rounding in the UserOffset we compute for each instruction.
  When inline asm is present, the instruction alignment may not be known. Constrain the maximum displacement instead in that case.
  This makes it possible for CreateNewWater() and OffsetIsInRange() to agree about the valid displacements. When they disagree, infinite looping happens.
  As always, test cases for this stuff are insane.
  <rdar://problem/10660175>
  llvm-svn: 147825
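The rounding described in this commit can be illustrated with a minimal sketch. This is not the code from ARMConstantIslandPass: the helper names `alignDown4`, `thumbPCBase`, and `offsetInRange` are hypothetical, and the "+ 4" pipeline offset for reading PC in Thumb state is an assumption stated here, not something the commit message spells out.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: round an address down to a multiple of 4.
constexpr std::uint32_t alignDown4(std::uint32_t Addr) {
  return Addr & ~std::uint32_t{3};
}

// Assumption: a Thumb instruction reads PC as its own address plus 4, and
// PC-relative loads round that value down to a multiple of 4 before adding
// the displacement.
constexpr std::uint32_t thumbPCBase(std::uint32_t InstrAddr) {
  return alignDown4(InstrAddr + 4);
}

// True if TargetAddr lies at most MaxDisp bytes after the rounded base.
constexpr bool offsetInRange(std::uint32_t InstrAddr, std::uint32_t TargetAddr,
                             std::uint32_t MaxDisp) {
  return TargetAddr >= thumbPCBase(InstrAddr) &&
         TargetAddr - thumbPCBase(InstrAddr) <= MaxDisp;
}

static_assert(thumbPCBase(0x1000) == 0x1004, "4-byte aligned instruction");
static_assert(thumbPCBase(0x1002) == 0x1004, "2-byte aligned instruction rounds down");
```

With an instruction at 0x1002, a range check that skipped the rounding would measure from 0x1006 and disagree by two bytes with one that rounds, which is the kind of mismatch the commit message says leads to infinite looping.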
* Remove the logging streamer. | Rafael Espindola | 2012-01-10 | 6 | -279/+0
  llvm-svn: 147820
* Catch runaway ARMConstantIslandPass even in -Asserts builds. | Jakob Stoklund Olesen | 2012-01-09 | 1 | -2/+2
  The pass is prone to looping, and it is better to crash than loop forever, even in a -Asserts build.
  <rdar://problem/10660175>
  llvm-svn: 147806
* Fix asm string wrt variants. | Devang Patel | 2012-01-09 | 2 | -7/+7
  llvm-svn: 147805
* Use descriptive variable name and remove incorrect operand number check. | Devang Patel | 2012-01-09 | 1 | -12/+9
  llvm-svn: 147802
* Adding IV chain generation to LSR. | Andrew Trick | 2012-01-09 | 2 | -5/+324
  After collecting chains, check if any should be materialized. If so, hide the chained IV users from the LSR solver. LSR will only solve for the head of the chain. GenerateIVChains will then materialize the chained IV users by computing the IV relative to its previous value in the chain.
  In theory, chained IV users could be exposed to LSR's solver. This would be considerably complicated to implement and I'm not aware of a case where we need it. In practice it's more important to intelligently prune the search space of nontrivial loops before running the solver, otherwise the solver is often forced to prune the most optimal solutions. Hiding the chained users does this well, so that LSR is more likely to find the best IV for the chain as a whole.
  llvm-svn: 147801
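As a rough source-level analogy for "computing the IV relative to its previous value in the chain", the sketch below contrasts recomputing every address from the loop counter with stepping a pointer from one access to the next. It is only an illustration: LSR works on LLVM IR, not C++ source, and the function names `sumIndexed`, `sumChained`, and `chainsAgree` are hypothetical.

```cpp
#include <cstddef>

// Each address is recomputed from the loop counter I.
// A must hold at least N + 2 * Stride elements.
inline long sumIndexed(const int *A, std::size_t N, std::size_t Stride) {
  long Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += A[I] + A[I + Stride] + A[I + 2 * Stride];
  return Sum;
}

// Chained form: only the head of the chain depends on I; every later
// address is the previous address plus Stride.
inline long sumChained(const int *A, std::size_t N, std::size_t Stride) {
  long Sum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    const int *P = A + I; // head of the chain
    Sum += *P;
    P += Stride;          // second user, relative to the first
    Sum += *P;
    P += Stride;          // third user, relative to the second
    Sum += *P;
  }
  return Sum;
}

// Small self-check on a fixed array.
inline bool chainsAgree() {
  const int A[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  return sumIndexed(A, 4, 2) == sumChained(A, 4, 2);
}
```

On a target with postincrement addressing, the `P += Stride` steps in the chained form are the ones that could fold into the memory accesses.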
* Adding collection of IV chains to LSR. | Andrew Trick | 2012-01-09 | 1 | -0/+242
  This collects a set of IV uses within the loop whose values can be computed relative to each other in a sequence. Following checkins will make use of this information.
  llvm-svn: 147797
* Split AsmParser into two components - AsmParser and AsmParserVariant | Devang Patel | 2012-01-09 | 5 | -80/+128
  AsmParser holds info specific to target parser. AsmParserVariant holds info specific to asm variants supported by the target.
  llvm-svn: 147787
* "Minor LSR debugging stuff" | Andrew Trick | 2012-01-09 | 1 | -1/+4
  llvm-svn: 147785
* Update language check. Do not ignore DW_LANG_Python. | Devang Patel | 2012-01-09 | 1 | -1/+2
  Patch by Joe Groff!
  llvm-svn: 147781
* Move assert to the right place. | Benjamin Kramer | 2012-01-09 | 1 | -1/+1
  llvm-svn: 147779
* InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests. | Benjamin Kramer | 2012-01-09 | 2 | -81/+102
  This subsumes several other transforms while enabling us to catch more cases.
  llvm-svn: 147777
* Don't rely on the fact that shift values are never very large, and thus this subtraction will result in small negative numbers at worst which become very large positive numbers on assignment and are thus caught by the <=4 check on the next line. | Chandler Carruth | 2012-01-09 | 1 | -1/+1
  The >0 check was clearly intended to catch these as negative numbers.
  Spotted by inspection, and impossible to trigger given the shift widths that can be used.
  llvm-svn: 147773
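The pitfall this commit describes can be reproduced in a few lines. The sketch below is an assumption about the general shape of the problem, not the code that was changed; `wrappedDiff`, `unsignedCheck`, and `signedCheck` are hypothetical names.

```cpp
#include <climits>

// Unsigned subtraction wraps around instead of going negative.
constexpr unsigned wrappedDiff(unsigned A, unsigned B) { return A - B; }

// With an unsigned difference, "> 0" accepts the wrapped value; only the
// "<= 4" bound rejects it.
constexpr bool unsignedCheck(unsigned A, unsigned B) {
  return wrappedDiff(A, B) > 0 && wrappedDiff(A, B) <= 4;
}

// With a signed difference, "> 0" rejects the negative value directly.
constexpr bool signedCheck(int A, int B) {
  return A - B > 0 && A - B <= 4;
}

static_assert(wrappedDiff(2, 5) == UINT_MAX - 2, "2 - 5 wraps to a huge value");
static_assert(!unsignedCheck(2, 5), "rejected only by the upper bound");
static_assert(!signedCheck(2, 5), "rejected by the > 0 test");
```

Both versions return the same answer for small operands, which matches the commit's remark that the issue cannot be triggered in practice. The signed version simply rejects the value for the reason the `> 0` test suggests.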
* Cleanup and FileCheck-ize a test. | Chandler Carruth | 2012-01-09 | 1 | -13/+25
  llvm-svn: 147772
* Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. | Craig Topper | 2012-01-09 | 3 | -32/+22
  Predicate functions have been altered to maintain previous names and behavior.
  llvm-svn: 147770
* Add HasAVX predicate to some of the AVX patterns. | Craig Topper | 2012-01-09 | 1 | -0/+17
  llvm-svn: 147769
* Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. | Craig Topper | 2012-01-09 | 1 | -405/+407
  Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.
  llvm-svn: 147768
* Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. | Craig Topper | 2012-01-09 | 2 | -17/+25
  And v4i64 was completely missing.
  llvm-svn: 147767
* Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version. | Craig Topper | 2012-01-09 | 1 | -2/+2
  llvm-svn: 147766
* Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. | Craig Topper | 2012-01-09 | 1 | -47/+63
  This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.
  llvm-svn: 147765
* Change some places that were checking for AVX OR SSE1/2 to use hasXMM/hasXMMInt instead. | Craig Topper | 2012-01-09 | 2 | -6/+6
  Also fix one place that checked SSE3, but accidentally excluded AVX to use hasSSE3orAVX. This is a step towards removing the AVX hack from the X86Subtarget.h
  llvm-svn: 147764
* Don't print an unused label before .cfi_endproc. | Rafael Espindola | 2012-01-09 | 8 | -13/+29
  llvm-svn: 147763
* Don't disable MMX support when AVX is enabled. | Craig Topper | 2012-01-09 | 5 | -28/+53
  Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
  llvm-svn: 147762
* Enable FISTTP* instructions when AVX is enabled. | Craig Topper | 2012-01-08 | 1 | -9/+9
  llvm-svn: 147758
* Tweak my last commit to be less conservative about uses. | Benjamin Kramer | 2012-01-08 | 2 | -37/+35
  We still save an instruction when just the "and" part is replaced. Also change the code to match comments more closely.
  llvm-svn: 147753
* Don't forget to transfer implicit uses of return instruction. | Evan Cheng | 2012-01-08 | 1 | -2/+5
  llvm-svn: 147752
* Avoid erasing copies from a reserved register unless the definition can be safely proven not to have been clobbered. | Evan Cheng | 2012-01-08 | 1 | -0/+26
  No small test case possible.
  llvm-svn: 147751
* InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test. | Benjamin Kramer | 2012-01-08 | 2 | -0/+112
  This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set.
  llvm-svn: 147749
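The equivalence behind this transform can be checked on 32-bit values with a short sketch. This is an illustration of the arithmetic, not the InstCombine implementation; all function names here are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t SignBit = 0x80000000u;

// Original form: a bit test ORed with a sign test.
constexpr bool separateOr(std::int32_t X, std::uint32_t Mask) {
  return (static_cast<std::uint32_t>(X) & Mask) != 0 || X < 0;
}

// Merged form: the sign bit is folded into the mask.
constexpr bool mergedOr(std::int32_t X, std::uint32_t Mask) {
  return (static_cast<std::uint32_t>(X) & (Mask | SignBit)) != 0;
}

// The ANDed counterpart: no masked bit set and the value is non-negative.
constexpr bool separateAnd(std::int32_t X, std::uint32_t Mask) {
  return (static_cast<std::uint32_t>(X) & Mask) == 0 && X >= 0;
}

constexpr bool mergedAnd(std::int32_t X, std::uint32_t Mask) {
  return (static_cast<std::uint32_t>(X) & (Mask | SignBit)) == 0;
}

// Compare both forms over a small set of sample values and masks.
inline bool formsAgree() {
  const std::int32_t Values[] = {0, 1, 2, 64, -1, -64, INT32_MAX, INT32_MIN};
  const std::uint32_t Masks[] = {0u, 1u, 0x41u, 0x7FFFFFFFu};
  for (std::int32_t X : Values)
    for (std::uint32_t M : Masks)
      if (separateOr(X, M) != mergedOr(X, M) ||
          separateAnd(X, M) != mergedAnd(X, M))
        return false;
  return true;
}
```

The merged form needs one mask and one comparison where the original needed two comparisons and a logical operator.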
* Reverted commit #147601 upon Evan's request. | Victor Umansky | 2012-01-08 | 4 | -628/+0
  llvm-svn: 147748
* Remove MCELFStreamer.h. | Rafael Espindola | 2012-01-07 | 2 | -143/+121
  llvm-svn: 147745
* Don't print a label before .cfi_startproc when we don't need to. | Rafael Espindola | 2012-01-07 | 10 | -19/+30
  This makes the produced assembly when using CFI just a bit more readable.
  llvm-svn: 147743
* Make clever use of alignment and padding to shrink GlobalValue. | Benjamin Kramer | 2012-01-07 | 1 | -4/+3
  -8 bytes on x86_64, no change on x86.
  llvm-svn: 147742
* Match SelectionDAG logic for enabling movt. | Jakob Stoklund Olesen | 2012-01-07 | 2 | -2/+7
  Darwin doesn't do static, and ELF targets only support static.
  llvm-svn: 147740
* Fix typo in the X86 backend readme. Patch from Jaeden Amero. | Craig Topper | 2012-01-07 | 1 | -1/+1
  llvm-svn: 147739
* Remove VectorExtras. This unused helper was written for a type of API that is discouraged now. | Benjamin Kramer | 2012-01-07 | 11 | -51/+0
  llvm-svn: 147738
* Remove unnecessary check of hasAVX(). It's already included in hasXMM(). | Craig Topper | 2012-01-07 | 1 | -1/+1
  llvm-svn: 147734
* Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X) | Craig Topper | 2012-01-07 | 1 | -4/+4
  llvm-svn: 147733
* Port the trick to skip the check for empty buckets from StringMap to DenseMap. | Benjamin Kramer | 2012-01-07 | 1 | -9/+9
  This should fix the odd behavior that find() is slower than lookup().
  llvm-svn: 147731
* Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. | Craig Topper | 2012-01-07 | 1 | -2/+51
  Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.
  llvm-svn: 147728
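The identities listed in this commit can be verified with a small model of subtract-with-borrow on 32-bit values. The model below is an assumption made for illustration: `SubResult`, `subc`, `sube`, and `same` are hypothetical stand-ins, not the SelectionDAG nodes or the combiner code.

```cpp
#include <cstdint>

// Result value plus the borrow flag produced by the subtraction.
struct SubResult {
  std::uint32_t Value;
  bool Borrow;
};

// A - B, borrowing when A < B.
constexpr SubResult subc(std::uint32_t A, std::uint32_t B) {
  return SubResult{static_cast<std::uint32_t>(A - B), A < B};
}

// A - B - BorrowIn, borrowing when A < B + BorrowIn.
constexpr SubResult sube(std::uint32_t A, std::uint32_t B, bool BorrowIn) {
  return SubResult{static_cast<std::uint32_t>(A - B - (BorrowIn ? 1u : 0u)),
                   A < B || (A == B && BorrowIn)};
}

constexpr bool same(SubResult L, SubResult R) {
  return L.Value == R.Value && L.Borrow == R.Borrow;
}

static_assert(same(subc(7, 7), SubResult{0, false}), "(subc x, x) is 0, no borrow");
static_assert(same(subc(7, 0), SubResult{7, false}), "(subc x, 0) is x, no borrow");
static_assert(same(subc(0xFFFFFFFFu, 7), SubResult{7u ^ 0xFFFFFFFFu, false}),
              "(subc -1, x) is (xor x, -1), no borrow");
static_assert(same(sube(7, 9, false), subc(7, 9)),
              "sube without borrow-in behaves like subc");
```

Each `static_assert` corresponds to one rewrite in the commit message, and the checks run at compile time.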
* Fix TableGen so that it will emit the correct signature for FastEmit_f: | Cameron Zwarich | 2012-01-07 | 1 | -1/+1
    /// FastEmit_f - This method is called by target-independent code
    /// to request that an instruction with the given type, opcode, and
    /// floating-point immediate operand be emitted.
    virtual unsigned FastEmit_f(MVT VT, MVT RetVT, unsigned Opcode, const ConstantFP *FPImm);
  Currently, it emits an accidentally overloaded version without the const on the ConstantFP*. This doesn't affect anything in the tree, since nothing causes that method to be autogenerated, but I have been playing with some ARM TableGen refactorings that hit this problem.
  llvm-svn: 147727
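The "accidentally overloaded" problem can be reproduced with simplified stand-in types. The sketch below is not FastISel or TableGen output: `ConstantFPSketch`, `FastISelSketch`, and the derived classes are hypothetical, and the signature is reduced to the one parameter that matters.

```cpp
// Stand-in for the immediate operand type.
struct ConstantFPSketch {};

// Stand-in for the base class declaring the virtual hook.
struct FastISelSketch {
  virtual ~FastISelSketch() = default;
  virtual unsigned FastEmit_f(const ConstantFPSketch *) { return 0; }
};

// Missing const on the pointee: this declares a new overload that hides the
// base method instead of overriding it.
struct AccidentalOverload : FastISelSketch {
  unsigned FastEmit_f(ConstantFPSketch *) { return 1; }
};

// Matching signature: this overrides the base method.
struct ProperOverride : FastISelSketch {
  unsigned FastEmit_f(const ConstantFPSketch *) override { return 2; }
};

// Target-independent code only sees the base class interface.
inline unsigned callThroughBase(FastISelSketch &Selector) {
  const ConstantFPSketch Imm{};
  return Selector.FastEmit_f(&Imm);
}

inline unsigned viaAccidentalOverload() {
  AccidentalOverload Selector;
  return callThroughBase(Selector); // reaches the base version
}

inline unsigned viaProperOverride() {
  ProperOverride Selector;
  return callThroughBase(Selector); // reaches the derived version
}
```

Marking the derived method `override`, as `ProperOverride` does, turns a signature mismatch like this into a compile error.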
* Optimize reserved register coalescing. | Jakob Stoklund Olesen | 2012-01-07 | 1 | -0/+25
  Reserved registers don't have proper live ranges, their LiveInterval simply has a snippet of liveness for each def. Virtual registers with a single value that is a copy of a reserved register (typically %esp) can be coalesced with the reserved register if the live range doesn't overlap any reserved register defs.
  When coalescing with a reserved register, don't modify the reserved register live range. Just leave it as a bunch of dead defs.
  This eliminates quadratic coalescer behavior in i386 functions with many function calls.
  PR11699
  llvm-svn: 147726
* Use the 'regalloc' debug tag for most register allocator tracing. | Jakob Stoklund Olesen | 2012-01-07 | 3 | -3/+3
  llvm-svn: 147725
* Enable redundant phi elimination after LSR. | Andrew Trick | 2012-01-07 | 1 | -1/+3
  This will be more important as we extend the LSR pass in ways that don't rely on the formula solver. In particular, we need it for constructing IV chains.
  llvm-svn: 147724
* Fix dead link | Eli Bendersky | 2012-01-07 | 1 | -1/+1
  llvm-svn: 147721
* Use getRegForValue() to materialize the address of ARM globals. | Jakob Stoklund Olesen | 2012-01-07 | 3 | -38/+16
  This enables basic local CSE, giving us 20% smaller code for consumer-typeset in -O0 builds.
  <rdar://problem/10658692>
  llvm-svn: 147720
* Revert part of r147716. Looks like x87 instructions kill markers are all messed up so branch folding pass can't use the scavenger. :-( | Evan Cheng | 2012-01-07 | 1 | -9/+11
  This doesn't break anything currently. It just means targets which do not carefully update kill markers cannot run post-ra scheduler (not new, it has always been the case). We should fix this at some point since it's really hacky.
  llvm-svn: 147719
* LSR: Don't optimize loops if an outer loop has no preheader. | Andrew Trick | 2012-01-07 | 2 | -4/+49
  LoopSimplify may not run on some outer loops, e.g. because of indirect branches. SCEVExpander simply cannot handle outer loops with no preheaders.
  Fixes rdar://10655343 SCEVExpander segfault.
  llvm-svn: 147718
* Split Finish into Finish and FinishImpl to have a common place to do end of file error checking. | Rafael Espindola | 2012-01-07 | 16 | -25/+41
  Use that to error on an unfinished cfi_startproc. The error is not nice, but is already better than a segmentation fault.
  llvm-svn: 147717