path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
* Relax the "don't unroll loops containing calls" rule. Instead, when a loop contains a call, lower the unrolling threshold to the optimize-for-size threshold. Basically, for loops containing calls, unrolling can still be profitable as long as the loop is REALLY small.
  llvm-svn: 113439
  Owen Anderson | 2010-09-08 | 1 | -0/+51
* Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke!
  llvm-svn: 113423
  Owen Anderson | 2010-09-08 | 1 | -1/+347
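As an illustration of what "combining multiple bit checks into a single test" can mean, here is a minimal C sketch. It assumes the simplest case, two single-bit tests on the same value joined by a logical AND; the function names are hypothetical and are not taken from the patch, which covers more general patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Two separate bit tests, roughly as source code would spell them. */
int both_bits_set_naive(uint32_t x) {
    return (x & 0x1u) != 0 && (x & 0x4u) != 0;
}

/* The same predicate expressed as one mask-and-compare. */
int both_bits_set_combined(uint32_t x) {
    return (x & 0x5u) == 0x5u;
}

/* Compare the two forms over every value of the low 8 bits. */
void check_bit_checks(void) {
    for (uint32_t x = 0; x < 256; ++x) {
        assert(both_bits_set_naive(x) == both_bits_set_combined(x));
    }
}
```

The combined form needs one AND and one compare instead of two of each plus a branch or a logical AND.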
* Fix a serious performance regression introduced by r108687 on linux: turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have to delete the original sqrt as well. Not doing so causes us to do two sqrt's when building with -fmath-errno (the default on linux).
  llvm-svn: 113260
  Chris Lattner | 2010-09-07 | 1 | -3/+19
* rename test.
  llvm-svn: 113257
  Chris Lattner | 2010-09-07 | 1 | -0/+0
* fix PR8067, an over-aggressive assertion in LICM.
  llvm-svn: 113146
  Chris Lattner | 2010-09-06 | 1 | -0/+14
* Teach loop rotate to hoist trivially invariant instructions in the duplicated block instead of duplicating them. Duplicating them into the end of the loop and the preheader means that we got a phi node in the header of the loop, which prevented LICM from hoisting them. GVN would usually come around later and merge the duplicated instructions so we'd get reasonable output... except that anything dependent on the shoulda-been-hoisted value can't be hoisted. In PR5319 (which this fixes), a memory value didn't get promoted.
  llvm-svn: 113134
  Chris Lattner | 2010-09-06 | 1 | -0/+35
* fix PR8063, a crash in globalopt in the malloc analysis code.
  llvm-svn: 113109
  Chris Lattner | 2010-09-05 | 1 | -0/+15
* Fix LoopSimplify to notify ScalarEvolution when splitting a loop backedge into an inner loop, as the new loop iteration may differ substantially. This fixes PR8078.
  llvm-svn: 113057
  Dan Gohman | 2010-09-04 | 1 | -0/+50
* fix a bug in my licm rewrite when a load from the promoted memory location is being re-stored to the memory location. We would get a dangling pointer from the SSAUpdate data structure and miss a use. This fixes PR8068
  llvm-svn: 113042
  Chris Lattner | 2010-09-04 | 1 | -0/+27
* Propagate non-local comparisons. Fixes PR1757.
  llvm-svn: 113025
  Owen Anderson | 2010-09-03 | 1 | -0/+24
* Add support for simplifying a load from a computed value to a load from a global when it is provable that they're equivalent. This fixes PR4855.
  llvm-svn: 112994
  Owen Anderson | 2010-09-03 | 1 | -0/+18
* Add a test for PR4413, which was apparently fixed at some point in the past.
  llvm-svn: 112987
  Owen Anderson | 2010-09-03 | 1 | -0/+21
* Add PR number to test.
  llvm-svn: 112971
  Owen Anderson | 2010-09-03 | 1 | -0/+1
* more test cleanup
  llvm-svn: 112892
  Chris Lattner | 2010-09-02 | 4 | -28/+4
* remove some noise from tests.
  llvm-svn: 112889
  Chris Lattner | 2010-09-02 | 1 | -2/+2
* fix more AST updating bugs, correcting miscompilation in PR8041
  llvm-svn: 112878
  Chris Lattner | 2010-09-02 | 1 | -0/+47
* Fix typo. I accidentally edited the wrong file before my last commit.
  llvm-svn: 112851
  Owen Anderson | 2010-09-02 | 1 | -1/+1
* Fix a bug in LazyValueInfo that CorrelatedValuePropagation exposed: In the LVI lattice, undef and the full set ConstantRange should not be treated as equivalent.
  llvm-svn: 112843
  Owen Anderson | 2010-09-02 | 1 | -0/+25
* Print the number of uses of a function in the .ll since it can be informative and there seems to be no reason not to.
  llvm-svn: 112812
  Duncan Sands | 2010-09-02 | 1 | -2/+1
* deepen my MMX/SRoA hack to avoid hurting non-x86 codegen.
  llvm-svn: 112763
  Chris Lattner | 2010-09-01 | 1 | -0/+1
* Fix loop unswitching's assumption that a code path which either infinite loops or exits will eventually exit. This fixes PR5373.
  llvm-svn: 112745
  Dan Gohman | 2010-09-01 | 1 | -0/+53
* The output of opt -stats must be sent to stderr. Patch by NAKAMURA Takumi!
  llvm-svn: 112724
  Bill Wendling | 2010-09-01 | 1 | -1/+1
* add a gross hack to work around a problem that Argiris reported on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>, which then cause random problems because the X86 backend is producing mmx stuff without inserting proper emms calls. In the short term, force off MMX datatypes. In the long term, the X86 backend should not select generic vector types to MMX registers. This is being worked on, but won't be done in time for 2.8. rdar://8380055
  llvm-svn: 112696
  Chris Lattner | 2010-09-01 | 1 | -0/+14
* filecheckize
  llvm-svn: 112695
  Chris Lattner | 2010-09-01 | 1 | -3/+25
* licm is wasting time hoisting constant foldable operations, instead of hoisting them, just fold them away. This occurs in the testcase for PR8041, for example.
  llvm-svn: 112669
  Chris Lattner | 2010-08-31 | 2 | -6/+22
* Merge 2010-08-31-InfiniteRecursion.ll into crash.ll.
  llvm-svn: 112635
  Owen Anderson | 2010-08-31 | 2 | -25/+23
* Add a test for the duplicated-conditional situation illustrated by PR5652.
  llvm-svn: 112621
  Owen Anderson | 2010-08-31 | 1 | -0/+24
* merge two tests.
  llvm-svn: 112617
  Chris Lattner | 2010-08-31 | 2 | -35/+35
* Manually reduce this testcase.
  llvm-svn: 112615
  Owen Anderson | 2010-08-31 | 1 | -77/+11
* merge two tests and convert to filecheck.
  llvm-svn: 112613
  Chris Lattner | 2010-08-31 | 2 | -23/+28
* Add a micro-test for the transforms I added to JumpThreading. I have not been able to find a way to test each in isolation, for a few reasons:
    1) The ability to look-through non-i1 BinaryOperator's requires the ability to look through non-constant ICmps in order for it to ever trigger.
    2) The ability to do LVI-powered PHI value determination only matters in cases that ProcessBranchOnPHI can't handle. Since it already handles all the cases without other instructions in the def-use chain between the PHI and the branch, it requires the ability to look through ICmps and/or BinaryOperators as well.
  llvm-svn: 112611
  Owen Anderson | 2010-08-31 | 1 | -0/+30
* Rename test directory to reflect new pass name.
  llvm-svn: 112592
  Owen Anderson | 2010-08-31 | 2 | -0/+0
* Rename ValuePropagation to a more descriptive CorrelatedValuePropagation.
  llvm-svn: 112591
  Owen Anderson | 2010-08-31 | 1 | -1/+1
* More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value. This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's handling of and/or of i1's), but never manifested before. This patch adds a tracking set to prevent this case.
  llvm-svn: 112589
  Owen Anderson | 2010-08-31 | 1 | -0/+91
* Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.
  llvm-svn: 112575
  Owen Anderson | 2010-08-31 | 1 | -21/+0
* Combine these two tests, and make sure there's a newline at the end of the file.
  llvm-svn: 112554
  Owen Anderson | 2010-08-30 | 2 | -21/+19
* Correct bogus module triple specifications.
  llvm-svn: 112469
  Duncan Sands | 2010-08-30 | 5 | -5/+5
* LICM does get dead instructions input to it. Instead of sinking them out of loops, just delete them.
  llvm-svn: 112451
  Chris Lattner | 2010-08-29 | 1 | -0/+14
* remove the ABCD and SSI passes. They don't have any clients that I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on irc.
  llvm-svn: 112355
  Chris Lattner | 2010-08-28 | 8 | -182/+0
* handle the constant case of vector insertion. For something like this:

      struct S { float A, B, C, D; };

      struct S g;
      struct S bar() {
        struct S A = g;
        ++A.B;
        A.A = 42;
        return A;
      }

  we now generate:

      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
              movq     _g@GOTPCREL(%rip), %rax
              movss    12(%rax), %xmm0
              pshufd   $16, %xmm0, %xmm0
              movss    4(%rax), %xmm2
              movss    8(%rax), %xmm1
              pshufd   $16, %xmm1, %xmm1
              unpcklps %xmm0, %xmm1
              addss    LCPI1_0(%rip), %xmm2
              pshufd   $16, %xmm2, %xmm2
              movss    LCPI1_1(%rip), %xmm0
              pshufd   $16, %xmm0, %xmm0
              unpcklps %xmm2, %xmm0
              ret

  instead of:

      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
              movq     _g@GOTPCREL(%rip), %rax
              movss    12(%rax), %xmm0
              pshufd   $16, %xmm0, %xmm0
              movss    4(%rax), %xmm2
              movss    8(%rax), %xmm1
              pshufd   $16, %xmm1, %xmm1
              unpcklps %xmm0, %xmm1
              addss    LCPI1_0(%rip), %xmm2
              movd     %xmm2, %eax
              shlq     $32, %rax
              addq     $1109917696, %rax       ## imm = 0x42280000
              movd     %rax, %xmm0
              ret

  llvm-svn: 112345
  Chris Lattner | 2010-08-28 | 1 | -0/+12
* optimize bitcasts from large integers to vector into vector element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 abi. We now compile something like this:

      struct S { float A, B, C, D; };
      struct S g;
      struct S bar() {
        struct S A = g;
        ++A.A;
        ++A.C;
        return A;
      }

  into all nice vector operations:

      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
              movq     _g@GOTPCREL(%rip), %rax
              movss    LCPI1_0(%rip), %xmm1
              movss    (%rax), %xmm0
              addss    %xmm1, %xmm0
              pshufd   $16, %xmm0, %xmm0
              movss    4(%rax), %xmm2
              movss    12(%rax), %xmm3
              pshufd   $16, %xmm2, %xmm2
              unpcklps %xmm2, %xmm0
              addss    8(%rax), %xmm1
              pshufd   $16, %xmm1, %xmm1
              pshufd   $16, %xmm3, %xmm2
              unpcklps %xmm2, %xmm1
              ret

  instead of icky integer operations:

      _bar:                                   ## @bar
              movq     _g@GOTPCREL(%rip), %rax
              movss    LCPI1_0(%rip), %xmm1
              movss    (%rax), %xmm0
              addss    %xmm1, %xmm0
              movd     %xmm0, %ecx
              movl     4(%rax), %edx
              movl     12(%rax), %esi
              shlq     $32, %rdx
              addq     %rcx, %rdx
              movd     %rdx, %xmm0
              addss    8(%rax), %xmm1
              movd     %xmm1, %eax
              shlq     $32, %rsi
              addq     %rax, %rsi
              movd     %rsi, %xmm1
              ret

  This resolves rdar://8360454
  llvm-svn: 112343
  Chris Lattner | 2010-08-28 | 1 | -0/+31
* Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's. This pass addresses the missed optimizations from PR2581 and PR4420.
  llvm-svn: 112325
  Owen Anderson | 2010-08-27 | 3 | -0/+45
* tidy up test.
  llvm-svn: 112321
  Chris Lattner | 2010-08-27 | 1 | -1/+2
* Enhance the shift propagator to handle the case when you have:

      A = shl x, 42
      ...
      B = lshr ..., 38

  which can be transformed into:

      A = shl x, 4
      ...

  iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example.
  llvm-svn: 112314
  Chris Lattner | 2010-08-27 | 1 | -0/+15
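The precondition in this commit can be checked on a small scale. The following C sketch is an assumption-laden illustration, not the pass itself: it uses a 64-bit value rather than i128, assumes no intermediate operations between the two shifts, and uses hypothetical function names.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift left by 42, then logical shift right by 38. */
uint64_t shift_pair(uint64_t x) {
    return (x << 42) >> 38;
}

/* Simplified form: a single left shift by 42 - 38 = 4. */
uint64_t shift_single(uint64_t x) {
    return x << 4;
}

/* On a 64-bit value the two forms agree when bits 22..59 of x are zero,
   since those are the bits the shift pair would have cleared. */
void check_shift_propagation(void) {
    for (uint64_t x = 0; x < (1ull << 22); x += 4099) {
        assert(shift_pair(x) == shift_single(x));
    }
    /* A value with bit 22 set shows why the precondition matters. */
    assert(shift_pair(1ull << 22) != shift_single(1ull << 22));
}
```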
* Implement a pretty general logical shift propagation framework, which is good at ripping through bitfield operations. This generalizes a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and, to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA.
  llvm-svn: 112304
  Chris Lattner | 2010-08-27 | 1 | -4/+17
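The baseline transform named here, (x << c) >> c -> and, is easy to state in C. This is a minimal sketch on 32-bit unsigned values with hypothetical function names; it shows only the simple case, not the handling of intermediate logical nodes that the commit adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift left then logically right by the same amount: clears the top c bits. */
uint32_t shl_then_lshr(uint32_t x, unsigned c) {
    return (x << c) >> c;
}

/* Equivalent single AND with a mask that keeps the low 32 - c bits. */
uint32_t mask_low_bits(uint32_t x, unsigned c) {
    return x & (UINT32_MAX >> c);
}

/* Compare both forms for every valid shift amount on a few sample values. */
void check_shift_to_mask(void) {
    const uint32_t samples[] = {0u, 1u, 0x80000000u, 0xDEADBEEFu, UINT32_MAX};
    for (unsigned c = 0; c < 32; ++c) {
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
            assert(shl_then_lshr(samples[i], c) == mask_low_bits(samples[i], c));
        }
    }
}
```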
* merge and filecheckize test
  llvm-svn: 112289
  Chris Lattner | 2010-08-27 | 2 | -42/+57
* merge two tests
  llvm-svn: 112288
  Chris Lattner | 2010-08-27 | 2 | -10/+12
* teach the truncation optimization that an entire chain of computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type.
  llvm-svn: 112285
  Chris Lattner | 2010-08-27 | 1 | -4/+21
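A rough C sketch of the idea: a value zero-extended from 16 to 64 bits feeds a chain of arithmetic whose result is truncated to 32 bits, so the chain can be evaluated in 32 bits even though the extension source is not 32 bits wide. The particular operations (multiply, add, xor) and the function names are assumptions chosen for illustration; the commit does not list which operations it handles.

```c
#include <assert.h>
#include <stdint.h>

/* Wide form: extend 16 -> 64 bits, compute, then truncate to 32 bits. */
uint32_t wide_then_trunc(uint16_t x) {
    uint64_t wide = (uint64_t)x;
    return (uint32_t)((wide * 3u + 1000u) ^ 0x55u);
}

/* Narrow form: extend 16 -> 32 bits and do the whole chain in 32 bits. */
uint32_t narrow_chain(uint16_t x) {
    uint32_t narrow = (uint32_t)x;
    return (narrow * 3u + 1000u) ^ 0x55u;
}

/* The low 32 bits of these operations do not depend on higher bits,
   so both forms agree for every 16-bit input. */
void check_truncated_chain(void) {
    for (uint32_t x = 0; x <= UINT16_MAX; ++x) {
        assert(wide_then_trunc((uint16_t)x) == narrow_chain((uint16_t)x));
    }
}
```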
* Add an instcombine to clean up a common pattern produced by the SRoA "promote to large integer" code, eliminating some type conversions like this:

      %94 = zext i16 %93 to i32 ; <i32> [#uses=2]
      %96 = lshr i32 %94, 8 ; <i32> [#uses=1]
      %101 = trunc i32 %96 to i8 ; <i8> [#uses=1]

  This also unblocks other xforms from happening, now clang is able to compile:

      struct S { float A, B, C, D; };
      float foo(struct S A) { return A.A + A.B+A.C+A.D; }

  into:

      _foo:                                   ## @foo
      ## BB#0:                                ## %entry
              pshufd   $1, %xmm0, %xmm2
              addss    %xmm0, %xmm2
              movdqa   %xmm1, %xmm3
              addss    %xmm2, %xmm3
              pshufd   $1, %xmm1, %xmm0
              addss    %xmm3, %xmm0
              ret

  on x86-64, instead of:

      _foo:                                   ## @foo
      ## BB#0:                                ## %entry
              movd     %xmm0, %rax
              shrq     $32, %rax
              movd     %eax, %xmm2
              addss    %xmm0, %xmm2
              movapd   %xmm1, %xmm3
              addss    %xmm2, %xmm3
              movd     %xmm1, %rax
              shrq     $32, %rax
              movd     %eax, %xmm0
              addss    %xmm3, %xmm0
              ret

  This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC.
  llvm-svn: 112278
  Chris Lattner | 2010-08-27 | 1 | -0/+32
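The zext/lshr/trunc sequence quoted in this commit extracts the high byte of a 16-bit value by way of a 32-bit temporary. The C sketch below mirrors that sequence and a narrower equivalent. The function names are hypothetical, and the exact IR the new instcombine emits is not shown in the commit message, so the "narrow" form is an assumption about the intent.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the quoted IR: zext i16 -> i32, lshr by 8, trunc i32 -> i8. */
uint8_t high_byte_wide(uint16_t v) {
    uint32_t wide = (uint32_t)v;
    uint32_t shifted = wide >> 8;
    return (uint8_t)shifted;
}

/* Same result without the explicit 32-bit temporary. (C still promotes v
   to int for the shift; the point is the value-level equivalence.) */
uint8_t high_byte_narrow(uint16_t v) {
    return (uint8_t)(v >> 8);
}

/* Exhaustively compare both forms over all 16-bit inputs. */
void check_high_byte(void) {
    for (uint32_t v = 0; v <= UINT16_MAX; ++v) {
        assert(high_byte_wide((uint16_t)v) == high_byte_narrow((uint16_t)v));
    }
}
```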
* Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652.
  llvm-svn: 112270
  Owen Anderson | 2010-08-27 | 2 | -5/+12
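A hedged source-level picture of a "related condition": inside a branch where x > 10 is already known, a later test of x > 5 is always true and its branch can be removed. The example and function names are hypothetical and are not taken from PR5652 or from the updated tests.

```c
#include <assert.h>

/* The inner test is implied by the outer one, so its false edge is dead. */
int classify_naive(int x) {
    if (x > 10) {
        if (x > 5) {
            return 2;
        }
        return 1; /* unreachable: x > 10 implies x > 5 */
    }
    return 0;
}

/* What remains once the redundant inner branch is eliminated. */
int classify_simplified(int x) {
    return x > 10 ? 2 : 0;
}

/* Compare both forms over a range that straddles both thresholds. */
void check_related_conditions(void) {
    for (int x = -100; x <= 100; ++x) {
        assert(classify_naive(x) == classify_simplified(x));
    }
}
```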