path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* implement cast.ll:test35. With this, we recognize: | Chris Lattner | 2006-11-29 | 1 | -0/+16
    unsigned short swp(unsigned short a) {
      return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
    }
    as an idiom for bswap.
    llvm-svn: 32011
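A quick way to convince yourself that the expression above really is a byte swap is to compare it against a plain 8-bit rotate for every 16-bit input. The sketch below is only an illustration in C, not LLVM code: `bswap16_ref` and `check_swp` are hypothetical helper names, and `uint16_t` stands in for `unsigned short` on the assumption that the latter is 16 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* The idiom from the commit message, written with a fixed-width type. */
static uint16_t swp(uint16_t a) {
  return (uint16_t)((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

/* Hypothetical reference: rotating a 16-bit value by 8 swaps its two bytes. */
static uint16_t bswap16_ref(uint16_t a) {
  return (uint16_t)((a << 8) | (a >> 8));
}

/* Exhaustively compare the idiom with the reference. */
static void check_swp(void) {
  for (uint32_t v = 0; v <= 0xffff; ++v)
    assert(swp((uint16_t)v) == bswap16_ref((uint16_t)v));
}
```

Because the two functions agree on all 65536 inputs, a compiler that recognizes the mask-and-shift pattern is free to emit a single byte-swap operation for it.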
* Teach instcombine to turn trunc(srl x, c) -> srl (trunc(x), c) when safe. | Chris Lattner | 2006-11-29 | 1 | -1/+33
    This implements InstCombine/cast.ll:test34. It fires hundreds of times
    on 176.gcc.
    llvm-svn: 32009
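The commit message does not spell out what "when safe" means. A plausible reading, assumed here, is that the bits which the shift would carry across the truncation boundary must be zero. The C sketch below models a 32-bit to 8-bit truncation under that assumption; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* trunc(srl x, c): shift at 32 bits, then keep the low 8 bits. */
static uint8_t shift_then_trunc(uint32_t x, unsigned c) {
  return (uint8_t)(x >> c);
}

/* srl(trunc(x), c): keep the low 8 bits, then shift at 8 bits. */
static uint8_t trunc_then_shift(uint32_t x, unsigned c) {
  return (uint8_t)((uint8_t)x >> c);
}

/* Assumed safety condition: bits 8 .. 8+c-1 of x are all zero. */
static int narrowing_is_safe(uint32_t x, unsigned c) {
  uint32_t crossing = ((1u << c) - 1u) << 8;
  return (x & crossing) == 0;
}

/* Whenever the assumed condition holds, both orders agree. */
static void check_trunc_shift(void) {
  for (uint32_t x = 0; x < 0x10000; ++x)
    for (unsigned c = 0; c < 8; ++c)
      if (narrowing_is_safe(x, c))
        assert(shift_then_trunc(x, c) == trunc_then_shift(x, c));
}
```

With `x = 0x100` and `c = 1` the condition fails and the two orders differ (0x80 versus 0), which is why the transform cannot be applied unconditionally.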
* Implement Regression/Transforms/InstCombine/bswap-fold.ll, | Chris Lattner | 2006-11-29 | 1 | -1/+24
    folding seteq (bswap(x)), c -> seteq(x,bswap(c))
    llvm-svn: 32006
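The fold is valid because a byte swap is its own inverse, so swapping both sides of the comparison preserves equality and moves the swap onto the constant, where it can be evaluated at compile time. A minimal C sketch of that equivalence follows; `bswap32`, `cmp_before`, and `cmp_after` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 32-bit byte swap. */
static uint32_t bswap32(uint32_t x) {
  return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
         ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Before the fold: swap the variable, compare with the constant. */
static int cmp_before(uint32_t x, uint32_t c) { return bswap32(x) == c; }

/* After the fold: compare the variable with the swapped constant. */
static int cmp_after(uint32_t x, uint32_t c) { return x == bswap32(c); }

/* Both forms agree on every pair drawn from a small sample set. */
static void check_bswap_fold(void) {
  static const uint32_t s[] = {0u, 1u, 0x11223344u, 0x44332211u,
                               0xdeadbeefu, 0xffffffffu};
  size_t n = sizeof s / sizeof s[0];
  for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < n; ++j)
      assert(cmp_before(s[i], s[j]) == cmp_after(s[i], s[j]));
}
```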
* Join a split line. | Reid Spencer | 2006-11-29 | 1 | -2/+1
    llvm-svn: 31996
* Undo the last patch until 253.perlbmk passes with these changes. | Reid Spencer | 2006-11-28 | 1 | -3/+46
    llvm-svn: 31977
* Remove 4 FIXME's from the CAST patch now that the back end is correctly | Reid Spencer | 2006-11-28 | 1 | -46/+3
    producing code for "trunc to bool". This passes all tests on Linux.
    llvm-svn: 31963
* Fix PR1014 and InstCombine/2006-11-27-XorBug.ll. | Chris Lattner | 2006-11-27 | 1 | -10/+8
    llvm-svn: 31941
* For PR950: | Reid Spencer | 2006-11-27 | 21 | -786/+908
    The long awaited CAST patch. This introduces 12 new instructions into
    LLVM to replace the cast instruction. Corresponding changes throughout
    LLVM are provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000
    with the exception of 175.vpr which fails only on a slight floating
    point output difference.
    llvm-svn: 31931
* Remove #include <iostream> and use llvm_* streams instead. | Bill Wendling | 2006-11-26 | 3 | -40/+37
    llvm-svn: 31925
* Replace #include <iostream> with llvm_* streams. | Bill Wendling | 2006-11-26 | 6 | -69/+62
    llvm-svn: 31924
* Removed #include <iostream> and replaced with llvm_* streams. | Bill Wendling | 2006-11-26 | 11 | -115/+100
    llvm-svn: 31923
* Removed #include <iostream> and used the llvm_cerr/DOUT streams instead. | Bill Wendling | 2006-11-26 | 7 | -44/+34
    llvm-svn: 31922
* Update to new predicate simplifier VRP design. Fixes PR966 and PR967. | Nick Lewycky | 2006-11-22 | 1 | -574/+1105
    Remove predicate simplifier from default gcc3 pipeline. New design is
    too slow to enable by default. Add new testcases for problems
    encountered in development.
    llvm-svn: 31895
* This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way. | Chris Lattner | 2006-11-21 | 1 | -30/+1
    llvm-svn: 31889
* Do not convert massive blocks on phi nodes into select statements. Instead | Chris Lattner | 2006-11-18 | 1 | -0/+27
    only do these transformations if there are a small number of phi's.
    This speeds up Ptrdist/ks from 2.35s to 2.19s on my mac pro.
    llvm-svn: 31853
* If an indvar with a variable stride is used by the exit condition, go ahead | Chris Lattner | 2006-11-17 | 1 | -4/+0
    and handle it like constant stride vars. This fixes some bad codegen in
    variable stride cases. For example, it compiles this:

        void foo(int k, int i) {
          for (k=i+i; k <= 8192; k+=i)
            flags2[k] = 0;
        }

    to:

        LBB1_1: #bb.preheader
                movl %eax, %ecx
                addl %ecx, %ecx
                movl L_flags2$non_lazy_ptr, %edx
        LBB1_2: #bb
                movb $0, (%edx,%ecx)
                addl %eax, %ecx
                cmpl $8192, %ecx
                jle LBB1_2 #bb
        LBB1_5: #return
                ret

    or (if the array is local and we are in dynamic-nonpic or static mode):

        LBB3_2: #bb
                movb $0, _flags2(%ecx)
                addl %eax, %ecx
                cmpl $8192, %ecx
                jle LBB3_2 #bb

    and:

                lis r2, ha16(L_flags2$non_lazy_ptr)
                lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
                slwi r3, r4, 1
        LBB1_2: ;bb
                li r5, 0
                add r6, r4, r3
                stbx r5, r2, r3
                cmpwi cr0, r6, 8192
                bgt cr0, LBB1_5 ;return

    instead of:

                leal (%eax,%eax,2), %ecx
                movl %eax, %edx
                addl %edx, %edx
                addl L_flags2$non_lazy_ptr, %edx
                xorl %esi, %esi
        LBB1_2: #bb
                movb $0, (%edx,%esi)
                movl %eax, %edi
                addl %esi, %edi
                addl %ecx, %esi
                cmpl $8192, %esi
                jg LBB1_5 #return

    and:

                lis r2, ha16(L_flags2$non_lazy_ptr)
                lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
                mulli r3, r4, 3
                slwi r5, r4, 1
                li r6, 0
                add r2, r2, r5
        LBB1_2: ;bb
                li r5, 0
                add r7, r3, r6
                stbx r5, r2, r6
                add r6, r4, r6
                cmpwi cr0, r7, 8192
                ble cr0, LBB1_2 ;bb

    This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and
    implements LoopStrengthReduce/var_stride_used_by_compare.ll
    llvm-svn: 31809
* Fix a gcc 4.2 warning. | Chris Lattner | 2006-11-15 | 1 | -0/+2
    llvm-svn: 31751
* implement InstCombine/shift-simplify.ll by transforming: | Chris Lattner | 2006-11-14 | 1 | -3/+46
    (X >> Z) op (Y >> Z) -> (X op Y) >> Z
    for all shifts and all ops={and/or/xor}.
    llvm-svn: 31729
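The rewrite works because and, or, and xor act on each bit position independently, so shifting both operands by the same amount and then combining gives the same bits as combining first and shifting once. The C sketch below checks this exhaustively for 8-bit values. It covers only logical right shift on unsigned operands, although the commit applies to all shifts, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum bitop { OP_AND, OP_OR, OP_XOR };

static uint8_t apply(enum bitop op, uint8_t a, uint8_t b) {
  switch (op) {
  case OP_AND: return a & b;
  case OP_OR:  return a | b;
  default:     return a ^ b;
  }
}

/* (X >> Z) op (Y >> Z): two shifts, one bitwise op. */
static uint8_t shift_before(enum bitop op, uint8_t x, uint8_t y, unsigned z) {
  return apply(op, (uint8_t)(x >> z), (uint8_t)(y >> z));
}

/* (X op Y) >> Z: one bitwise op, one shift. */
static uint8_t shift_after(enum bitop op, uint8_t x, uint8_t y, unsigned z) {
  return (uint8_t)(apply(op, x, y) >> z);
}

/* Exhaustive check over all 8-bit operands and shift amounts 0..7. */
static void check_shift_simplify(void) {
  for (int op = OP_AND; op <= OP_XOR; ++op)
    for (unsigned x = 0; x < 256; ++x)
      for (unsigned y = 0; y < 256; ++y)
        for (unsigned z = 0; z < 8; ++z)
          assert(shift_before((enum bitop)op, (uint8_t)x, (uint8_t)y, z) ==
                 shift_after((enum bitop)op, (uint8_t)x, (uint8_t)y, z));
}
```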
* implement InstCombine/and-compare.ll:test1. This compiles: | Chris Lattner | 2006-11-14 | 1 | -0/+26
        typedef struct {
          unsigned prefix : 4;
          unsigned code : 4;
          unsigned unsigned_p : 4;
        } tree_common;

        int foo(tree_common *a, tree_common *b) {
          return a->code == b->code;
        }

    into:

        _foo:
                movl 4(%esp), %eax
                movl 8(%esp), %ecx
                movl (%eax), %eax
                xorl (%ecx), %eax
                # TRUNCATE movb %al, %al
                shrb $4, %al
                testb %al, %al
                sete %al
                movzbl %al, %eax
                ret

    instead of:

        _foo:
                movl 8(%esp), %eax
                movb (%eax), %al
                shrb $4, %al
                movl 4(%esp), %ecx
                movb (%ecx), %cl
                shrb $4, %cl
                cmpb %al, %cl
                sete %al
                movzbl %al, %eax
                ret

    saving one cycle by eliminating a shift.
    llvm-svn: 31727
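The saving comes from xor-ing the two words first: the `code` fields are equal exactly when the corresponding bits of the xor are all zero, so a single shift and a test against zero replace two shifts and a compare. The C sketch below models this on the low byte only, assuming (as the x86 output suggests, but the commit does not state) that `prefix` occupies bits 0-3 and `code` bits 4-7. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: extract each field with its own shift, then compare. */
static int code_eq_two_shifts(uint8_t a, uint8_t b) {
  return (a >> 4) == (b >> 4);
}

/* Transformed form: xor first, then one shift and a test against zero. */
static int code_eq_one_shift(uint8_t a, uint8_t b) {
  return ((uint8_t)(a ^ b) >> 4) == 0;
}

/* Both forms agree for every pair of low bytes. */
static void check_and_compare(void) {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      assert(code_eq_two_shifts((uint8_t)a, (uint8_t)b) ==
             code_eq_one_shift((uint8_t)a, (uint8_t)b));
}
```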
* Fix InstCombine/2006-11-10-ashr-miscompile.ll a miscompilation introduced | Chris Lattner | 2006-11-10 | 1 | -3/+3
    by the shr -> [al]shr patch. This was reduced from 176.gcc.
    llvm-svn: 31653
* second patch to fix PR992/993. | Chris Lattner | 2006-11-09 | 1 | -4/+17
    llvm-svn: 31610
* Minimal patch to fix PR992/PR993 | Chris Lattner | 2006-11-09 | 1 | -2/+1
    llvm-svn: 31608
* Teach ShrinkDemandedConstant how to handle X+C. This implements: | Chris Lattner | 2006-11-09 | 1 | -1/+100
    add.ll:test33, add.ll:test34, shift-sra.ll:test2
    llvm-svn: 31586
* reenable factoring of GEP expressions, being more precise about the | Chris Lattner | 2006-11-08 | 1 | -5/+10
    case where it is bad to do.
    llvm-svn: 31563
* make this code more efficient by not creating a phi node we are just going to | Chris Lattner | 2006-11-08 | 1 | -36/+33
    delete in the first place. This also makes it simpler.
    llvm-svn: 31562
* Remove redundant <cmath>. | Jim Laskey | 2006-11-08 | 1 | -1/+0
    llvm-svn: 31561
* disable this factoring optzn for GEPs for now, this severely pessimizes some | Chris Lattner | 2006-11-08 | 1 | -1/+1
    loops.
    llvm-svn: 31560
* For PR950: | Reid Spencer | 2006-11-08 | 5 | -215/+194
    This patch converts the old SHR instruction into two instructions,
    AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
    dependent on the sign of their operands.
    llvm-svn: 31542
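To illustrate the distinction, the C sketch below applies both kinds of right shift to the same 32-bit pattern: the logical form fills the vacated high bits with zeros, the arithmetic form fills them with copies of the top bit. Treating the operand as a raw bit pattern mirrors the idea that the choice of instruction, not the operand's type, decides the behavior. `lshr32` and `ashr32` are hypothetical helpers, and shift amounts are assumed to be in the range 0..31.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right: vacated high bits become zero. */
static uint32_t lshr32(uint32_t bits, unsigned c) {
  return bits >> c;
}

/* Arithmetic shift right on a raw bit pattern: vacated high bits copy bit 31.
   Written without signed shifts so the result does not depend on
   implementation-defined behavior. */
static uint32_t ashr32(uint32_t bits, unsigned c) {
  if (bits & 0x80000000u)
    return (uint32_t)~((uint32_t)~bits >> c);
  return bits >> c;
}

/* When bit 31 is clear, the two shifts are indistinguishable. */
static void check_shifts(void) {
  for (uint32_t v = 0; v < 0x1000; ++v)
    for (unsigned c = 0; c < 32; ++c)
      assert(lshr32(v, c) == ashr32(v, c));
}
```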
* scalarrepl should not split the two elements of the vsiidx array: | Chris Lattner | 2006-11-07 | 1 | -3/+7
        int func(vFloat v0, vFloat v1) {
          int ii;
          vSInt32 vsiidx[2];
          vsiidx[0] = _mm_cvttps_epi32(v0);
          vsiidx[1] = _mm_cvttps_epi32(v1);
          ii = ((int *) vsiidx)[4];
          return ii;
        }

    This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll
    llvm-svn: 31524
* Unbreak VC++ build. | Jeff Cohen | 2006-11-05 | 2 | -6/+6
    llvm-svn: 31464
* Remove commented line from earlier debugging. | Nick Lewycky | 2006-11-05 | 1 | -2/+0
    llvm-svn: 31460
* The wrong parameter was being tested to determine i32 vs i64 | Andrew Lenharth | 2006-11-03 | 1 | -1/+1
    llvm-svn: 31431
* remove dead code | Chris Lattner | 2006-11-03 | 1 | -13/+0
    llvm-svn: 31398
* For PR786: | Reid Spencer | 2006-11-02 | 22 | -47/+22
    Turn on -Wunused and -Wno-unused-parameter. Clean up most of the
    resulting fall out by removing unused variables. Remaining warnings have
    to do with unused functions (I didn't want to delete code without
    review) and unused variables in generated code. Maintainers should clean
    up the remaining issues when they see them. All changes pass DejaGnu
    tests and Olden.
    llvm-svn: 31380
* For PR950: | Reid Spencer | 2006-11-02 | 3 | -119/+134
    Replace the REM instruction with UREM, SREM and FREM.
    llvm-svn: 31369
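Splitting the remainder operation matters because the same bit pattern yields different remainders depending on whether it is read as signed or unsigned. The C sketch below shows this for the integer cases only, using C's `%` operator as a stand-in. The floating-point case is not shown, and `urem32` and `srem32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned remainder: operands are read as non-negative values. */
static uint32_t urem32(uint32_t a, uint32_t b) { return a % b; }

/* Signed remainder: in C99 and later the result takes the dividend's sign. */
static int32_t srem32(int32_t a, int32_t b) { return a % b; }

/* The 32-bit pattern for -7 gives different remainders modulo 3. */
static void check_rem(void) {
  assert(srem32(-7, 3) == -1);
  assert(urem32((uint32_t)-7, 3u) == 0u); /* 4294967289 is a multiple of 3 */
}
```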
* There can be more than one PHINode at the start of the block. | Devang Patel | 2006-11-01 | 1 | -5/+4
    llvm-svn: 31362
* Handle PHINode with only one incoming value. | Devang Patel | 2006-11-01 | 1 | -5/+9
    This fixes http://llvm.org/bugs/show_bug.cgi?id=979
    llvm-svn: 31358
* Fix GlobalOpt/2006-11-01-ShrinkGlobalPhiCrash.ll and McGill/chomp | Chris Lattner | 2006-11-01 | 1 | -8/+14
    llvm-svn: 31352
* Factor gep instructions through phi nodes. | Chris Lattner | 2006-11-01 | 1 | -10/+39
    llvm-svn: 31346
* Turn a phi of many loads into a phi of the address and a single load of the | Chris Lattner | 2006-11-01 | 1 | -41/+30
    result. This can significantly shrink code and exposes identities more
    aggressively.
    llvm-svn: 31344
* Fix a bug in the previous patch | Chris Lattner | 2006-11-01 | 1 | -3/+6
    llvm-svn: 31342
* Fold things like "phi [add (a,b), add(c,d)]" into two phi's and one add. | Chris Lattner | 2006-11-01 | 1 | -3/+57
    This triggers thousands of times on multisource.
    llvm-svn: 31341
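A rough source-level analogy for this fold, with a conditional expression standing in for the phi node: instead of computing an add on each incoming path and merging the results, merge the operands and add once. The C sketch below is only that analogy, not the IR transformation itself, and its function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Before: one add per incoming path, results merged afterwards. */
static uint32_t two_adds(int cond, uint32_t a, uint32_t b,
                         uint32_t c, uint32_t d) {
  return cond ? a + b : c + d;
}

/* After: operands merged first (two "phis"), followed by a single add. */
static uint32_t one_add(int cond, uint32_t a, uint32_t b,
                        uint32_t c, uint32_t d) {
  uint32_t lhs = cond ? a : c;
  uint32_t rhs = cond ? b : d;
  return lhs + rhs;
}

/* Both forms agree on either path, including on unsigned wraparound. */
static void check_phi_fold(void) {
  for (int cond = 0; cond <= 1; ++cond)
    for (uint32_t v = 0; v < 64; ++v)
      assert(two_adds(cond, v, 0xfffffff0u, v * 3u, 7u) ==
             one_add(cond, v, 0xfffffff0u, v * 3u, 7u));
}
```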
* generalize the fix for PR977 to also fix | Chris Lattner | 2006-10-31 | 1 | -28/+26
    Transforms/LCSSA/2006-10-31-UnreachableBlock-2.ll
    llvm-svn: 31317
* Fix PR977 and Transforms/LCSSA/2006-10-31-UnreachableBlock.ll | Chris Lattner | 2006-10-31 | 1 | -1/+8
    llvm-svn: 31315
* Fix SimplifyCFG/2006-10-29-InvokeCrash.ll, a crash compiling QT. | Chris Lattner | 2006-10-29 | 1 | -1/+1
    llvm-svn: 31284
* add option to isCriticalEdge | Chris Lattner | 2006-10-28 | 1 | -3/+12
    llvm-svn: 31258
* break edges more intelligently | Chris Lattner | 2006-10-28 | 1 | -2/+3
    llvm-svn: 31257
* Expose a smarter way to break critical edges. | Chris Lattner | 2006-10-28 | 1 | -5/+24
    llvm-svn: 31256
* SplitCriticalEdge checks to see if an edge is critical, don't check twice | Chris Lattner | 2006-10-28 | 1 | -2/+1
    llvm-svn: 31255
* prepare for a change I'm about to make | Chris Lattner | 2006-10-28 | 1 | -0/+6
    llvm-svn: 31248