path: root/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
Commit message (Collapse)AuthorAgeFilesLines
* rename Type::isIntegral to Type::isInteger, eliminating the old Type::isInteger.Chris Lattner2007-01-151-1/+1
    rename Type::getIntegralTypeMask to Type::getIntegerTypeMask. This makes naming much more consistent. For example, there are now no longer any instances of IntegerType that are not considered isInteger! :)
    llvm-svn: 33225
* Eliminate calls to isInteger, generalizing code and tightening checks as needed.Chris Lattner2007-01-151-1/+1
    llvm-svn: 33218
* For PR1097:Reid Spencer2007-01-081-1/+1
    Enable complex addressing modes on 64-bit platforms involving two induction variables by keeping a size and scale in 64-bits not 32. Patch by Dan Gohman.
    llvm-svn: 33011
* no need to worry about int vs uint any more.Chris Lattner2007-01-061-2/+1
    llvm-svn: 32946
* For PR950:Reid Spencer2006-12-311-9/+3
    This patch replaces signed integer types with signless ones:
    1. [US]Byte -> Int8
    2. [U]Short -> Int16
    3. [U]Int -> Int32
    4. [U]Long -> Int64.
    5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion and other methods related to signedness. In a few places this warranted identifying the signedness information from other sources.
    llvm-svn: 32785
* For PR950:Reid Spencer2006-12-231-10/+7
    This patch removes the SetCC instructions and replaces them with the ICmp and FCmp instructions. The SetCondInst instruction has been removed and been replaced with ICmpInst and FCmpInst.
    llvm-svn: 32751
* Switch over Transforms/Scalar to use the STATISTIC macro. For each statisticChris Lattner2006-12-191-4/+4
    converted, we lose a static initializer. This also allows GCC to emit warnings about unused statistics.
    llvm-svn: 32690
* Change the interface to SCEVExpander::InsertCastOfTo to take a cast opcodeReid Spencer2006-12-131-12/+21
    so the decision of which opcode to use is pushed upward to the caller. Adjust the callers to pass the expected opcode.
    llvm-svn: 32535
* Change inferred getCast into specific getCast. Passes all tests.Reid Spencer2006-12-121-2/+7
    llvm-svn: 32469
* Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, llvm_null, areBill Wendling2006-12-071-4/+4
    now cerr, cout, and NullStream resp.
    llvm-svn: 32298
* Detemplatize the Statistic class. The only type it is instantiated withChris Lattner2006-12-061-3/+3
    is 'unsigned'.
    llvm-svn: 32279
* For PR950:Reid Spencer2006-11-271-11/+12
    The long awaited CAST patch. This introduces 12 new instructions into LLVM to replace the cast instruction. Corresponding changes throughout LLVM are provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the exception of 175.vpr which fails only on a slight floating point output difference.
    llvm-svn: 31931
* Removed #include <iostream> and replaced with llvm_* streams.Bill Wendling2006-11-261-20/+19
    llvm-svn: 31923
* If an indvar with a variable stride is used by the exit condition, go aheadChris Lattner2006-11-171-4/+0
    and handle it like constant stride vars. This fixes some bad codegen in variable stride cases. For example, it compiles this:

        void foo(int k, int i) {
          for (k=i+i; k <= 8192; k+=i)
            flags2[k] = 0;
        }

    to:

        LBB1_1: #bb.preheader
                movl %eax, %ecx
                addl %ecx, %ecx
                movl L_flags2$non_lazy_ptr, %edx
        LBB1_2: #bb
                movb $0, (%edx,%ecx)
                addl %eax, %ecx
                cmpl $8192, %ecx
                jle LBB1_2 #bb
        LBB1_5: #return
                ret

    or (if the array is local and we are in dynamic-nonpic or static mode):

        LBB3_2: #bb
                movb $0, _flags2(%ecx)
                addl %eax, %ecx
                cmpl $8192, %ecx
                jle LBB3_2 #bb

    and:

                lis r2, ha16(L_flags2$non_lazy_ptr)
                lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
                slwi r3, r4, 1
        LBB1_2: ;bb
                li r5, 0
                add r6, r4, r3
                stbx r5, r2, r3
                cmpwi cr0, r6, 8192
                bgt cr0, LBB1_5 ;return

    instead of:

                leal (%eax,%eax,2), %ecx
                movl %eax, %edx
                addl %edx, %edx
                addl L_flags2$non_lazy_ptr, %edx
                xorl %esi, %esi
        LBB1_2: #bb
                movb $0, (%edx,%esi)
                movl %eax, %edi
                addl %esi, %edi
                addl %ecx, %esi
                cmpl $8192, %esi
                jg LBB1_5 #return

    and:

                lis r2, ha16(L_flags2$non_lazy_ptr)
                lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
                mulli r3, r4, 3
                slwi r5, r4, 1
                li r6, 0
                add r2, r2, r5
        LBB1_2: ;bb
                li r5, 0
                add r7, r3, r6
                stbx r5, r2, r6
                add r6, r4, r6
                cmpwi cr0, r7, 8192
                ble cr0, LBB1_2 ;bb

    This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and implements LoopStrengthReduce/var_stride_used_by_compare.ll
    llvm-svn: 31809
* For PR786:Reid Spencer2006-11-021-1/+1
    Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting fall out by removing unused variables. Remaining warnings have to do with unused functions (I didn't want to delete code without review) and unused variables in generated code. Maintainers should clean up the remaining issues when they see them. All changes pass DejaGnu tests and Olden.
    llvm-svn: 31380
* break edges more intelligentlyChris Lattner2006-10-281-2/+3
    llvm-svn: 31257
* prepare for a change I'm about to makeChris Lattner2006-10-281-0/+6
    llvm-svn: 31248
* For PR950:Reid Spencer2006-10-201-3/+3
    This patch implements the first increment for the Signless Types feature. All changes pertain to removing the ConstantSInt and ConstantUInt classes in favor of just using ConstantInt.
    llvm-svn: 31063
* eliminate RegisterOpt. It does the same thing as RegisterPass.Chris Lattner2006-08-271-2/+1
    llvm-svn: 29925
* s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|Chris Lattner2006-08-271-1/+1
    llvm-svn: 29911
* Changes:Chris Lattner2006-08-031-17/+46
    1. Update an obsolete comment.
    2. Make the sorting by base an explicit (though still N^2) step, so that the code is more clear on what it is doing.
    3. Partition uses so that uses inside the loop are handled before uses outside the loop.

    Note that none of these changes currently changes the code inserted by LSR, but they are a stepping stone to getting there. This code is the result of some crazy pair programming with Nate. :)
    llvm-svn: 29493
* Only reuse a previous IV if it would not require a type conversion.Evan Cheng2006-07-181-14/+17
    llvm-svn: 29186
* Use hidden visibility to make symbols in an anonymous namespace getChris Lattner2006-06-281-1/+2
    dropped. This shrinks libllvmgcc.dylib another 67K
    llvm-svn: 28975
* RewriteExpr, either the new PHI node of induction variable or theEvan Cheng2006-06-091-0/+3
    post-increment value, should be first cast to the appropriate type (to the type of the common expr). Otherwise, the rewrite of a use based on (common + iv) may end up with an incorrect type.
    llvm-svn: 28735
* Get rid of a signed/unsigned compare warning.Reid Spencer2006-04-121-1/+1
    llvm-svn: 27625
* Fix spelloChris Lattner2006-03-241-2/+2
    llvm-svn: 27052
* silence a bogus gcc warningChris Lattner2006-03-221-2/+2
    llvm-svn: 26953
* - Fixed a bogus if condition.Evan Cheng2006-03-181-19/+25
    - Added more debugging info.
    - Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride.
    llvm-svn: 26841
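The negative-stride case above comes down to an exact signed division. The following is a minimal sketch of that arithmetic only; `reuseFactor` is a hypothetical helper written for illustration and is not taken from LoopStrengthReduce.cpp, which may apply further target-specific checks.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (an assumption, not LSR's actual code): returns the
// factor k such that newStride == k * existingStride, or 0 when the existing
// IV cannot be reused by plain multiplication.
int64_t reuseFactor(int64_t newStride, int64_t existingStride) {
  if (existingStride == 0)
    return 0;  // a zero stride cannot generate any other stride
  if (existingStride == -1 &&
      newStride == std::numeric_limits<int64_t>::min())
    return 0;  // avoid the one overflowing signed division
  if (newStride % existingStride != 0)
    return 0;  // not an exact multiple
  return newStride / existingStride;  // signed division keeps the sign right
}
```

With signed arithmetic, `reuseFactor(-4, -2)` yields 2, matching the "-4 stride == 2 * iv of -2 stride" example in the message.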
* Sort StrideOrder so we can process the smallest strides first. This allowsEvan Cheng2006-03-181-0/+27
    for more IV reuses.
    llvm-svn: 26837
* Allow users of iv / stride to be rewritten with expression that is a multiplyEvan Cheng2006-03-171-41/+83
    of a smaller stride even if they have a common loop invariant expression part.
    llvm-svn: 26828
* For each loop, keep track of all the IV expressions inserted indexed byEvan Cheng2006-03-161-36/+115
    stride. For a set of uses of the IV of a stride which is a multiple of another stride, do not insert a new IV expression. Rather, reuse the previous IV and rewrite the uses as uses of IV expression multiplied by the factor. e.g.

        x = 0 ...; x ++
        y = 0 ...; y += 4

    then use of y can be rewritten as use of 4*x for x86.
    llvm-svn: 26803
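The x/y example in this message can be checked with a small source-level sketch. The function names are illustrative assumptions; the real transformation operates on LLVM IR, not on C++ loops.

```cpp
#include <cstdint>
#include <vector>

// Before: two induction variables, x with stride 1 and y with stride 4.
std::vector<int64_t> withTwoIVs(int n) {
  std::vector<int64_t> uses;
  int64_t y = 0;
  for (int64_t x = 0; x < n; ++x, y += 4)
    uses.push_back(y);  // a use of the stride-4 IV
  return uses;
}

// After: only x remains; each use of y is rewritten as 4*x, which x86 can
// often fold into a scaled addressing mode.
std::vector<int64_t> withReusedIV(int n) {
  std::vector<int64_t> uses;
  for (int64_t x = 0; x < n; ++x)
    uses.push_back(4 * x);  // same value y would have had
  return uses;
}
```

Both functions produce the same sequence of values, which is the property the rewrite relies on.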
* Added target lowering hooks which LSR consults to make more intelligentEvan Cheng2006-03-131-25/+33
    transformation decisions.
    llvm-svn: 26738
* Use SCEVExpander::InsertCastOfTo instead of our own code. This reducesChris Lattner2006-02-041-18/+1
    #LLVM LOC, and auto-cse's cast instructions.
    llvm-svn: 25974
* Fix two significant bugs in LSR:Chris Lattner2006-02-041-14/+75
    1. When rewriting code in outer loops, sometimes we would insert code into inner loops that is invariant in that loop.
    2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions.

    This is a performance neutral change.
    llvm-svn: 25964
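The second item is ordinary distribution of a constant over a sum. A minimal sketch follows, using a hypothetical `Affine` struct as a stand-in for an expression of the form constant + coeff*x; LSR itself works on SCEV expressions, which this does not model.

```cpp
#include <cstdint>

// Hypothetical stand-in for an expression of the form: constant + coeff * x.
struct Affine {
  int64_t constant;
  int64_t coeff;
};

// Multiply the whole expression by k by distributing over both terms,
// e.g. 4 * (2 + 1*x) becomes 8 + 4*x.
Affine scale(Affine e, int64_t k) {
  return Affine{e.constant * k, e.coeff * k};
}

// Evaluate the expression at a concrete x, to check the identity.
int64_t evaluate(Affine e, int64_t x) {
  return e.constant + e.coeff * x;
}
```

Separating the constant part (8) from the scaled variable part (4*x) is what lets the constant be treated as an immediate offset.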
* Make iostream #inclusion explicitChris Lattner2006-01-221-0/+1
    llvm-svn: 25514
* Switch these to using ETForest instead of DominatorSet to compute itself.Chris Lattner2006-01-111-7/+8
    Patch written by Daniel Berlin!
    llvm-svn: 25202
* getRawValue zero extends for unsigned values, use getsextvalue so that weChris Lattner2005-12-051-3/+3
    know that small negative values fit into the immediate field of addressing modes.
    llvm-svn: 24608
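The difference between the two extensions is easy to see on a small negative constant. This is a sketch under assumptions: the helper names are hypothetical, and the 16-bit signed immediate range is chosen only as an example of an addressing-mode field.

```cpp
#include <cstdint>

// Zero extension: the 32-bit pattern is read as a large positive number.
int64_t zeroExtend32(uint32_t raw) {
  return static_cast<int64_t>(raw);
}

// Sign extension: the top bit of the 32-bit pattern is treated as the sign.
int64_t signExtend32(uint32_t raw) {
  int64_t wide = static_cast<int64_t>(raw);
  return (raw & 0x80000000u) ? wide - (int64_t{1} << 32) : wide;
}

// Assumed example field width: a 16-bit signed immediate.
bool fitsSigned16(int64_t v) {
  return v >= -32768 && v <= 32767;
}
```

For the bit pattern of -4 (`0xFFFFFFFC`), the zero-extended value is 4294967292 and fails the range check, while the sign-extended value is -4 and passes.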
* My previous patch was too conservative. Reject FP and void types, but doChris Lattner2005-10-211-1/+2
    allow pointer types.
    llvm-svn: 23859
* Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from anChris Lattner2005-10-201-1/+1
    inner loop like this:

        LBB_RateConvertMono8AltiVec_2: ; no_exit
                lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
                lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
                fmr f3, f3
                fadd f0, f2, f0
                fadd f3, f0, f3
                fcmpu cr0, f3, f1
                bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit

    to an inner loop like this:

        LBB_RateConvertMono8AltiVec_1: ; no_exit
                fsub f2, f2, f1
                fcmpu cr0, f2, f1
                fmr f0, f2
                bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit

    Doh! good catch!
    llvm-svn: 23838
* Fix (hopefully the last) issue where LSR is nondeterministic. When pullingChris Lattner2005-10-111-8/+14
    out CSE's of base expressions it could build a result whose order was nondet.
    llvm-svn: 23698
* Fix another problem where LSR was being nondeterministic. Also remove elementsChris Lattner2005-10-111-10/+16
    from the end of a vector instead of the beginning
    llvm-svn: 23697
* Fix another lsr-is-nondeterministic caseChris Lattner2005-10-111-6/+10
    llvm-svn: 23695
* Hrm, you didn't see this.Chris Lattner2005-10-091-3/+0
    llvm-svn: 23673
* Fix a source of non-determinism in the backend: the order of processingChris Lattner2005-10-091-6/+25
    IV strides depended on the pointer order of the strides in memory. Non-determinism is bad.
    llvm-svn: 23672
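A common way to avoid this kind of pointer-order dependence is to record keys in a side vector as they are first seen and iterate that vector instead of the pointer-keyed map. The sketch below illustrates that general technique; `Stride`, `StrideTable`, and `demoProcessingOrder` are hypothetical names, and this is not a reproduction of the actual fix in the commit.

```cpp
#include <map>
#include <vector>

struct Stride {
  int id;  // stand-in payload for illustration
};

class StrideTable {
  std::map<const Stride*, int> useCounts;  // iteration order depends on addresses
  std::vector<const Stride*> order;        // first-seen order, address independent

public:
  void addUse(const Stride* s) {
    if (useCounts.find(s) == useCounts.end())
      order.push_back(s);  // remember when each key first appeared
    ++useCounts[s];
  }

  // Walk the side vector, never the map, so results do not depend on
  // where the Stride objects happen to live in memory.
  std::vector<int> idsInProcessingOrder() const {
    std::vector<int> ids;
    for (const Stride* s : order)
      ids.push_back(s->id);
    return ids;
  }
};

// Inserts strides in the order 2, 0, 2, 1 and reports the processing order.
std::vector<int> demoProcessingOrder() {
  Stride strides[3] = {{0}, {1}, {2}};
  StrideTable table;
  table.addUse(&strides[2]);
  table.addUse(&strides[0]);
  table.addUse(&strides[2]);
  table.addUse(&strides[1]);
  return table.idsInProcessingOrder();
}
```

Iterating the map directly would visit the array elements in address order (0, 1, 2); the side vector preserves the order in which strides were first encountered.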
* Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. InChris Lattner2005-10-031-6/+38
    particular, it should realize that phi's use their values in the pred block not the phi block itself. This change turns our em3d loop from this:

        _test:
                cmpwi cr0, r4, 0
                bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
        LBB_test_1: ; entry.loopexit_crit_edge
                li r2, 0
                b LBB_test_6 ; loopexit
        LBB_test_2: ; entry.no_exit_crit_edge
                li r6, 0
        LBB_test_3: ; no_exit
                or r2, r6, r6
                lwz r6, 0(r3)
                cmpw cr0, r6, r5
                beq cr0, LBB_test_6 ; loopexit
        LBB_test_4: ; endif
                addi r3, r3, 4
                addi r6, r2, 1
                cmpw cr0, r6, r4
                blt cr0, LBB_test_3 ; no_exit
        LBB_test_5: ; endif.loopexit.loopexit_crit_edge
                addi r3, r2, 1
                blr
        LBB_test_6: ; loopexit
                or r3, r2, r2
                blr

    into:

        _test:
                cmpwi cr0, r4, 0
                bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
        LBB_test_1: ; entry.loopexit_crit_edge
                li r2, 0
                b LBB_test_5 ; loopexit
        LBB_test_2: ; entry.no_exit_crit_edge
                li r6, 0
        LBB_test_3: ; no_exit
                lwz r2, 0(r3)
                cmpw cr0, r2, r5
                or r2, r6, r6
                beq cr0, LBB_test_5 ; loopexit
        LBB_test_4: ; endif
                addi r3, r3, 4
                addi r6, r6, 1
                cmpw cr0, r6, r4
                or r2, r6, r6
                blt cr0, LBB_test_3 ; no_exit
        LBB_test_5: ; loopexit
                or r3, r2, r2
                blr

    Unfortunately, this is actually worse code, because the register coalescer is getting confused somehow. If it were doing its job right, it could turn the code into this:

        _test:
                cmpwi cr0, r4, 0
                bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
        LBB_test_1: ; entry.loopexit_crit_edge
                li r6, 0
                b LBB_test_5 ; loopexit
        LBB_test_2: ; entry.no_exit_crit_edge
                li r6, 0
        LBB_test_3: ; no_exit
                lwz r2, 0(r3)
                cmpw cr0, r2, r5
                beq cr0, LBB_test_5 ; loopexit
        LBB_test_4: ; endif
                addi r3, r3, 4
                addi r6, r6, 1
                cmpw cr0, r6, r4
                blt cr0, LBB_test_3 ; no_exit
        LBB_test_5: ; loopexit
                or r3, r6, r6
                blr

    ... which I'll work on next. :)
    llvm-svn: 23604
* Refactor some code into a functionChris Lattner2005-10-031-7/+23
    llvm-svn: 23603
* This break is bogus and I have no idea why it was there. Basically it preventsChris Lattner2005-10-031-1/+0
    memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's):

                li r6, 0
                or r7, r6, r6
        LBB_test_3: ; no_exit
                lwz r2, 0(r3)
                cmpw cr0, r2, r5
                or r2, r7, r7
                beq cr0, LBB_test_5 ; loopexit
        LBB_test_4: ; endif
                addi r2, r7, 1
                addi r7, r7, 1
                addi r3, r3, 4
                addi r6, r6, 1
                cmpw cr0, r6, r4
                blt cr0, LBB_test_3 ; no_exit

    Now we get:

                li r6, 0
        LBB_test_3: ; no_exit
                or r2, r6, r6
                lwz r6, 0(r3)
                cmpw cr0, r6, r5
                beq cr0, LBB_test_6 ; loopexit
        LBB_test_4: ; endif
                addi r3, r3, 4
                addi r6, r2, 1
                cmpw cr0, r6, r4
                blt cr0, LBB_test_3 ; no_exit

    this was noticed in em3d.
    llvm-svn: 23602
* when checking if we should move a split edge block outside of a loop,Chris Lattner2005-10-031-7/+6
    check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop.
    llvm-svn: 23601
* Make the pass name simplerChris Lattner2005-09-271-1/+1
    llvm-svn: 23476
* Fix an issue where LSR would miss rewriting a use of an IV expression by a ↵Chris Lattner2005-09-131-4/+8
    PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s.
    llvm-svn: 23327