path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
...
* Fix TryToShrinkGlobalToBoolean in GlobalOpt, so that it does not discard ↵Joey Gouly2013-01-101-11/+16
| | | | | | address spaces. llvm-svn: 172051
* ARM Cost model: Use the size of vector registers and widest vectorizable ↵Nadav Rotem2013-01-093-2/+62
| | | | | | instruction to determine the max vectorization factor. llvm-svn: 172010
* LICM: Hoist insertvalue/extractvalue out of loops.Benjamin Kramer2013-01-091-0/+26
| | | | | | Fixes PR14854. llvm-svn: 171984
* ARM Cost Model: Add a basic vectorization unrolling test.Nadav Rotem2013-01-091-3/+10
| | | | llvm-svn: 171931
* Remove the -licm pass from the loop vectorizer test because the loop ↵Nadav Rotem2013-01-0923-25/+25
| | | | | | vectorizer does it now. llvm-svn: 171930
* Cost Model: Move the 'max unroll factor' variable to the TTI and add initial ↵Nadav Rotem2013-01-093-2/+31
| | | | | | Cost Model support on ARM. llvm-svn: 171928
* Consider expression "0.0 - X" as the negation of X ifShuxin Yang2013-01-091-2/+13
| | | | | | | - this expression is explicitly marked no-signed-zero, or - no-signed-zero of this expression can be derived from some context. llvm-svn: 171922
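The no-signed-zero condition above matters because `0.0 - X` and `-X` differ only when X is positive zero. A minimal sketch of that corner case, assuming default IEEE round-to-nearest arithmetic and no fast-math compiler flags; the function names are hypothetical and are not taken from the LLVM sources:

```cpp
#include <cassert>
#include <cmath>

// Computes 0.0 - x, the expression form the commit message describes.
double zero_minus(double x) { return 0.0 - x; }

// Computes the IEEE negation of x, which only flips the sign bit.
double negate(double x) { return -x; }

// True when both forms agree in value and in sign for this input.
bool forms_agree(double x) {
  double a = zero_minus(x);
  double b = negate(x);
  return a == b && std::signbit(a) == std::signbit(b);
}

// Hypothetical self-check of the interesting inputs.
void check_negation_forms() {
  assert(forms_agree(1.5));   // ordinary values agree
  assert(forms_agree(-0.0));  // 0.0 - (-0.0) and -(-0.0) are both +0.0
  assert(!forms_agree(0.0));  // 0.0 - 0.0 is +0.0, but -(0.0) is -0.0
}
```

Only the `+0.0` input separates the two forms, which is why the rewrite is safe once signed zeros are known not to matter.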
* Make sure we don't emit instructions before a landingpad instruction.Bill Wendling2013-01-082-0/+89
| | | | | | PR14782 llvm-svn: 171846
* LoopVectorizer: Add support for floating point reductionsNadav Rotem2013-01-071-0/+29
| | | | llvm-svn: 171812
* LoopVectorizer: When we vectorize and widen loops we process many elements ↵Nadav Rotem2013-01-071-0/+50
| | | | | | | | | at once. This is a good thing, except for small loops. On small loops, the post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798
* This change is to implement the following rules:Shuxin Yang2013-01-071-0/+85
| | | | | | | | | | | o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal) o. X/C1 * C2 => X/(C1/C2) (if C2/C1 is either special FP or denormal, but C1/C2 is a normal FP) Let MDC denote multiplication or division with one & only one operand being a constant o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value) llvm-svn: 171793
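The first rule above can be sketched in plain C++. This is an illustration only, not the InstCombine code: the function names are hypothetical, and `std::isnormal` is my stand-in for the "neither special FP nor denormal" guard. The two forms can round differently when the constants are not exactly representable, so I would assume the rewrite applies only under relaxed floating-point rules.

```cpp
#include <cassert>
#include <cmath>

// Original form from the commit message: (X / C1) * C2.
double div_then_mul(double x, double c1, double c2) { return (x / c1) * c2; }

// Rewritten form: X * (C2 / C1), with the constant quotient folded once.
double mul_by_folded(double x, double c1, double c2) {
  double folded = c2 / c1;
  // Guard modelled on the commit message: reject zero, infinity, NaN and
  // denormal results of the constant fold (assumed mapping to isnormal).
  assert(std::isnormal(folded));
  return x * folded;
}

// Hypothetical self-check with power-of-two constants, where both forms
// are exact and therefore compare equal.
void check_fold_rule() {
  assert(div_then_mul(3.0, 2.0, 8.0) == 12.0);
  assert(mul_by_folded(3.0, 2.0, 8.0) == 12.0);
}
```

The rewritten form replaces a division and a multiplication at run time with a single multiplication by a precomputed constant.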
* When code size is the priority (Oz, MinSize attribute), help llvmQuentin Colombet2013-01-071-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | turn code like this: if (foo) free(foo) into this: free(foo) Move a call to free from basic block FB into FB's predecessor, P, when the path from P to FB is taken only if the argument of free is not equal to NULL. Some restrictions apply to P and FB to be sure that this code motion is profitable. Namely: 1. FB must have only one predecessor P. 2. FB must contain only the call to free plus an unconditional branch to S. 3. P's successors are FB and S. Because of 1., we will not increase the code size when moving the call to free from FB to P. Because of 2., FB will be empty after the move. Because of 2. and 3., P's branch instruction becomes useless, as does FB (simplifycfg will do the job). llvm-svn: 171762
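The source-level effect of this change can be sketched as follows. The rewrite relies on `free` accepting a null pointer and doing nothing, which the C standard guarantees; the function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Before: a null check guards the call to free.
void release_checked(void *p) {
  if (p)
    std::free(p);
}

// After: free(NULL) is a no-op, so the check and its branch can be dropped.
void release_unchecked(void *p) { std::free(p); }

// Hypothetical self-check: both forms accept null and real allocations.
void check_release_forms() {
  release_checked(nullptr);
  release_unchecked(nullptr);
  void *a = std::malloc(16);
  void *b = std::malloc(16);
  assert(a != nullptr && b != nullptr);
  release_checked(a);
  release_unchecked(b);
}
```

The unchecked form trades a compare and branch for an unconditional call, which is why the commit restricts it to the case where code size is the priority.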
* Switch the SCEV expander and LoopStrengthReduce to useChandler Carruth2013-01-074-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behavior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to *always* expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735
* Fix a mistaken commit that included some debugging code.David Tweed2013-01-071-1/+1
| | | | llvm-svn: 171734
* There was a switch fall-through in the parser for textual LLVM that causedDavid Tweed2013-01-072-3/+3
| | | | | | | | bogus comparison operands to default to eq/oeq. Fix that, fix a couple of tests that accidentally passed and test for bogus comparison operators explicitly. llvm-svn: 171733
* Switch BBVectorize to directly depend on having a TTI analysis.Chandler Carruth2013-01-0710-15/+15
| | | | | | | | | | | | | This could be simplified further, but Hal has a specific feature for ignoring TTI, and so I preserved that. Also, I needed to use it because a number of tests fail when switching from a null TTI to the NoTTI nonce implementation. That seems suspicious to me and so may be something that you need to look into, Hal. I worked around it by preserving the old behavior for these tests with the flag that ignores all target info. llvm-svn: 171722
* Fix a crash in LSR replaceCongruentIVs.Andrew Trick2013-01-061-0/+44
| | | | | | | Indirect branch in the preheader crashes replaceCongruentIVs. Fixes rdar://12910141. llvm-svn: 171653
* Fix a typo. Remove the duplicated test.Nadav Rotem2013-01-051-25/+0
| | | | llvm-svn: 171584
* LoopVectorize: Non commutative operators can be used as reduction variables ↵Nadav Rotem2013-01-052-3/+31
| | | | | | | | as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583
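A reduction of the form `x = x - A[i]` keeps the accumulator on the left-hand side, so it equals the initial value minus the sum of all elements, and that sum can be split across lanes. A minimal two-lane sketch with integer elements follows; the function names are hypothetical and this is not the code the vectorizer emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: the accumulator stays on the left-hand side.
int reduce_sub_lhs(const std::vector<int> &a, int x) {
  for (int v : a)
    x = x - v;
  return x;
}

// Two-lane sketch: each lane sums every other element, and the partial
// sums are subtracted from the initial value once at the end.
int reduce_sub_lhs_two_lanes(const std::vector<int> &a, int x) {
  int lane[2] = {0, 0};
  for (std::size_t i = 0; i < a.size(); ++i)
    lane[i % 2] += a[i];
  return x - (lane[0] + lane[1]);
}

// Hypothetical self-check: both versions agree on a small input.
void check_lhs_reduction() {
  std::vector<int> a = {1, 2, 3, 4};
  assert(reduce_sub_lhs(a, 100) == 90);
  assert(reduce_sub_lhs_two_lanes(a, 100) == 90);
}
```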
* Force a fixed unroll count on the target independent tests.Nadav Rotem2013-01-0527-27/+27
| | | | | | This should fix clang-native-arm-cortex-a9. Thanks Renato. llvm-svn: 171582
* tabs-to-spacesAndrew Trick2013-01-041-44/+43
| | | | llvm-svn: 171550
* Do not vectorize loops with subtraction reductionsPaul Redmond2013-01-042-1/+51
| | | | | | | | | Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537
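The problem with `x = A[i] - x` can be seen by comparing the scalar loop with a naive lane-by-lane split. The sketch below is my own illustration of why the accumulator position matters; it uses hypothetical function names and does not reproduce the code the vectorizer generated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: the accumulator is on the right-hand side, so the
// result is an alternating sum of the elements.
int reduce_sub_rhs(const std::vector<int> &a, int x) {
  for (int v : a)
    x = v - x;
  return x;
}

// Naive two-lane split: each lane runs the same recurrence on every other
// element, then the lanes are combined with one subtraction.
int reduce_sub_rhs_two_lanes_naive(const std::vector<int> &a, int x) {
  int lane[2] = {x, 0};
  for (std::size_t i = 0; i < a.size(); ++i)
    lane[i % 2] = a[i] - lane[i % 2];
  return lane[0] - lane[1];
}

// Hypothetical self-check: the naive split disagrees with the scalar loop.
void check_rhs_reduction() {
  std::vector<int> a = {1, 2, 3, 4};
  assert(reduce_sub_rhs(a, 0) == 2);
  assert(reduce_sub_rhs_two_lanes_naive(a, 0) == 0);
}
```

For the input `{1, 2, 3, 4}` the scalar loop yields `4 - 3 + 2 - 1 = 2`, while the split version yields `0`.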
* Memory Dependence Analysis: fix a miscompile that uses DT to approximate theManman Ren2013-01-041-0/+54
| | | | | | | | | | | | reachability. We conservatively approximate the reachability analysis by saying it is not reachable if there is a single path starting from "From" and the path does not reach "To". rdar://12801584 llvm-svn: 171512
* LoopVectorizer:Nadav Rotem2013-01-042-2/+58
| | | | | | | | 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469
* LoopVectorizer: Test the unrolling flag.Nadav Rotem2013-01-031-0/+39
| | | | llvm-svn: 171446
* Avoid vectorization when the function has the "noimplicitfloat" attribute.Nadav Rotem2013-01-021-0/+29
| | | | llvm-svn: 171429
* Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵Dmitri Gribenko2013-01-012-3/+8
| | | | | | | | | | ModuleID This is done to avoid odd test failures, like the one fixed in r171243. While there, FileCheck'ize tests. llvm-svn: 171344
* Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵Dmitri Gribenko2013-01-0129-30/+30
| | | | | | | | | | ModuleID This is done to avoid odd test failures, like the one fixed in r171243. My previous regex was not good enough to find these. llvm-svn: 171343
* Make opt grab the triple from the module and use it to initialize the target ↵Nadav Rotem2013-01-011-1/+1
| | | | | | machine. llvm-svn: 171341
* recommit r171298 (add support for PHI nodes to ObjectSizeOffsetVisitor). ↵Nuno Lopes2012-12-311-0/+128
| | | | | | Hopefully with bugs corrected now. llvm-svn: 171325
* Revert "add support for PHI nodes to ObjectSizeOffsetVisitor"Benjamin Kramer2012-12-311-54/+0
| | | | | | This reverts r171298. Breaks clang selfhost. llvm-svn: 171318
* Add extra CHECK to make sure that 'or' instruction was replaced.Jakub Staszak2012-12-311-0/+1
| | | | | | Also add an assert to avoid confusion in the code where it is known that C1 <= C2. llvm-svn: 171310
* add support for PHI nodes to ObjectSizeOffsetVisitorNuno Lopes2012-12-311-0/+54
| | | | llvm-svn: 171298
* Fix LICM's memory promotion optimization to preserve TBAA tags whenChris Lattner2012-12-311-2/+40
| | | | | | | promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281
* teach instcombine to preserve TBAA tag when merging two stores, part ofChris Lattner2012-12-311-0/+34
| | | | | | PR14753 llvm-svn: 171279
* Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1Jakub Staszak2012-12-311-0/+11
| | | | | | | if C1 and C2 differ in only one bit. Fixes PR14708. llvm-svn: 171270
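The transform above can be checked exhaustively for 8-bit values. In this sketch I assume, beyond the one-bit condition, that C1 is the constant with the differing bit clear; that ordering is my reading of the later "C1 <= C2" assert and is not spelled out in this commit message. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two equality tests joined by a logical or.
bool either_equal(std::uint8_t a, std::uint8_t c1, std::uint8_t c2) {
  return a == c1 || a == c2;
}

// Rewritten form: mask out the one differing bit, then compare once.
bool masked_equal(std::uint8_t a, std::uint8_t c1, std::uint8_t c2) {
  std::uint8_t diff = static_cast<std::uint8_t>(c1 ^ c2);
  // Assumption: exactly one bit differs between the constants.
  assert(diff != 0 && (diff & (diff - 1)) == 0);
  // Assumption: c1 is the constant that has the differing bit clear.
  assert((c1 & diff) == 0);
  return (a & static_cast<std::uint8_t>(~diff)) == c1;
}

// Compares both forms for every 8-bit value of a.
bool forms_match_for_all_inputs(std::uint8_t c1, std::uint8_t c2) {
  for (int a = 0; a < 256; ++a) {
    std::uint8_t v = static_cast<std::uint8_t>(a);
    if (either_equal(v, c1, c2) != masked_equal(v, c1, c2))
      return false;
  }
  return true;
}
```

For example, with `C1 = 4` and `C2 = 6` the mask is `~2`, so both 4 and 6 reduce to 4 and every other value fails the single comparison.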
* LoopVectorizer: Fix a bug in the code that updates the loop exiting block.Nadav Rotem2012-12-301-0/+29
| | | | | | | | | LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251
* Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵Dmitri Gribenko2012-12-3021-23/+23
| | | | | | | | ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171250
* Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵Dmitri Gribenko2012-12-3047-48/+48
| | | | | | | | ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171246
* llvm/test/Transforms/GVN/null-aliases-nothing.ll: Fix a RUN line not to emit ↵NAKAMURA Takumi2012-12-301-1/+1
| | | | | | | | ModuleID. Larry Evans reported it fails if source tree contains "load", like "download". llvm-svn: 171243
* Fix a stunning oversight in the inline cost analysis. It was neverChandler Carruth2012-12-281-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | propagating one of the values it simplified to a constant across a myriad of instructions. Notably, ptrtoint instructions when we had a constant pointer (say, 0) didn't propagate that, blocking a massive number of down-stream optimizations. This was uncovered when investigating why we fail to inline and delete the boilerplate in: void f() { std::vector<int> v; v.push_back(1); } It turns out most of the efforts I've made thus far to improve the analysis weren't making it far purely because of this. After this is fixed, the store-to-load forwarding patch enables LLVM to optimize the above to an empty function. We still can't nuke a second push_back, but for different reasons. There is a very real chance this will cause somewhat noticeable changes in inlining behavior, so please let me know if you see regressions (or improvements!) because of this patch. llvm-svn: 171196
* Teach the inline cost analysis about calls that can be simplified andChandler Carruth2012-12-281-0/+38
| | | | | | | | | | | | | | | | | | | | | how to propagate constants through insert and extract value instructions. With the recent improvements to instsimplify, this allows inline cost analysis to constant fold through intrinsic functions, including notably the with.overflow intrinsic math routines which often show up inside of STL abstractions. This is yet another piece in the puzzle of breaking down the code for: void f() { std::vector<int> v; v.push_back(1); } But it still isn't enough. There are a pile of bugs in inline cost still blocking this. llvm-svn: 171195
* Teach instsimplify to use the constant folder where appropriate forChandler Carruth2012-12-281-0/+52
| | | | | | | | constant folding calls. Add the initial tests for this which show that now instsimplify can simplify blindingly obvious code patterns expressed with both intrinsics and library calls. llvm-svn: 171194
* If all of the write objects are identified then we can vectorize the loop ↵Nadav Rotem2012-12-261-0/+53
| | | | | | | | even if the read objects are unidentified. PR14719. llvm-svn: 171124
* LoopVectorizer: Optimize the vectorization of consecutive memory access when ↵Nadav Rotem2012-12-261-1/+2
| | | | | | the iteration step is -1 llvm-svn: 171114
* BBVectorize: Use VTTI to compute costs for intrinsics vectorizationHal Finkel2012-12-261-0/+79
| | | | | | | | | | | | For the time being this includes only some dummy test cases. Once the generic implementation of the intrinsics cost function does something other than assuming scalarization in all cases, or some target specializes the interface, some real test cases can be added. Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID in a few other places. llvm-svn: 171079
* LoopVectorize: Enable vectorization of the fmuladd intrinsicHal Finkel2012-12-251-0/+60
| | | | llvm-svn: 171076
* BBVectorize: Enable vectorization of the fmuladd intrinsicHal Finkel2012-12-251-0/+28
| | | | llvm-svn: 171075
* Fix typo "Makre" -> "Make".Nick Lewycky2012-12-241-6/+4
| | | | llvm-svn: 171043
* LoopVectorizer: When checking for vectorizable types, also checkNadav Rotem2012-12-241-0/+29
| | | | | | | | the StoreInst operands. PR14705. llvm-svn: 171023