path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* instcombine: Migrate strto* optimizations | Meador Inge | 2012-10-31 | 2 | -31/+29
  This patch migrates the strto* optimizations from the simplify-libcalls pass into the instcombine library call simplifier.
  llvm-svn: 167119
* Fix false -> NULL conversion from r167115 spotted by Benjamin Kramer. | Hans Wennborg | 2012-10-31 | 1 | -1/+1
  llvm-svn: 167117
* Replace some instances of UniqueVector with SetVector, which is slightly cheaper. | Benjamin Kramer | 2012-10-31 | 1 | -3/+3
  No functionality change.
  llvm-svn: 167116
* Do simple constant propagation in lookup table formation for switches | Hans Wennborg | 2012-10-31 | 1 | -15/+98
  By propagating the value for the switch condition, LLVM can now build lookup tables for code such as:

    switch (x) {
      case 1: return 5;
      case 2: return 42;
      case 3: case 4: case 5:
        return x - 123;
      default: return 123;
    }

  Given that x is known for each case, "x - 123" becomes a constant for cases 3, 4, and 5.
  llvm-svn: 167115
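To make the effect concrete, here is a minimal C sketch rather than LLVM's actual output. `with_switch` wraps the switch from the commit message; `with_table` is a hand-written guess at what an equivalent table-based form could look like once "x - 123" has been folded per case. All function and variable names here are invented for illustration.

```c
#include <assert.h>

/* The switch from the commit message, wrapped in a function. */
int with_switch(int x) {
    switch (x) {
    case 1: return 5;
    case 2: return 42;
    case 3: case 4: case 5:
        return x - 123;
    default: return 123;
    }
}

/* Hypothetical hand-written equivalent: because x is known in each case,
   "x - 123" folds to -120, -119 and -118 for cases 3, 4 and 5, so every
   case result is a constant that can live in a table. */
int with_table(int x) {
    static const int table[5] = {5, 42, -120, -119, -118};
    if (x >= 1 && x <= 5)
        return table[x - 1];
    return 123; /* default case */
}

/* Compare both forms over a range that covers every case and the default. */
void check_switch_table(void) {
    for (int x = -3; x <= 9; ++x)
        assert(with_switch(x) == with_table(x));
}
```

Calling `check_switch_table()` exercises all five cases plus values that fall through to the default.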
* LCSSA: Add a workaround for another nasty SCEV cache invalidation issue. | Benjamin Kramer | 2012-10-31 | 1 | -0/+5
  I'm not entirely happy with this solution, but I don't see a smarter way currently. Fixes PR14214.
  llvm-svn: 167112
* instcombine: Migrate strpbrk optimizations | Meador Inge | 2012-10-31 | 2 | -40/+37
  This patch migrates the strpbrk optimizations from the simplify-libcalls pass into the instcombine library call simplifier.
  llvm-svn: 167105
* instcombine: Migrate strlen optimizations | Meador Inge | 2012-10-31 | 2 | -44/+45
  This patch migrates the strlen optimizations from the simplify-libcalls pass into the instcombine library call simplifier.
  llvm-svn: 167103
* instcombine: Migrate strncpy optimizations | Meador Inge | 2012-10-31 | 2 | -52/+49
  This patch migrates the strncpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier.
  llvm-svn: 167102
* LoopVectorize: Do not vectorize loops with tiny constant trip counts. | Nadav Rotem | 2012-10-31 | 1 | -0/+8
  llvm-svn: 167101
* Add support for loops that don't start at zero. | Nadav Rotem | 2012-10-31 | 1 | -12/+19
  This is important for loops in the LAPACK test-suite. These loops start at 1 because they are auto-converted from Fortran.
  llvm-svn: 167084
* instcombine: Migrate stpcpy optimizations | Meador Inge | 2012-10-31 | 2 | -53/+40
  This patch migrates the stpcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. Note that the __stpcpy_chk simplifications were migrated in a previous commit.
  llvm-svn: 167083
* instcombine: Split out the __stpcpy_chk simplifications from StrCpyChkOpt | Meador Inge | 2012-10-31 | 1 | -3/+54
  r166198 migrated the strcpy optimization to instcombine. The strcpy simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp was also doing some __strcpy_chk simplifications. Those fortified simplifications were migrated as well, but introduced a bug in the __stpcpy_chk simplifier in the process. This happened because the __strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt which was updated with simplifications that worked for __strcpy_chk, but not __stpcpy_chk.
  This patch fixes the problem by adding proper test coverage and creating a new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk).
  llvm-svn: 167082
* Add documentation. | Nadav Rotem | 2012-10-30 | 1 | -0/+5
  llvm-svn: 167055
* Fix PR14212: For some strange reason I treated vectors differently from | Chandler Carruth | 2012-10-30 | 1 | -4/+3
  integers in that the code to handle split alloca-wide integer loads or stores doesn't come first. It should, for the same reasons as with integers, and the PR attests to that. Also had to fix a busted assert that this test case also covers.
  llvm-svn: 167051
* BBVectorize: Cache fixed-order pairs instead of recomputing pointer info. | Hal Finkel | 2012-10-30 | 1 | -51/+34
  Instead of recomputing relative pointer information just prior to fusing, cache this information (which also needs to be computed during the candidate-pair selection process). This cuts down on the total number of SE queries made, and also is a necessary intermediate step on the road toward including shuffle costs in the pair selection procedure.
  No functionality change is intended.
  llvm-svn: 167049
* LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. | Benjamin Kramer | 2012-10-30 | 1 | -4/+5
  Thanks to Preston Briggs for catching this!
  llvm-svn: 167045
* BBVectorize: Fix a small bug introduced in r167042. | Hal Finkel | 2012-10-30 | 1 | -1/+0
  We need to make sure that we take the correct load/store alignment when the inputs are flipped.
  llvm-svn: 167044
* BBVectorize: Simplify how input swapping is handled. | Hal Finkel | 2012-10-30 | 1 | -43/+25
  Stop propagating the FlipMemInputs variable into the routines that create the replacement instructions. Instead, just flip the arguments of those routines. This allows for some associated cleanup (not all of which is done here).
  No functionality change is intended.
  llvm-svn: 167042
* BBVectorize: Don't make calls to SE when the result is unused. | Hal Finkel | 2012-10-30 | 1 | -2/+5
  SE was being called during the instruction-fusion process (when the result is unreliable, and thus ignored). No functionality change is intended.
  llvm-svn: 167037
* 80-col | Nadav Rotem | 2012-10-30 | 1 | -1/+2
  llvm-svn: 167036
* LoopVectorize: Add support for write-only loops when the write destination is a single pointer. | Nadav Rotem | 2012-10-30 | 1 | -0/+7
  Speeds up SciMark by 1%.
  llvm-svn: 167035
* LoopVectorize: Fix a bug in the initialization of reduction variables. | Nadav Rotem | 2012-10-30 | 1 | -7/+21
  AND needs to start at all-ones, while XOR and OR need to start at zero.
  llvm-svn: 167032
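The reason for those starting values is that each is the identity of its operator: x & ~0 == x, x | 0 == x, and x ^ 0 == x. The following C sketch illustrates the idea with a simulated 4-lane reduction; it is not the LoopVectorize implementation, and every name in it (`red_kind`, `reduce_lanes`, `LANES`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LANES = 4 }; /* assumed vector width, for illustration only */
typedef enum { RED_AND, RED_OR, RED_XOR } red_kind;

/* Identity element: all-ones for AND, zero for OR and XOR. */
uint32_t red_identity(red_kind k) {
    return k == RED_AND ? UINT32_MAX : 0u;
}

uint32_t red_apply(red_kind k, uint32_t a, uint32_t b) {
    switch (k) {
    case RED_AND: return a & b;
    case RED_OR:  return a | b;
    default:      return a ^ b;
    }
}

/* Plain scalar reduction, used as the reference result. */
uint32_t reduce_scalar(red_kind k, uint32_t start, const uint32_t *v, size_t n) {
    uint32_t r = start;
    for (size_t i = 0; i < n; ++i)
        r = red_apply(k, r, v[i]);
    return r;
}

/* Simulated vector reduction: each lane accumulator starts at the identity,
   so lanes contribute nothing beyond the elements they actually see. */
uint32_t reduce_lanes(red_kind k, uint32_t start, const uint32_t *v, size_t n) {
    uint32_t acc[LANES];
    for (int l = 0; l < LANES; ++l)
        acc[l] = red_identity(k);
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; ++l)
            acc[l] = red_apply(k, acc[l], v[i + l]);
    uint32_t r = start;
    for (int l = 0; l < LANES; ++l)
        r = red_apply(k, r, acc[l]);  /* combine the lanes */
    for (; i < n; ++i)
        r = red_apply(k, r, v[i]);    /* scalar remainder */
    return r;
}

void check_reductions(void) {
    const uint32_t v[6] = {0xF0F0u, 0xFF00u, 0x0FF0u, 0xFFFFu, 0x1234u, 0xFFF0u};
    for (int k = RED_AND; k <= RED_XOR; ++k)
        assert(reduce_lanes((red_kind)k, 0xABCDu, v, 6) ==
               reduce_scalar((red_kind)k, 0xABCDu, v, 6));
}
```

If the AND lanes were initialized to zero instead, every AND reduction would collapse to zero, which is the kind of bug the commit describes.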
* Fix isEliminableCastPair to work correctly in the presence of pointers | Duncan Sands | 2012-10-30 | 1 | -6/+10
  with different sizes.
  llvm-svn: 167018
* Enable some additional constant folding for PPCDoubleDouble. | Ulrich Weigand | 2012-10-30 | 1 | -4/+2
  This fixes Clang :: CodeGen/complex-builtints.c on PowerPC.
  llvm-svn: 167013
* Use TargetTransformInfo to control switch-to-lookup table transformation | Hans Wennborg | 2012-10-30 | 3 | -122/+25
  When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets.
  This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch.
  llvm-svn: 167011
* LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. | Nadav Rotem | 2012-10-30 | 1 | -4/+6
  When deciding not to vectorize, do not print the called function name because it can be null.
  llvm-svn: 166989
* LoopVectorize: Update and preserve the dominator tree info. | Nadav Rotem | 2012-10-29 | 1 | -9/+37
  llvm-svn: 166970
* In various places throughout the code generator, there were special | Ulrich Weigand | 2012-10-29 | 1 | -2/+0
  checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other.
  llvm-svn: 166958
* Rename the BB-vectorize flag to match the dragonegg name | Nadav Rotem | 2012-10-29 | 1 | -2/+2
  llvm-svn: 166948
* Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (the | Duncan Sands | 2012-10-29 | 3 | -16/+6
  wrapper returns a vector of integers when passed a vector of pointers) by having getIntPtrType itself return a vector of integers in this case. Outside of this wrapper, I didn't find anywhere in the codebase that was relying on the old behaviour for vectors of pointers, so give this a whirl through the buildbots.
  llvm-svn: 166939
* Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we don't want to share the same flag as the bb-vectorizer. | Nadav Rotem | 2012-10-29 | 1 | -4/+8
  llvm-svn: 166937
* llvm-extract changes linkages so that functions on both sides of the | Rafael Espindola | 2012-10-29 | 1 | -12/+25
  split module can see each other. If it is keeping a symbol that already has a non-local linkage, it doesn't need to change it.
  llvm-svn: 166908
* llvm-extract was unable to handle aliases. It would leave a copy on the | Rafael Espindola | 2012-10-29 | 1 | -0/+30
  output of both

    llvm-extract foo.ll -func=bar

  and

    llvm-extract foo.ll -func=bar -delete

  so the two new files could not be linked together anymore.
  With this change, aliases are handled almost like functions and global variables. Almost, because with aliases we cannot just clear the initializer/body; we have to create a new declaration and replace the alias with it.
  The net result is that now the output of the above commands can be linked even if foo.ll has aliases.
  llvm-svn: 166907
* LoopIdiom: Add checks to avoid turning memmove into an infinite loop. | Benjamin Kramer | 2012-10-27 | 1 | -2/+2
  I don't think this is possible with the current implementation but that may change eventually.
  llvm-svn: 166877
* LoopIdiom: Recognize memmove loops. | Benjamin Kramer | 2012-10-27 | 1 | -10/+24
  This turns loops like

    for (unsigned i = 0; i != n; ++i)
      p[i] = p[i+1];

  into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :)
  llvm-svn: 166875
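A small C sketch of the equivalence follows. It assumes byte-sized elements (the commit message does not state the element type) and a buffer with at least n + 1 elements; the function names are made up for this illustration. Source and destination overlap here, which is why memmove rather than memcpy is the matching library call.

```c
#include <assert.h>
#include <string.h>

/* The loop from the commit message: shift n elements down by one.
   The buffer must hold at least n + 1 elements. */
void shift_loop(unsigned char *p, unsigned n) {
    for (unsigned i = 0; i != n; ++i)
        p[i] = p[i + 1];
}

/* Hand-written equivalent using memmove; the ranges [p, p+n) and
   [p+1, p+1+n) overlap, so memcpy would not be valid here. */
void shift_memmove(unsigned char *p, unsigned n) {
    memmove(p, p + 1, n);
}

void check_shift(void) {
    unsigned char a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    unsigned char b[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    shift_loop(a, 7);
    shift_memmove(b, 7);
    assert(memcmp(a, b, sizeof a) == 0);
    assert(a[0] == 2 && a[6] == 8 && a[7] == 8);
}
```

For element types wider than one byte, the memmove length would be n multiplied by the element size.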
* LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. | Benjamin Kramer | 2012-10-27 | 1 | -80/+45
  Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481.
  Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there.
  llvm-svn: 166874
* Update BBVectorize to use the new VTTI instr. cost interfaces. | Hal Finkel | 2012-10-27 | 1 | -3/+58
  The monolithic interface for instruction costs has been split into several functions. This is the corresponding change. No functionality change is intended.
  llvm-svn: 166865
* 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result. | Nadav Rotem | 2012-10-27 | 1 | -1/+1
  2. Change the maximum vectorization width from 4 to 8.
  3. A test for both.
  llvm-svn: 166864
* Refactor the VectorTargetTransformInfo interface. | Nadav Rotem | 2012-10-26 | 1 | -9/+52
  Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.
  Port the LoopVectorizer to the new API.
  The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.
  llvm-svn: 166836
* Change the internalize pass to internalize all symbols when given an empty | Rafael Espindola | 2012-10-26 | 3 | -32/+18
  list of externals. This makes sense since a shared library with no symbols can still be useful if it has static constructors.
  llvm-svn: 166795
* LoopSimplify: Preserve DependenceAnalysis. | Benjamin Kramer | 2012-10-26 | 1 | -0/+2
  This is currently true, but may change when DA grows more aggressive caching. Without this setting it's impossible to use DA from a LoopPass because DA is a function pass and cannot be properly scheduled in between LoopPasses. The LoopManager reacts to this with an infinite loop which made this really annoying to debug.
  llvm-svn: 166788
* Fix SCEV cache invalidation in LCSSA and LoopSimplify. | Benjamin Kramer | 2012-10-26 | 2 | -0/+19
  The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable to analyzable but the LCSSA bug is very nasty. It only comes into play with a specific order of the LoopPassManager worklist and can cause actual miscompilations, when a SCEV refers to a value that has been replaced with a PHI node. SCEVExpander may then insert code into the wrong place, either violating domination or randomly miscompiling stuff.
  Comes with an extensive test case reduced from the test-suite with bugpoint+SCEVValidator.
  llvm-svn: 166787
* Use VTTI->getNumberOfParts in BBVectorize. | Hal Finkel | 2012-10-26 | 1 | -11/+12
  This change reflects VTTI refactoring; no functionality change intended.
  llvm-svn: 166752
* Disable generation of pointer vectors by BBVectorize. | Hal Finkel | 2012-10-26 | 1 | -1/+2
  Once vector-of-pointer support works, then this can be reverted.
  llvm-svn: 166741
* BBVectorize, when using VTTI, should not form types that will be split. | Hal Finkel | 2012-10-25 | 1 | -0/+19
  This is needed so that perl's SHA can be compiled (otherwise BBVectorize takes far too long to find its fixed point).
  I'll try to come up with a reduced test case.
  llvm-svn: 166738
* Begin incorporating target information into BBVectorize. | Hal Finkel | 2012-10-25 | 1 | -43/+134
  This is the first of several steps to incorporate information from the new TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

  1. Target information is used to determine if it is profitable to fuse two instructions. This means that the cost of the vector operation must not be more expensive than the cost of the two original operations. Pairs that are not profitable are no longer considered (because current cost information is incomplete, for intrinsics for example, equal-cost pairs are still considered).
  2. The 'cost savings' computed for the profitability check are also used to rank the DAGs that represent the potential vectorization plans. Specifically, for nodes of non-trivial depth, the cost savings is used as the node weight.

  The next step will be to incorporate the shuffle costs into the DAG weighting; this will give the edges of the DAG weights as well. Once that is done, when target information is available, we should be able to dispense with the depth heuristic.
  llvm-svn: 166716
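The profitability rule in point 1 can be sketched in a few lines of C. This is only an illustration of the rule as worded in the commit message, with integer costs as an assumption; the helper names `pair_cost_savings` and `pair_is_profitable` are hypothetical and do not come from BBVectorize.

```c
#include <assert.h>

/* Savings from fusing two scalar instructions into one vector instruction.
   The cost values are assumed to come from a target cost model. */
int pair_cost_savings(int cost_i, int cost_j, int cost_vec) {
    return cost_i + cost_j - cost_vec;
}

/* A pair is kept when the vector operation is not more expensive than the
   two originals. Equal-cost pairs (savings of zero) are still considered,
   as the commit message notes. */
int pair_is_profitable(int cost_i, int cost_j, int cost_vec) {
    return pair_cost_savings(cost_i, cost_j, cost_vec) >= 0;
}
```

Under this sketch, the same savings value would also serve as the node weight described in point 2.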
* LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1. | Nadav Rotem | 2012-10-25 | 1 | -41/+61
  llvm-svn: 166715
* Also optimize large switch statements. | Jakob Stoklund Olesen | 2012-10-25 | 1 | -22/+20
  The isValueEqualityComparison() guard at the top of SimplifySwitch() only applies to some of the possible transformations.
  The newer transformations work just fine on large switches, and the check on predecessor count is nonsensical.
  llvm-svn: 166710
* Teach SROA how to split whole-alloca integer loads and stores into | Chandler Carruth | 2012-10-25 | 1 | -4/+117
  smaller integer loads and stores.

  The high-level motivation is that the frontend sometimes generates a single whole-alloca integer load or store during ABI lowering of splittable allocas. We need to be able to break this apart in order to see the underlying elements and properly promote them to SSA values. The hope is that this fixes some performance regressions on x86-32 with the new SROA pass.

  Unfortunately, this causes quite a bit of churn in the test cases, and bloats some IR that comes out. When we see an alloca that consists solely of bits and bytes being extracted and re-inserted, we now do some splitting first, before building widened integer "bucket of bits" representations. These are always well folded by instcombine however, so this shouldn't actually result in missed opportunities.

  If this splitting of all-integer allocas does cause problems (perhaps due to smaller SSA values going into the RA), we could potentially go to some extreme measures to only do this integer splitting trick when there are non-integer component accesses of an alloca, but discovering this is quite expensive: it adds yet another complete walk of the recursive use tree of the alloca.

  Either way, I will be watching build bots and LNT bots to see what fallout there is here. If anyone gets x86-32 numbers before & after this change, I would be very interested.
  llvm-svn: 166662
* Add support for additional reduction variables: AND, OR, XOR. | Nadav Rotem | 2012-10-25 | 1 | -7/+42
  Patch by Paul Redmond <paul.redmond@intel.com>.
  llvm-svn: 166649