path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* In various places throughout the code generator, there were specialUlrich Weigand2012-10-291-2/+0
| | | | | | | | | checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. llvm-svn: 166958
* Rename the BB-vectorize flag to match the dragonegg nameNadav Rotem2012-10-291-2/+2
| | | | llvm-svn: 166948
* Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (theDuncan Sands2012-10-293-16/+6
| | | | | | | | | wrapper returns a vector of integers when passed a vector of pointers) by having getIntPtrType itself return a vector of integers in this case. Outside of this wrapper, I didn't find anywhere in the codebase that was relying on the old behaviour for vectors of pointers, so give this a whirl through the buildbots. llvm-svn: 166939
* Change the PassManagerBuilder (used by -O3) loop vectorizer flag from ↵Nadav Rotem2012-10-291-4/+8
| | | | | | -vectorize to -vectorize-loops because we don't want to share the same flag as the bb-vectorizer. llvm-svn: 166937
* llvm-extract changes linkages so that functions on both sides of theRafael Espindola2012-10-291-12/+25
| | | | | | | split module can see each other. If it is keeping a symbol that already has a non local linkage, it doesn't need to change it. llvm-svn: 166908
* llvm-extract was unable to handle aliases. It would leave a copy on theRafael Espindola2012-10-291-0/+30
| | | | | | | | | | | | | | | | | | output of both llvm-extract foo.ll -func=bar and llvm-extract foo.ll -func=bar -delete so the two new files could not be linked together anymore. With this change aliases are handled almost like functions and global variables. Almost, because with an alias we cannot just clear the initializer/body; we have to create a new declaration and replace the alias with it. The net result is that now the output of the above commands can be linked even if foo.ll has aliases. llvm-svn: 166907
* LoopIdiom: Add checks to avoid turning memmove into an infinite loop.Benjamin Kramer2012-10-271-2/+2
| | | | | | I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877
* LoopIdiom: Recognize memmove loops.Benjamin Kramer2012-10-271-10/+24
| This turns loops like
|   for (unsigned i = 0; i != n; ++i)
|     p[i] = p[i+1];
| into memmove, which has a highly optimized implementation in most libcs.
| This was really easy with the new DependenceAnalysis :)
| llvm-svn: 166875
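To make the equivalence in the message above concrete, here is a minimal standalone sketch in plain C++ (not code from the LLVM tree). The function names are made up for illustration. It shows the loop form next to the library call that the pass is described as producing; because source and destination overlap, the call has to be memmove rather than memcpy.

```cpp
#include <cstddef>
#include <cstring>

// Loop form from the commit message: shift every element one slot to the left.
// The caller must guarantee that p has at least n + 1 elements.
void shift_left_loop(int *p, unsigned n) {
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i + 1];
}

// Equivalent library call. The source range [p + 1, p + 1 + n) overlaps the
// destination range [p, p + n), so memmove is required; memcpy would be
// undefined behaviour here.
void shift_left_memmove(int *p, unsigned n) {
  std::memmove(p, p + 1, static_cast<std::size_t>(n) * sizeof(int));
}
```

Both functions leave the last element untouched, so a buffer holding 1 2 3 4 5 with n = 4 ends up as 2 3 4 5 5 either way.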
* LoopIdiom: Replace custom dependence analysis with DependenceAnalysis.Benjamin Kramer2012-10-271-80/+45
| | | | | | | | | | | Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there. llvm-svn: 166874
* Update BBVectorize to use the new VTTI instr. cost interfaces.Hal Finkel2012-10-271-3/+58
| | | | | | | | The monolithic interface for instruction costs has been split into several functions. This is the corresponding change. No functionality change is intended. llvm-svn: 166865
* 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to ↵Nadav Rotem2012-10-271-1/+1
| | | | | | | | | return the type of the split result. 2. Change the maximum vectorization width from 4 to 8. 3. A test for both. llvm-svn: 166864
* Refactor the VectorTargetTransformInfo interface.Nadav Rotem2012-10-261-9/+52
| | | | | | | | | | Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc. Port the LoopVectorizer to the new API. The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions. llvm-svn: 166836
* Change the internalize pass to internalize all symbols when given an emptyRafael Espindola2012-10-263-32/+18
| | | | | | | list of externals. This makes sense since a shared library with no symbols can still be useful if it has static constructors. llvm-svn: 166795
* LoopSimplify: Preserve DependenceAnalysis.Benjamin Kramer2012-10-261-0/+2
| | | | | | | | | | This is currently true, but may change when DA grows more aggressive caching. Without this setting it's impossible to use DA from a LoopPass because DA is a function pass and cannot be properly scheduled in between LoopPasses. The LoopManager reacts to this with an infinite loop which made this really annoying to debug. llvm-svn: 166788
* Fix SCEV cache invalidation in LCSSA and LoopSimplify.Benjamin Kramer2012-10-262-0/+19
| | | | | | | | | | | | The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable to analyzable but the LCSSA bug is very nasty. It only comes into play with a specific order of the LoopPassManager worklist and can cause actual miscompilations, when a SCEV refers to a value that has been replaced with a PHI node. SCEVExpander may then insert code into the wrong place, either violating domination or randomly miscompiling stuff. Comes with an extensive test case reduced from the test-suite with bugpoint+SCEVValidator. llvm-svn: 166787
* Use VTTI->getNumberOfParts in BBVectorize.Hal Finkel2012-10-261-11/+12
| | | | | | This change reflects VTTI refactoring; no functionality change intended. llvm-svn: 166752
* Disable generation of pointer vectors by BBVectorize.Hal Finkel2012-10-261-1/+2
| | | | | | Once vector-of-pointer support works, then this can be reverted. llvm-svn: 166741
* BBVectorize, when using VTTI, should not form types that will be split.Hal Finkel2012-10-251-0/+19
| | | | | | | | | This is needed so that perl's SHA can be compiled (otherwise BBVectorize takes far too long to find its fixed point). I'll try to come up with a reduced test case. llvm-svn: 166738
* Begin incorporating target information into BBVectorize.Hal Finkel2012-10-251-43/+134
| | | | | | | | | | | | | | | | | | | | | | | | This is the first of several steps to incorporate information from the new TargetTransformInfo infrastructure into BBVectorize. Two things are done here: 1. Target information is used to determine if it is profitable to fuse two instructions. This means that the cost of the vector operation must not be more expensive than the cost of the two original operations. Pairs that are not profitable are no longer considered (because current cost information is incomplete, for intrinsics for example, equal-cost pairs are still considered). 2. The 'cost savings' computed for the profitability check are also used to rank the DAGs that represent the potential vectorization plans. Specifically, for nodes of non-trivial depth, the cost savings is used as the node weight. The next step will be to incorporate the shuffle costs into the DAG weighting; this will give the edges of the DAG weights as well. Once that is done, when target information is available, we should be able to dispense with the depth heuristic. llvm-svn: 166716
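The profitability rule in point 1 and the node weight in point 2 can be sketched as follows. This is an illustration of the rule as worded in the message, not the BBVectorize implementation; the function names and the use of plain int costs are assumptions.

```cpp
// Hypothetical helper: a pair is worth fusing when the vector operation costs
// no more than the two scalar operations combined. Equal-cost pairs are kept,
// as the message says, because cost information is still incomplete.
bool isProfitableToFuse(int scalarCostA, int scalarCostB, int vectorCost) {
  return vectorCost <= scalarCostA + scalarCostB;
}

// Hypothetical helper: the cost savings that the message describes as the
// weight of a DAG node of non-trivial depth.
int costSavings(int scalarCostA, int scalarCostB, int vectorCost) {
  return scalarCostA + scalarCostB - vectorCost;
}
```

With scalar costs of 2 and 2, a vector cost of 3 saves 1, a vector cost of 4 saves nothing but is still considered, and a vector cost of 5 is rejected.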
* LoopVectorize: Teach the cost model to query scalar costs as scalar types ↵Nadav Rotem2012-10-251-41/+61
| | | | | | and not vectors of 1. llvm-svn: 166715
* Also optimize large switch statements.Jakob Stoklund Olesen2012-10-251-22/+20
| | | | | | | | | | The isValueEqualityComparison() guard at the top of SimplifySwitch() only applies to some of the possible transformations. The newer transformations work just fine on large switches, and the check on predecessor count is nonsensical. llvm-svn: 166710
* Teach SROA how to split whole-alloca integer loads and stores intoChandler Carruth2012-10-251-4/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | smaller integer loads and stores. The high-level motivation is that the frontend sometimes generates a single whole-alloca integer load or store during ABI lowering of splittable allocas. We need to be able to break this apart in order to see the underlying elements and properly promote them to SSA values. The hope is that this fixes some performance regressions on x86-32 with the new SROA pass. Unfortunately, this causes quite a bit of churn in the test cases, and bloats some IR that comes out. When we see an alloca that consists solely of bits and bytes being extracted and re-inserted, we now do some splitting first, before building widened integer "bucket of bits" representations. These are always well folded by instcombine however, so this shouldn't actually result in missed opportunities. If this splitting of all-integer allocas does cause problems (perhaps due to smaller SSA values going into the RA), we could potentially go to some extreme measures to only do this integer splitting trick when there are non-integer component accesses of an alloca, but discovering this is quite expensive: it adds yet another complete walk of the recursive use tree of the alloca. Either way, I will be watching build bots and LNT bots to see what fallout there is here. If anyone gets x86-32 numbers before & after this change, I would be very interested. llvm-svn: 166662
* Add support for additional reduction variables: AND, OR, XOR.Nadav Rotem2012-10-251-7/+42
| | | | | | Patch by Paul Redmond <paul.redmond@intel.com>. llvm-svn: 166649
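For readers unfamiliar with the term, the reduction variables named above correspond to source loops of the following shape. This is a hand-written sketch with made-up function names, not a test case from the patch, and whether a particular compiler version actually vectorizes these loops is not something the sketch can guarantee.

```cpp
#include <cstddef>
#include <cstdint>

// Each loop folds the whole array into one accumulator with a single bitwise
// operator. The accumulator is the reduction variable.
std::uint32_t reduce_and(const std::uint32_t *a, std::size_t n) {
  std::uint32_t acc = ~std::uint32_t{0}; // all ones is the identity for AND
  for (std::size_t i = 0; i != n; ++i)
    acc &= a[i];
  return acc;
}

std::uint32_t reduce_or(const std::uint32_t *a, std::size_t n) {
  std::uint32_t acc = 0; // zero is the identity for OR
  for (std::size_t i = 0; i != n; ++i)
    acc |= a[i];
  return acc;
}

std::uint32_t reduce_xor(const std::uint32_t *a, std::size_t n) {
  std::uint32_t acc = 0; // zero is the identity for XOR
  for (std::size_t i = 0; i != n; ++i)
    acc ^= a[i];
  return acc;
}
```

All three operators are associative and commutative, which is what allows the iterations to be regrouped into vector lanes and combined at the end.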
* revert accidental changeNadav Rotem2012-10-241-1/+1
| | | | llvm-svn: 166643
* Implement a basic cost model for vector and scalar instructions. Nadav Rotem2012-10-242-18/+33
| | | | llvm-svn: 166642
* Fix a compiler warning with an unused variable.Micah Villmow2012-10-241-1/+0
| | | | llvm-svn: 166634
* Update GVN to support vectors of pointers.Hal Finkel2012-10-241-20/+30
| | | | | | | GVN will now generate ptrtoint instructions for vectors of pointers. Fixes PR14166. llvm-svn: 166624
* whitespaceNadav Rotem2012-10-241-3/+3
| | | | llvm-svn: 166622
* LoopVectorizer: Add a basic cost model which uses the VTTI interface.Nadav Rotem2012-10-241-30/+273
| | | | llvm-svn: 166620
* Add some cleanup to the DataLayout changes requested by Chandler.Micah Villmow2012-10-243-7/+6
| | | | llvm-svn: 166607
* Back out r166591, not sure why this made it through since I cancelled the ↵Micah Villmow2012-10-245-12/+10
| | | | | | command. Bleh, sorry about this! llvm-svn: 166596
* Delete a directory that wasn't supposed to be checked in yet.Micah Villmow2012-10-245-10/+12
| | | | llvm-svn: 166591
* Add in support for getIntPtrType to get the pointer type based on the ↵Micah Villmow2012-10-2418-279/+300
| | | | | | | | | address space. This checkin also adds in some tests that utilize these paths and updates some of the clients. llvm-svn: 166578
* Use the AliasAnalysis isIdentifiedObj because it also understands mallocs ↵Nadav Rotem2012-10-231-19/+2
| | | | | | | | and c++ news. PR14158. llvm-svn: 166491
* Fix typo that somehow escaped both testing and code inspection.Duncan Sands2012-10-231-1/+1
| | | | llvm-svn: 166475
* Transform code like thisDuncan Sands2012-10-232-52/+305
|   %V = mul i64 %N, 4
|   %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
| into
|   %t1 = getelementptr i32* %arr, i32 %N
|   %t = bitcast i32* %t1 to i8*
| incorporating the multiplication into the getelementptr.
| This happens all the time in dragonegg, for example for
|   int foo(int *A, int N) {
|     return A[N];
|   }
| because gcc turns this into byte pointer arithmetic before it hits the plugin:
|   D.1590_2 = (long unsigned int) N_1(D);
|   D.1591_3 = D.1590_2 * 4;
|   D.1592_5 = A_4(D) + D.1591_3;
|   D.1589_6 = *D.1592_5;
|   return D.1589_6;
| The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
| on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
| transform fires on.
| An analogous transform (with no testcases!) already existed for bitcasts of
| arrays, so I rewrote it to share code with this one.
| llvm-svn: 166474
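The two address computations that the transform treats as equivalent can be written out at the source level. The sketch below is plain C++ with made-up function names. It uses sizeof(int) where the message hard-codes 4, which assumes the same 4-byte int as the original example.

```cpp
// Source-level form: ordinary array indexing, as in foo() from the message.
int load_indexed(int *A, int N) {
  return A[N];
}

// Byte-arithmetic form, mirroring the GIMPLE in the message: widen the index,
// scale it by the element size, add it to a byte pointer, then load.
int load_byte_arith(int *A, int N) {
  unsigned long offset = static_cast<unsigned long>(N) * sizeof(int);
  char *bytes = reinterpret_cast<char *>(A) + offset;
  return *reinterpret_cast<int *>(bytes);
}
```

Both functions read the same memory location; the transform folds the explicit multiplication back into typed indexing so that later passes see the first form.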
* Per the C++ standard, we need to include the definition of llvm::Calculate inRichard Smith2012-10-231-0/+1
| | | | | | | every TU where it's implicitly instantiated, even if there's an implicit instantiation for the same types available in another TU. llvm-svn: 166470
* Fix typo.Julien Lerouge2012-10-231-1/+1
| | | | llvm-svn: 166456
* Explain why DenseMap is still used here instead of MapVector.Julien Lerouge2012-10-231-1/+9
| | | | llvm-svn: 166454
* Iterating over a DenseMap<std::pair<BasicBlock*, unsigned>, PHINode*> is notJulien Lerouge2012-10-221-4/+4
| | | | | | | | | deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>, PHINode*> (we already have a map from BasicBlock to unsigned). <rdar://problem/12541389> llvm-svn: 166435
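The idea behind this fix, keying a map by a stable number instead of by a pointer, can be shown with standard containers. The sketch below is an analogy only: it uses std::map in place of DenseMap, and the Block type and function name are hypothetical stand-ins. The point carried over is that an iteration order derived from block numbers cannot change with where the blocks happen to be allocated.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Block { std::string name; }; // hypothetical stand-in for BasicBlock

// Visit blocks through a map keyed by (block number, index) rather than by
// (Block*, index). The keys are assigned in input order, so iteration order
// depends only on those numbers and never on pointer values.
std::vector<std::string> names_by_number(const std::vector<Block *> &blocks) {
  std::map<std::pair<unsigned, unsigned>, std::string> byNumber;
  for (unsigned i = 0; i != blocks.size(); ++i)
    byNumber[{i, 0u}] = blocks[i]->name;

  std::vector<std::string> out;
  for (const auto &entry : byNumber)
    out.push_back(entry.second);
  return out;
}
```

Had the key been the Block pointer, the order could differ from run to run, because addresses are not stable across executions.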
* Don't crash if the load/store pointer is not a GEP.Nadav Rotem2012-10-221-1/+1
| | | | | | Fix by Shivarama Rao <Shivarama.Rao@amd.com> llvm-svn: 166427
* Revert r166407 because it caused analyzer tests to crash and broke self-host ↵Argyrios Kyrtzidis2012-10-221-67/+56
| | | | | | bots. llvm-svn: 166424
* BBVectorize should ignore unreachable blocks.Hal Finkel2012-10-221-0/+13
| | | | | | | | | Unreachable blocks can have invalid instructions. For example, jump threading can produce self-referential instructions in unreachable blocks. Also, we should not be spending time optimizing unreachable code. Fixes PR14133. llvm-svn: 166423
* Rename a variable.Nadav Rotem2012-10-221-13/+13
| | | | llvm-svn: 166410
* Vectorizer: optimize the generation of selects. If the condition is uniform, ↵Nadav Rotem2012-10-221-6/+16
| | | | | | generate a scalar-cond select (i1 as selector). llvm-svn: 166409
* Update the loop vectorizer docs.Nadav Rotem2012-10-221-17/+38
| | | | llvm-svn: 166408
* Reapply r166405, teaching tailcallelim to be smarter about nocapture, with aNick Lewycky2012-10-221-56/+67
| very small but very important bugfix:
|   bool shouldExplore(Use *U) {
|     Value *V = U->get();
|     if (isa<CallInst>(V) || isa<InvokeInst>(V))
|     [...]
| should have read:
|   bool shouldExplore(Use *U) {
|     Value *V = U->getUser();
|     if (isa<CallInst>(V) || isa<InvokeInst>(V))
| Fixes PR14143!
| llvm-svn: 166407
* Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when ↵NAKAMURA Takumi2012-10-221-67/+56
| | | | | | | | deciding whether" It broke selfhosting stage2 in several builders. llvm-svn: 166406
* Teach TailRecursionElimination to consider 'nocapture' when deciding whetherNick Lewycky2012-10-211-56/+67
| | | | | | calls can be marked tail. llvm-svn: 166405
* Revert r166390 "LoopIdiom: Replace custom dependence analysis with ↵Benjamin Kramer2012-10-211-26/+74
| | | | | | | | | | | LoopDependenceAnalysis." It passes all tests and produces better results than the old code, but uses the wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it still in tree?", you might ask. The answer is obviously: "To confuse developers." Just swapping in the new dependency pass sends the pass manager into an infinite loop; I'll try to figure out why tomorrow. llvm-svn: 166399