path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. | Craig Topper | 2013-07-14 | 9 | -52/+57
    llvm-svn: 186274
* LoopVectorizer: Disallow reductions whose header phi is used outside the loop | Arnold Schwaighofer | 2013-07-13 | 1 | -2/+6
    If an outside loop user of the reduction value uses the header phi node we cannot just reduce the vectorized phi value in the vector code epilog because we would lose VF-1 reductions.
        lp:
          p = phi (0, lv)
          lv = lv + 1
          ...
          brcond , lp, outside
        outside:
          usr = add 0, p
    (Say the loop iterates two times, the value of p coming out of the loop is one.)
    We cannot just transform this to:
        vlp:
          p = phi (<0,0>, lv)
          lv = lv + <1,1>
          ..
          brcond , lp, outside
        outside:
          p_reduced = p[0] + p[1];
          usr = add 0, p_reduced
    (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one.)
    We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization.
    PR16522
    llvm-svn: 186256
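The two-iteration example in this message can be replayed with a small simulation. This is only a sketch under stated assumptions: the recurrence is read as lv = p + 1, the vectorization factor is fixed at 2, and the function names are made up for illustration; none of this is LoopVectorizer code.

```cpp
#include <array>
#include <cassert>

// Scalar loop: p = phi(0, lv); lv = p + 1. Returns the value of the header
// phi p that a user outside the loop would observe.
int scalarHeaderPhi(int tripCount) {
  int p = 0, lv = 0;
  for (int i = 0; i < tripCount; ++i) {
    if (i > 0) p = lv;  // the phi takes lv on the back edge
    lv = p + 1;
  }
  return p;
}

// Naive vectorization with VF = 2: every lane carries its own copy of the
// recurrence and the outside user gets the sum of the lanes of p.
int naiveVectorHeaderPhi(int tripCount) {
  constexpr int VF = 2;
  assert(tripCount % VF == 0 && "sketch has no scalar remainder loop");
  std::array<int, VF> p{0, 0}, lv{0, 0};
  for (int i = 0; i < tripCount / VF; ++i) {
    if (i > 0) p = lv;  // vector phi takes lv on the back edge
    for (int lane = 0; lane < VF; ++lane) lv[lane] = p[lane] + 1;
  }
  return p[0] + p[1];  // horizontal reduction in the epilog
}
```

With a trip count of two the scalar version yields 1 and the naive vector version yields 0, matching the message; with a trip count of four the results are 3 and 2, so the reduced value is again short by VF-1.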
* LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander. | Andrew Trick | 2013-07-13 | 1 | -18/+18
    In general, one should always complete CFG modifications first, update CFG-based analyses, like Dominators and LoopInfo, then generate instruction sequences. LoopVectorizer was creating a new loop, calling SCEVExpander to generate checks, then updating LoopInfo. I just changed the order.
    llvm-svn: 186241
* Add a microoptimization for urem. | Nick Lewycky | 2013-07-13 | 1 | -0/+7
    llvm-svn: 186235
* Fix a crash in EvaluateInDifferentElementOrder where it would generate an | Joey Gouly | 2013-07-12 | 1 | -1/+3
    undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC.
    llvm-svn: 186224
* LFTR improvement to avoid truncation. | Andrew Trick | 2013-07-12 | 1 | -6/+32
    This is a reimplementation of the patch originally in r186107.
    llvm-svn: 186215
* Cleanup LFTR logic. | Andrew Trick | 2013-07-12 | 1 | -28/+9
    llvm-svn: 186214
* Cleanup: rename a variable to make the logic easier to follow. | Andrew Trick | 2013-07-12 | 1 | -7/+7
    llvm-svn: 186213
* TargetTransformInfo: address calculation parameter for gather/scatter | Arnold Schwaighofer | 2013-07-12 | 1 | -1/+56
    Address calculation for gather/scatter in vectorized code can incur a significant cost making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits.
    radar://14351991
    llvm-svn: 186187
* Revert "indvars: Improve LFTR by eliminating truncation when comparing | Chandler Carruth | 2013-07-12 | 1 | -23/+4
    against a constant."
    This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended:
        #include <stdio.h>
        int main() {
          unsigned char first = 0, last = 255;
          do {
            printf("%d\n", first);
          } while (first++ != last);
        }
    Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit.
    llvm-svn: 186152
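The intended behaviour of the loop in this message can be checked with a side-effect-free variant. The sketch below is an illustration only (the struct and function names are hypothetical); it counts the iterations instead of printing them and records the final counter value, which shows the 255 to 0 wrap that any rewrite of the exit test has to preserve.

```cpp
#include <cassert>
#include <cstdint>

// Result of running the do/while loop from the commit message.
struct LoopResult {
  std::uint64_t iterations;  // how many times the body executed
  unsigned char finalFirst;  // value of `first` after the loop exits
};

LoopResult runLoop(unsigned char first, unsigned char last) {
  std::uint64_t iterations = 0;
  do {
    ++iterations;
    // An 8-bit counter visits every value within 256 steps, so a correct
    // compilation of this loop can never run longer than that.
    assert(iterations <= 256);
  } while (first++ != last);  // compares the old value, then wraps modulo 256
  return {iterations, first};
}
```

For first = 0 and last = 255 the body runs 256 times, and the final post-increment wraps the counter from 255 back to 0.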
* SLPVectorizer: Sink and enable CSE for ExtractElements. | Nadav Rotem | 2013-07-12 | 1 | -11/+25
    llvm-svn: 186145
* SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. | Nadav Rotem | 2013-07-12 | 1 | -25/+22
    Before we could vectorize PHINodes, scanning successors was a good way of finding candidates. Now we can vectorize the phinodes, which is simpler.
    llvm-svn: 186139
* Remove an argument that we don't use anymore. | Nadav Rotem | 2013-07-11 | 1 | -15/+12
    llvm-svn: 186116
* indvars: Improve LFTR by eliminating truncation when comparing against a constant. | Andrew Trick | 2013-07-11 | 1 | -4/+23
    Patch by Michele Scandale!
    Adds special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match the operand types, it converts the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order to have an equivalent comparison between the induction variable and the new constant.
    llvm-svn: 186107
* Don't use a potentially expensive shift if all we want is one set bit. | Benjamin Kramer | 2013-07-11 | 2 | -2/+2
    No functionality change.
    llvm-svn: 186095
* LoopVectorize: Vectorize all accesses in address space zero with unit stride | Arnold Schwaighofer | 2013-07-11 | 1 | -8/+16
    We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me).
    Fixes PR16592.
    llvm-svn: 186088
* TryToSimplifyUncondBranchFromEmptyBlock was checking that any common | Duncan Sands | 2013-07-11 | 1 | -23/+147
    predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where one or more of the incoming values are undef. The undef values are rewritten to match the non-undef value that flows from the other edge.
    Patch by Mark Lacey.
    llvm-svn: 186069
* Fix a warning. | Nadav Rotem | 2013-07-11 | 1 | -2/+1
    llvm-svn: 186064
* SLPVectorizer: refactor the code that places extracts. Place the code that decides where to put extracts in the build-tree phase. | Nadav Rotem | 2013-07-11 | 1 | -41/+131
    This allows us to take the cost of the extracts into account.
    llvm-svn: 186058
* Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. | Michael Gottesman | 2013-07-11 | 1 | -64/+85
    Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE *or* mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute.
    Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (*NOTE* B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since:
      1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments.
      2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A.
    With the changes introduced by this patch:
      1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early.
      2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments:
         i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm.
         ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. *NOTE* This excludes functions with escaping nocapture allocas.
      3. If we do not see any escaping allocas at all (whether captured or not):
         i. If we do not have usage of setjmp, mark all callsites with the tail marker.
         ii. If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function.
    Based off of a patch by Nick Lewycky.
    rdar://14324281.
    llvm-svn: 186057
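The three cases listed in this message can be read as a small decision table. The sketch below is one possible reading, not the pass itself: the struct, field, and function names are hypothetical, "returns twice" and "usage of setjmp" are modelled by a single flag, and items 3.i and 3.ii are treated as independent checks.

```cpp
#include <cassert>

// Hypothetical summary of what the pass has observed about one function.
struct FunctionFacts {
  bool escapingAllocaCaptured;   // an escaping alloca is passed as a capturing argument
  bool escapingAllocaNoCapture;  // an escaping alloca is passed as a nocapture argument
  bool returnsTwice;             // the function uses setjmp or similar
  bool hasDynamicAllocas;        // dynamic or variable sized allocas are present
};

// Hypothetical summary of what the pass is then allowed to do.
struct Decision {
  bool markCallsWithoutEscapingAllocas;  // tail marker only on calls with no escaping alloca args
  bool markAllCalls;                     // tail marker on every call site
  bool attemptTRE;                       // try tail recursion elimination
};

Decision decide(const FunctionFacts &f) {
  Decision d{false, false, false};
  if (f.escapingAllocaCaptured)  // case 1: do nothing and bail early
    return d;
  if (f.escapingAllocaNoCapture) {  // case 2: never TRE, maybe mark some calls
    d.markCallsWithoutEscapingAllocas = !f.returnsTwice;
    return d;
  }
  d.markAllCalls = !f.returnsTwice;     // case 3.i
  d.attemptTRE = !f.hasDynamicAllocas;  // case 3.ii
  return d;
}

// Usage example: a function whose only escaping allocas are nocapture
// arguments gets partial tail marking and no TRE.
inline void usageExample() {
  const Decision d = decide({false, true, false, false});
  assert(d.markCallsWithoutEscapingAllocas && !d.markAllCalls && !d.attemptTRE);
}
```

Keeping the facts and the decision in plain structs makes the early-bail ordering of the three cases easy to see and to test in isolation.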
* [objc-arc] Changed 'mode: c++' => 'C++' at Nick Lewycky's suggestion. Also removed unnecessary mode: c++ lines from .cpp files. | Michael Gottesman | 2013-07-10 | 7 | -7/+7
    llvm-svn: 186026
* Implement categories for special case lists. | Peter Collingbourne | 2013-07-09 | 2 | -27/+93
    A special case list can now specify categories for specific globals, which can be used to instruct an instrumentation pass to treat certain functions or global variables in a specific way, such as by omitting certain aspects of instrumentation while keeping others, or informing the instrumentation pass that a specific uninstrumentable function has certain semantics, thus allowing the pass to instrument callers according to those semantics.
    For example, AddressSanitizer now uses the "init" category instead of global-init prefixes for globals whose initializers should not be instrumented, but which in all other respects should be instrumented.
    The motivating use case is DataFlowSanitizer, which will have a number of different categories for uninstrumentable functions, such as "functional" which specifies that a function has pure functional semantics, or "discard" which indicates that a function's return value should not be labelled.
    Differential Revision: http://llvm-reviews.chandlerc.com/D1092
    llvm-svn: 185978
* Introduce a SpecialCaseList ctor which takes a MemoryBuffer to make | Peter Collingbourne | 2013-07-09 | 1 | -1/+9
    it more unit testable, and fix memory leak in the other ctor.
    Differential Revision: http://llvm-reviews.chandlerc.com/D1090
    llvm-svn: 185976
* Rename BlackList class to SpecialCaseList and move it to Transforms/Utils. | Peter Collingbourne | 2013-07-09 | 6 | -21/+21
    Differential Revision: http://llvm-reviews.chandlerc.com/D1089
    llvm-svn: 185975
* Fix PR16571, which is a bug in the code that checks that all of the types in the bundle are uniform. | Nadav Rotem | 2013-07-09 | 1 | -1/+3
    llvm-svn: 185970
* Set the default insert point to the first instruction, and not to end() | Nadav Rotem | 2013-07-09 | 1 | -1/+1
    llvm-svn: 185953
* InstCombine: Fix typo in comment for visitICmpInstWithInstAndIntCst | David Majnemer | 2013-07-09 | 1 | -2/+2
    llvm-svn: 185916
* InstCombine: variations on 0xffffffff - x >= 4 | David Majnemer | 2013-07-09 | 1 | -0/+12
    The following transforms are valid if -C is a power of 2:
        (icmp ugt (xor X, C), ~C) -> (icmp ult X, C)
        (icmp ult (xor X, C), -C) -> (icmp uge X, C)
    These are nice, they get rid of the xor.
    llvm-svn: 185915
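Both rewrites can be checked exhaustively at a small bit width. The sketch below models 8-bit unsigned arithmetic by masking plain unsigned values; the 8-bit width and the helper names are choices made for this illustration and say nothing about how InstCombine implements the fold.

```cpp
#include <cassert>

constexpr unsigned Mask = 0xFFu;  // model 8-bit unsigned arithmetic

bool isPowerOf2(unsigned v) { return v != 0 && (v & (v - 1)) == 0; }

// Returns true if, for every 8-bit C whose negation is a power of 2 and
// every 8-bit X, both rewritten comparisons agree with the originals.
bool xorFoldsHold() {
  for (unsigned C = 0; C <= Mask; ++C) {
    const unsigned negC = (0u - C) & Mask;
    if (!isPowerOf2(negC)) continue;  // precondition from the commit message
    const unsigned notC = ~C & Mask;
    for (unsigned X = 0; X <= Mask; ++X) {
      const unsigned xored = X ^ C;
      if ((xored > notC) != (X < C)) return false;   // ugt form -> ult
      if ((xored < negC) != (X >= C)) return false;  // ult form -> uge
    }
  }
  return true;
}

// Usage example: the check passes for all eight qualifying values of C.
inline void usageExample() { assert(xorFoldsHold()); }
```

The qualifying constants are those with a run of leading ones followed by zeros (0xFF, 0xFE, 0xFC, ..., 0x80), which is why the xor only flips the high bits that the comparison then tests.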
* InstCombine: X & -C != -C -> X <= u ~C | David Majnemer | 2013-07-09 | 1 | -0/+9
    Tests were added in r185910 somehow.
    llvm-svn: 185912
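The log entry does not state a precondition for this fold. The sketch below assumes that C must be a power of 2 (an assumption, not taken from the log) and checks the equivalence exhaustively for 8-bit values; the function name is hypothetical.

```cpp
#include <cassert>

constexpr unsigned Mask = 0xFFu;  // model 8-bit unsigned arithmetic

// Returns true if (X & -C) != -C agrees with X <=u ~C for every 8-bit X.
bool maskFoldHoldsFor(unsigned C) {
  const unsigned negC = (0u - C) & Mask;
  const unsigned notC = ~C & Mask;
  for (unsigned X = 0; X <= Mask; ++X) {
    if (((X & negC) != negC) != (X <= notC)) return false;
  }
  return true;
}

// Usage example: the fold holds for a power of 2 and breaks for C = 3,
// where X = 0xFE satisfies the masked test but not X <=u 0xFC.
inline void usageExample() {
  assert(maskFoldHoldsFor(16));
  assert(!maskFoldHoldsFor(3));
}
```

Under that assumption -C is a block of leading ones, so the masked comparison fails exactly when X lies below -C, and ~C equals -C minus one.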
* Commit r185909 was a misapplied patch, fix it | David Majnemer | 2013-07-09 | 1 | -21/+13
    llvm-svn: 185910
* InstCombine: add more transforms | David Majnemer | 2013-07-09 | 1 | -0/+42
        C1-X <u C2 -> (X|(C2-1)) == C1
        C1-X >u C2 -> (X|C2) == C1
        X-C1 <u C2 -> (X & -C2) == C1
        X-C1 >u C2 -> (X & ~C2) == C1
    llvm-svn: 185909
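The message lists the rewrites without their side conditions. The sketch below covers only the two <u forms and checks them exhaustively for 8-bit values under assumed conditions: C2 is a power of 2, the low bits of C1 below C2 are all clear for the X-C1 form, and all set for the C1-X form. These conditions and the function name are assumptions made for the illustration, not quotes from the patch.

```cpp
#include <cassert>

constexpr unsigned Mask = 0xFFu;  // model 8-bit unsigned arithmetic

// Returns true if both <u rewrites agree with the original comparisons for
// every 8-bit X and every (C1, C2) pair that meets the assumed conditions.
bool subFoldsHold() {
  for (unsigned C2 = 1; C2 <= Mask; C2 <<= 1) {  // powers of 2: 1, 2, ..., 128
    const unsigned lowMask = C2 - 1;
    const unsigned negC2 = (0u - C2) & Mask;
    for (unsigned C1 = 0; C1 <= Mask; ++C1) {
      const bool lowBitsClear = (C1 & lowMask) == 0;
      const bool lowBitsSet = (C1 & lowMask) == lowMask;
      for (unsigned X = 0; X <= Mask; ++X) {
        const unsigned xMinusC1 = (X - C1) & Mask;
        const unsigned c1MinusX = (C1 - X) & Mask;
        // X-C1 <u C2 -> (X & -C2) == C1
        if (lowBitsClear && ((xMinusC1 < C2) != ((X & negC2) == C1))) return false;
        // C1-X <u C2 -> (X | (C2-1)) == C1
        if (lowBitsSet && ((c1MinusX < C2) != ((X | lowMask) == C1))) return false;
      }
    }
  }
  return true;
}

// Usage example: the exhaustive check passes under the assumed conditions.
inline void usageExample() { assert(subFoldsHold()); }
```

In both forms the subtraction-and-compare tests whether X falls in one aligned block of C2 consecutive values, which is what the mask or the or-with-low-bits then expresses directly.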
* Fix comment | Eli Bendersky | 2013-07-08 | 1 | -3/+2
    llvm-svn: 185888
* This patch changes the saved IRBuilder insert point from BasicBlock::iterator to AssertingVH. | Nadav Rotem | 2013-07-08 | 1 | -1/+2
    Commit 185883 fixes a bug in the IRBuilder that should fix the ASan bot. AssertingVH can help in exposing some RAUW problems. Thanks Ben and Alexey!
    llvm-svn: 185886
* [objc-arc] Fix assertion in EraseInstruction so that noop on null calls when passed null do not trigger the assert. | Michael Gottesman | 2013-07-08 | 1 | -1/+3
    The specific case of interest is when objc_retainBlock is passed null.
    llvm-svn: 185885
* InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1 | David Majnemer | 2013-07-08 | 1 | -0/+8
    Back in r179493 we determined that two transforms collided with each other. The fix back then was to reorder the transforms so that the preferred transform would give it a try and then we would try the secondary transform. However, it was noted that the best approach would canonicalize one transform into the other, removing the collision and allowing us to optimize IR given to us in that form.
    llvm-svn: 185808
* Clear the builder insert point between tree-vectorization phases. | Nadav Rotem | 2013-07-07 | 1 | -0/+1
    llvm-svn: 185777
* SLPVectorizer: Implement DCE as part of vectorization. | Nadav Rotem | 2013-07-07 | 1 | -1011/+1041
    This is a complete re-write of the bottom-up vectorization class. Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization. There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design.
    In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree.
    llvm-svn: 185774
* [objc-arc] Remove the alias analysis part of r185764. | Michael Gottesman | 2013-07-07 | 1 | -8/+0
    Upon further reflection, the alias analysis part of r185764 is not a safe change.
    llvm-svn: 185770
* [objc-arc] Teach the ARC optimizer that objc_sync_enter/objc_sync_exit do not modify the ref count of an objc object and additionally are inert for modref purposes. | Michael Gottesman | 2013-07-07 | 2 | -0/+10
    llvm-svn: 185769
* [objc-arc] When we initialize ARCRuntimeEntryPoints, make sure we reset all references to entrypoint declarations as well. | Michael Gottesman | 2013-07-06 | 1 | -0/+9
    llvm-svn: 185764
* Reassociate: Remove unnecessary default operator=. | Benjamin Kramer | 2013-07-06 | 1 | -10/+0
    llvm-svn: 185757
* [objc-arc] Performed some small cleanups in ARCRuntimeEntryPoints and added an llvm_unreachable after the switch to quiet -Wreturn_type errors. | Michael Gottesman | 2013-07-06 | 1 | -3/+5
    llvm-svn: 185746
* [objc-arc] Renamed Module => TheModule in ARCRuntimeEntryPoints. Also did some small cleanups. | Michael Gottesman | 2013-07-06 | 1 | -17/+14
    This fixes an issue that came up due to -fpermissive on the bots.
    llvm-svn: 185744
* Removed trailing whitespace. | Michael Gottesman | 2013-07-06 | 1 | -2/+2
    llvm-svn: 185743
* [objc-arc] Updated ObjCARCContract to use ARCRuntimeEntryPoints. | Michael Gottesman | 2013-07-06 | 1 | -99/+11
    llvm-svn: 185742
* [objc-arc] Updated ObjCARCOpts to use ARCRuntimeEntryPoints. | Michael Gottesman | 2013-07-06 | 1 | -123/+22
    llvm-svn: 185741
* [objc-arc] Refactor runtime entrypoint declaration entrypoint creation. | Michael Gottesman | 2013-07-06 | 1 | -0/+178
    This is the first patch in a series of 3 patches which clean up how we create runtime function declarations in the ARC optimizer when they do not exist already in the IR.
    Currently we have a bunch of duplicated code in ObjCARCOpts, ObjCARCContract that does this. This patch refactors that code into a separate class called ARCRuntimeEntryPoints which lazily creates the declarations for said entrypoints.
    The next two patches will consist of the work of refactoring ObjCARCContract/ObjCARCOpts to use this new code.
    llvm-svn: 185740
* Fix annotation of unlink. Should fix builder. | Nick Lewycky | 2013-07-06 | 1 | -1/+1
    llvm-svn: 185738
* Extend 'readonly' and 'readnone' to work on function arguments as well as | Nick Lewycky | 2013-07-06 | 1 | -37/+364
    functions. Make the function attributes pass add it to known library functions and when it can deduce it.
    llvm-svn: 185735
* Use sys::fs::createTemporaryFile. | Rafael Espindola | 2013-07-05 | 1 | -2/+1
    llvm-svn: 185719