path: root/llvm/lib/Transforms
Commit message  Author  Age  Files  Lines
...
* Move helper classes into anonymous namespaces.  Benjamin Kramer  2013-06-29  1  -0/+6
    llvm-svn: 185262
* InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison  David Majnemer  2013-06-29  1  -2/+1
    Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like:
      %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0
      %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0
      %cmp.i = icmp ult i8* %gep.i, %gep2.i
      %cmp.i1 = icmp ult [1 x i8]* %a, %b
      %cmp = icmp ne i1 %cmp.i, %cmp.i1
      ret i1 %cmp
    into:
      %cmp.i = icmp slt [1 x i8]* %a, %b
      %cmp.i1 = icmp ult [1 x i8]* %a, %b
      %cmp = xor i1 %cmp.i, %cmp.i1
      ret i1 %cmp
    By preserving the original sign, we now get:
      ret i1 false
    This fixes PR16483.
    llvm-svn: 185259
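Editorial sketch: the commit above turns on the fact that the same two bit patterns can order differently under signed and unsigned comparison. The C++ below models this with 32-bit integers standing in for pointer values. The function names are hypothetical and are not part of InstCombine; the signed cast assumes two's-complement conversion (guaranteed from C++20, and the usual behaviour before that).

```cpp
#include <cstdint>

// Hypothetical stand-ins for `icmp ult` and `icmp slt` on 32-bit pointer bits.
bool icmpUlt(uint32_t a, uint32_t b) { return a < b; }
bool icmpSlt(uint32_t a, uint32_t b) {
  // Assumes two's-complement conversion of values above INT32_MAX.
  return static_cast<int32_t>(a) < static_cast<int32_t>(b);
}

// Both comparisons keep the unsigned predicate, so the result is always false.
bool cmpPreservingSign(uint32_t a, uint32_t b) {
  return icmpUlt(a, b) != icmpUlt(a, b);
}

// One comparison has been changed to a signed predicate, as in the bad output.
bool cmpWithFlippedSign(uint32_t a, uint32_t b) {
  return icmpSlt(a, b) != icmpUlt(a, b);
}
```

With a = 0x7fffffff and b = 0x80000000 the unsigned comparison holds but the signed one does not, so only the flipped form can return true.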
* InstCombine: Small whitespace cleanup in FoldGEPICmp  David Majnemer  2013-06-29  1  -1/+1
    llvm-svn: 185258
* InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms  David Majnemer  2013-06-29  1  -44/+77
    Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions.
    Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select.
    llvm-svn: 185257
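Editorial sketch: the arithmetic identity behind the rewrite described above, written as plain C++ rather than as an InstCombine transform. All function names are hypothetical. The second pair shows an asymmetric case, where one select operand is a constant and the other is a left shift; it assumes n < 32.

```cpp
#include <cstdint>

// Original shape: udiv whose denominator is a select of two powers of two.
uint32_t udivSelect(uint32_t x, bool c) { return x / (c ? 8u : 16u); }

// Rewritten shape: lshr whose shift amount is a select of the two log2 values.
uint32_t lshrSelect(uint32_t x, bool c) { return x >> (c ? 3u : 4u); }

// Asymmetric operands: a constant power of two and a left shift (n < 32).
uint32_t udivMixed(uint32_t x, bool c, uint32_t n) {
  return x / (c ? 8u : (uint32_t{1} << n));
}

// Each operand is simplified on its own, then selected as a shift amount.
uint32_t lshrMixed(uint32_t x, bool c, uint32_t n) {
  return x >> (c ? 3u : n);
}
```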
* We preserve the CFG and some of the analysis passes.  Nadav Rotem  2013-06-29  1  -0/+3
    llvm-svn: 185251
* Update docs.  Nadav Rotem  2013-06-29  1  -3/+2
    llvm-svn: 185250
* InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)  David Majnemer  2013-06-28  1  -2/+72
    We may, after other optimizations, find ourselves with IR that looks like:
      %shl = shl i32 1, %y
      %cmp = icmp ult i32 %shl, 32
    Instead, we should just compare the shift count:
      %cmp = icmp ult i32 %y, 5
    llvm-svn: 185242
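Editorial sketch: a brute-force check of the specific fold shown above (`ult` against 32), in plain C++. The function names are hypothetical; only shift counts below the bit width are checked, since larger shifts are not defined.

```cpp
#include <cstdint>

// Original comparison: icmp ult (shl 1, y), 32. Requires y < 32.
bool cmpShifted(uint32_t y) { return (uint32_t{1} << y) < 32u; }

// Folded comparison: icmp ult y, 5, since log2(32) == 5.
bool cmpShiftCount(uint32_t y) { return y < 5u; }

// Compares both forms for every defined shift count of a 32-bit value.
bool foldAgreesForAllShiftCounts() {
  for (uint32_t y = 0; y < 32; ++y)
    if (cmpShifted(y) != cmpShiftCount(y)) return false;
  return true;
}
```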
* SLP Vectorizer: Add support for trees with external users.  Nadav Rotem  2013-06-28  1  -24/+142
    To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated.
    llvm-svn: 185230
* LoopVectorizer: Refactor the code that checks if it is safe to predicate blocks.  Nadav Rotem  2013-06-28  1  -87/+30
    In this code we keep track of pointers that we are allowed to read from, if they are accessed by non-predicated blocks. We use this list to allow vectorization of conditional loads in predicated blocks because we know that these addresses don't segfault.
    llvm-svn: 185214
* Remove needless include (unistd.h) in DebugIR pass  Daniel Malea  2013-06-28  1  -2/+0
    - should unbreak Windows builds
    llvm-svn: 185198
* Add missing header for DebugIR  Daniel Malea  2013-06-28  1  -0/+99
    - missed svn add...
    llvm-svn: 185194
* Remove limitation on DebugIR that made it require existing debug metadata.  Daniel Malea  2013-06-28  1  -153/+463
    - Build debug metadata for 'bare' Modules using DIBuilder
    - DebugIR can be constructed to generate an IR file (to be seen by a debugger) or not in cases where the user already has an IR file on disk.
    llvm-svn: 185193
* LoopVectorize: Pull dyn_cast into setDebugLocFromInst  Arnold Schwaighofer  2013-06-28  1  -6/+5
    llvm-svn: 185168
* LoopVectorize: Use static function instead of DebugLocSetter class  Arnold Schwaighofer  2013-06-28  1  -52/+30
    I used the class to safely reset the state of the builder's debug location. I think I have caught all places where we need to set the debug location to a new one. Therefore, we can replace the class by a function that just sets the debug location.
    llvm-svn: 185165
* Debug Info: clean up usage of Verify.  Manman Ren  2013-06-28  4  -8/+28
    No functionality change.
    It should suffice to check the type of a debug info metadata, instead of calling Verify. For cases where we know the type of a DI metadata, use assert.
    Also update testing cases to make them conform to the format of DI classes.
    llvm-svn: 185135
* LoopVectorize: Preserve debug location info  Arnold Schwaighofer  2013-06-28  1  -1/+74
    radar://14169017
    llvm-svn: 185122
* Fix using arg_end() - arg_begin() instead of arg_size()  Matt Arsenault  2013-06-28  1  -3/+3
    llvm-svn: 185121
* Revert "Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.""  Michael Gottesman  2013-06-27  1  -4/+4
    This reverts commit r185099.
    Looks like both the ppc-64 and mips bots are still failing after I reverted this change.
    Since:
      1. The mips bot always performs a clean build,
      2. The ppc64-bot failed again after a clean build (I asked the ppc-64 maintainers to clean the bot which they did... Thanks Will!),
    I think it is safe to assume that this change was not the cause of the failures that said builders were seeing. Thus I am recommitting.
    llvm-svn: 185111
* Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float."  Michael Gottesman  2013-06-27  1  -4/+4
    This reverts commit r185095.
    This is causing a FileCheck failure on the 3dnow intrinsics on at least the mips/ppc bots but not on the x86 bots. Reverting while I figure out what is going on.
    llvm-svn: 185099
* LoopVectorize: Cache edge masks created during if-conversion  Arnold Schwaighofer  2013-06-27  1  -0/+15
    Otherwise, we end up with an exponential IR blowup.
    Fixes PR16472.
    llvm-svn: 185097
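Editorial sketch: a toy model of why caching edge masks avoids the exponential blowup mentioned above. It is not the LoopVectorize code. It assumes a block's mask is built from the masks of its incoming edges and that an edge mask needs its source block's mask, then counts how many edge masks are created for a chain of diamonds (head, left, right, next head). `MaskBuilder` and `masksForDiamondChain` are hypothetical names.

```cpp
#include <set>
#include <utility>
#include <vector>

// Toy model: counts edge masks created while computing block masks
// recursively from predecessor edges.
struct MaskBuilder {
  std::vector<std::vector<int>> preds;      // preds[b] = predecessors of b
  bool useCache = true;
  std::set<std::pair<int, int>> edgeCache;  // edges whose mask already exists
  long created = 0;

  void blockMask(int bb) {
    // A block's mask combines the masks of all of its incoming edges.
    for (int p : preds[bb]) edgeMask(p, bb);
  }

  void edgeMask(int src, int dst) {
    std::pair<int, int> key(src, dst);
    if (useCache && edgeCache.count(key)) return;  // reuse the cached mask
    blockMask(src);  // an edge mask is derived from the source block's mask
    ++created;
    edgeCache.insert(key);
  }
};

// Builds a chain of n diamonds and returns how many edge masks are created
// when the mask of the final join block is requested.
long masksForDiamondChain(int n, bool useCache) {
  MaskBuilder mb;
  mb.useCache = useCache;
  mb.preds.resize(3 * n + 1);
  for (int k = 0; k < n; ++k) {
    int head = 3 * k, left = head + 1, right = head + 2, next = head + 3;
    mb.preds[left] = {head};
    mb.preds[right] = {head};
    mb.preds[next] = {left, right};
  }
  mb.blockMask(3 * n);
  return mb.created;
}
```

In this model a chain of n diamonds creates 4 * (2^n - 1) masks without the cache and 4 * n with it.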
* [APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.  Michael Gottesman  2013-06-27  1  -4/+4
    The category which an APFloat belongs to should be dependent on the actual value that the APFloat has, not be arbitrarily passed in by the user. This will prevent inconsistency bugs where the category and the actual value in APFloat differ.
    I also fixed up all of the references to this constructor (which were only in LLVM).
    llvm-svn: 185095
* LoopVectorize: Use vectorized loop invariant gep index anchored in loop  Arnold Schwaighofer  2013-06-27  1  -8/+20
    Use vectorized instruction instead of original instruction anchored in the original loop.
    Fixes PR16452 and t2075.c of PR16455.
    llvm-svn: 185081
* LoopVectorize: Don't store a reversed value in the vectorized value map  Arnold Schwaighofer  2013-06-27  1  -1/+4
    When we store values for reversed induction stores we must not store the reversed value in the vectorized value map. Another instruction might use this value.
    This fixes 3 test cases of PR16455.
    llvm-svn: 185051
* Added support for the Builtin attribute.  Michael Gottesman  2013-06-27  1  -1/+1
    The Builtin attribute is an attribute that can be placed on a function call site to signal that even though a function is declared as being a builtin,
    rdar://problem/13727199
    llvm-svn: 185049
* No need to use a Set when a vector would do.  Nadav Rotem  2013-06-27  1  -3/+3
    llvm-svn: 185047
* SLP: When searching for vectorization opportunities scan the blocks in post-order because we grow chains upwards.  Nadav Rotem  2013-06-26  1  -2/+4
    llvm-svn: 185041
* SLP: Don't erase instructions during vectorization because it prevents the outer loops from iterating over the instructions.  Nadav Rotem  2013-06-26  1  -2/+0
    llvm-svn: 185040
* In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && !APFloat.isDenormal => APFloat.isNormal.  Michael Gottesman  2013-06-26  2  -5/+5
    llvm-svn: 185037
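Editorial sketch: the equivalence the commit above relies on, shown with the C++ standard library's floating-point classification rather than APFloat. Here `std::isfinite`, a comparison with zero, and `std::fpclassify` stand in for `isFiniteNonZero()` and `isDenormal`, and `std::isnormal` stands in for `isNormal`; the wrapper names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Spelled-out form: finite, non-zero, and not denormal (subnormal).
bool finiteNonZeroNotDenormal(double x) {
  return std::isfinite(x) && x != 0.0 && std::fpclassify(x) != FP_SUBNORMAL;
}

// Condensed form: a single "is normal" query.
bool normal(double x) { return std::isnormal(x); }

// True when both forms classify x the same way.
bool agreesOn(double x) { return finiteNonZeroNotDenormal(x) == normal(x); }
```

Values worth trying are an ordinary number, zero, `std::numeric_limits<double>::denorm_min()`, infinity, and NaN.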
* Revert "Debug Info: clean up usage of Verify." as it's breaking bots.  Eric Christopher  2013-06-26  2  -5/+5
    This reverts commit r185020
    llvm-svn: 185032
* Debug Info: clean up usage of Verify.  Manman Ren  2013-06-26  2  -5/+5
    No functionality change.
    It should suffice to check the type of a debug info metadata, instead of calling Verify.
    llvm-svn: 185020
* Erase all of the instructions that we RAUWed  Nadav Rotem  2013-06-26  1  -0/+9
    llvm-svn: 184969
* Do not add cse-ed instructions into the visited map because we don't want to consider them as a candidate for replacement of instructions to be visited.  Nadav Rotem  2013-06-26  1  -5/+9
    llvm-svn: 184966
* [asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size  Kostya Serebryany  2013-06-26  1  -1/+2
    llvm-svn: 184928
* [asan] add option -asan-keep-uninstrumented-functions  Kostya Serebryany  2013-06-26  1  -4/+47
    llvm-svn: 184927
* dbgs() << Instruction doesn't print a newline on the end any more. Update these debug statements to add a missing newline.  Nick Lewycky  2013-06-26  1  -5/+5
    Also canonicalize to '\n' instead of "\n"; the latter calls a function with a loop, the former does not.
    llvm-svn: 184897
* SLPVectorizer: support slp-vectorization of PHINodes between basic blocks  Nadav Rotem  2013-06-25  1  -1/+96
    llvm-svn: 184888
* Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.  Bob Wilson  2013-06-25  1  -15/+16
    When a 1-element vector alloca is promoted, a store instruction can often be rewritten without converting the value to a scalar and using an insertelement instruction to stuff it into the new alloca. This patch just adds a check to skip that conversion when it is unnecessary. This turns out to be really important for some ARM Neon operations where <1 x i64> is used to get around the fact that i64 is not a legal type.
    llvm-svn: 184870
* Fix a typo in the code that collected the costs recursively.  Nadav Rotem  2013-06-25  1  -1/+1
    llvm-svn: 184827
* Rename the variable to fix a warning. Thanks Andy Gibbs.  Nadav Rotem  2013-06-24  1  -2/+2
    llvm-svn: 184749
* Reapply 184685 after the SetVector iteration order fix.  Arnold Schwaighofer  2013-06-24  1  -232/+104
    This should hopefully have fixed the stage2/stage3 miscompare on the dragonegg testers.
    "LoopVectorize: Use the dependence test utility class
    We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance.
    We can now vectorize loops with simple constant dependence distances.
      for (i = 8; i < 256; ++i) {
        a[i] = a[i+4] * a[i+8];
      }
      for (i = 8; i < 256; ++i) {
        a[i] = a[i-4] * a[i-8];
      }
    We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now.
    Results on x86-64 are a wash.
    I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSVC loop kernel that speeds up.
    radar://13681598"
    llvm-svn: 184724
* LoopVectorize: Use SetVector for the access set  Arnold Schwaighofer  2013-06-24  1  -1/+2
    We are creating the runtime checks using this set so we need a deterministic iteration order.
    llvm-svn: 184723
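Editorial sketch: what "deterministic iteration order" buys in the commit above. A set keyed on pointers or hashes may iterate in an order that changes from run to run, so anything generated by walking it can change too. The hypothetical `OrderedSet` below pairs a vector with a hash set so that iteration always follows insertion order. It illustrates the idea only and is not `llvm::SetVector`.

```cpp
#include <unordered_set>
#include <vector>

// Minimal set that remembers insertion order (hypothetical helper).
class OrderedSet {
  std::vector<int> order;        // elements in first-insertion order
  std::unordered_set<int> seen;  // membership test only, never iterated

public:
  // Returns true if v was newly inserted, false if it was already present.
  bool insert(int v) {
    if (!seen.insert(v).second) return false;
    order.push_back(v);
    return true;
  }

  // Iteration goes over the vector, so the order is reproducible.
  const std::vector<int> &items() const { return order; }
};

// Inserts every value and returns the resulting iteration order.
std::vector<int> insertAll(const std::vector<int> &values) {
  OrderedSet s;
  for (int v : values) s.insert(v);
  return s.items();
}
```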
* Add a flag to defer vectorization into a phase after the inliner and its CGSCC pass manager.  Chandler Carruth  2013-06-24  1  -16/+66
    This should insulate the inlining decisions from the vectorization decisions, however it may have both compile time and code size problems so it is just an experimental option right now.
    Adding this based on a discussion with Arnold and it seems at least worth having this flag for us to both run some experiments to see if this strategy is workable. It may solve some of the regressions seen with the loop vectorizer.
    llvm-svn: 184698
* Revert "LoopVectorize: Use the dependence test utility class"  Arnold Schwaighofer  2013-06-24  1  -104/+232
    This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac.
    We are seeing a stage2 and stage3 miscompare on some dragonegg bots.
    llvm-svn: 184690
* LoopVectorize: Use the dependence test utility class  Arnold Schwaighofer  2013-06-24  1  -232/+104
    We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance.
    We can now vectorize loops with simple constant dependence distances.
      for (i = 8; i < 256; ++i) {
        a[i] = a[i+4] * a[i+8];
      }
      for (i = 8; i < 256; ++i) {
        a[i] = a[i-4] * a[i-8];
      }
    We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now.
    Results on x86-64 are a wash.
    I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSVC loop kernel that speeds up.
    radar://13681598
    llvm-svn: 184685
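Editorial sketch: why a constant dependence distance matters for the second loop in the commit message above (`a[i] = a[i-4] * a[i-8]`). The toy model below is not how the pass reasons. It mimics a vectorized loop by loading all lanes of a chunk before storing any of them, and compares the result with the scalar loop. The names `runScalar`, `runChunked`, and `makeInput` are hypothetical, and `uint32_t` is used so that multiplication overflow wraps instead of being undefined.

```cpp
#include <cstdint>
#include <vector>

// Scalar reference: a[i] = a[i-4] * a[i-8] for i in [8, 256).
std::vector<uint32_t> runScalar(std::vector<uint32_t> a) {
  for (int i = 8; i < 256; ++i) a[i] = a[i - 4] * a[i - 8];
  return a;
}

// Models a vector loop of width vf: every step loads vf lanes of both
// operands first and only then stores vf results.
// Assumes (256 - 8) is divisible by vf.
std::vector<uint32_t> runChunked(std::vector<uint32_t> a, int vf) {
  std::vector<uint32_t> lanes(vf);
  for (int i = 8; i < 256; i += vf) {
    for (int l = 0; l < vf; ++l) lanes[l] = a[i + l - 4] * a[i + l - 8];
    for (int l = 0; l < vf; ++l) a[i + l] = lanes[l];
  }
  return a;
}

// Input with distinct small values so that stale reads change the result.
std::vector<uint32_t> makeInput() {
  std::vector<uint32_t> a(256);
  for (int i = 0; i < 256; ++i) a[i] = static_cast<uint32_t>(i + 1);
  return a;
}
```

With a width of 2 or 4 the chunked run matches the scalar loop, because every lane reads elements that earlier chunks have already written. With a width of 8 a lane reads `a[i-4]` before the same chunk has stored it, and the results differ.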
* LoopVectorize: Add utility class for checking dependency among accesses  Arnold Schwaighofer  2013-06-24  1  -0/+379
    This class checks dependences by subtracting two Scalar Evolution access functions allowing us to catch very simple linear dependences.
    The checker assumes source order in determining whether vectorization is safe. We currently don't reorder accesses.
    Positive true dependencies need to be a multiple of VF otherwise we impede store-load forwarding.
    llvm-svn: 184684
* LoopVectorize: Add utility class for building sets of dependent accesses  Arnold Schwaighofer  2013-06-24  1  -0/+247
    Sets of dependent accesses are built by unioning sets based on underlying objects. This class will be used by the upcoming dependence checker.
    llvm-svn: 184683
* SLP Vectorizer: Add support for vectorizing parts of the tree.  Nadav Rotem  2013-06-24  1  -5/+25
    Until now we detected the vectorizable tree and evaluated the cost of the entire tree. With this patch we can decide to trim-out branches of the tree that are not profitable to vectorize.
    Also, increase the max depth from 6 to 12. In the worst possible case where all of the code is made of diamond-shaped graph this can bring the cost to 2**10, but diamonds are not very common.
    llvm-svn: 184681
* SLP Vectorizer: Fix a bug in the code that does CSE on the generated gather sequences.  Nadav Rotem  2013-06-23  1  -4/+10
    Make sure that we don't replace and RAUW two sequences if one does not dominate the other.
    llvm-svn: 184674
* SLP Vectorizer: Erase instructions outside the vectorizeTree method.  Nadav Rotem  2013-06-23  1  -3/+11
    The RAII builder location guard is saving a reference to instructions, so we can't erase instructions during vectorization.
    llvm-svn: 184671
* SLP Vectorizer: Implement a simple CSE optimization for the gather sequences.  Nadav Rotem  2013-06-23  1  -24/+45
    llvm-svn: 184660