path: root/llvm/lib/Transforms/Scalar
Commit message | Author | Age | Files | Lines
* Improved compile time: | Stepan Dyatkovskiy | 2012-01-11 | 1 | -38/+98
    1. Size heuristics changed. Now we calculate the number of unswitching branches only once per loop.
    2. Some checks were moved from UnswitchIfProfitable to processCurrentLoop, since their result does not change during a processCurrentLoop iteration. This allows us to skip some loops at an early stage.
    Extended statistics:
    - Added total number of instructions analyzed.
    llvm-svn: 147935
* Enable LSR IV Chains with sufficient heuristics. | Andrew Trick | 2012-01-10 | 1 | -5/+210
    These heuristics are sufficient for enabling IV chains by default. Performance analysis has been done for i386, x86_64, and thumbv7. The optimization is rarely important, but can significantly speed up certain cases by eliminating spill code within the loop. Unrolled loops are prime candidates for IV chains. In many cases, the final code could still be improved with more target specific optimization following LSR. The goal of this feature is for LSR to make the best choice of induction variables.
    Instruction selection may not completely take advantage of this feature yet. As a result, there could be cases of slight code size increase.
    Code size can be worse on x86 because it doesn't support postincrement addressing. In fact, when chains are formed, you may see redundant address plus stride addition in the addressing mode. GenerateIVChains tries to compensate for the common cases.
    On ARM, code size increase can be mitigated by using postincrement addressing, but downstream codegen currently misses some opportunities.
    llvm-svn: 147826
* Adding IV chain generation to LSR. | Andrew Trick | 2012-01-09 | 1 | -5/+228
    After collecting chains, check if any should be materialized. If so, hide the chained IV users from the LSR solver. LSR will only solve for the head of the chain. GenerateIVChains will then materialize the chained IV users by computing the IV relative to its previous value in the chain.
    In theory, chained IV users could be exposed to LSR's solver. This would be considerably complicated to implement and I'm not aware of a case where we need it. In practice it's more important to intelligently prune the search space of nontrivial loops before running the solver, otherwise the solver is often forced to prune the most optimal solutions. Hiding the chained users does this well, so that LSR is more likely to find the best IV for the chain as a whole.
    llvm-svn: 147801
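As a rough source-level analogy for the chained IV users described in this commit, the sketch below contrasts recomputing every address from the loop counter with computing each address relative to the previous one in the chain. It is not LSR's implementation: the function names, the three-access loop body, and the `stride` parameter are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration, not LLVM code: every access recomputes its
// address from the loop counter i.
long long sumIndependent(const int *base, std::size_t groups, std::size_t stride) {
  assert(stride > 0);
  long long sum = 0;
  for (std::size_t i = 0; i < groups; ++i) {
    sum += base[3 * stride * i];
    sum += base[3 * stride * i + stride];
    sum += base[3 * stride * i + 2 * stride];
  }
  return sum;
}

// Chained form: only the head of the chain (p) is carried across
// iterations; each later address is the previous address plus the stride.
// The caller must provide at least 3 * stride * groups elements.
long long sumChained(const int *base, std::size_t groups, std::size_t stride) {
  assert(stride > 0);
  long long sum = 0;
  const int *p = base;
  for (std::size_t i = 0; i < groups; ++i) {
    sum += *p;
    const int *q = p + stride;  // relative to the previous chain element
    sum += *q;
    const int *r = q + stride;  // relative to the previous chain element
    sum += *r;
    p = r + stride;             // head of the chain for the next iteration
  }
  return sum;
}
```

Both functions read the same elements, so they return the same sum for any input that holds at least `3 * stride * groups` elements.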
* Adding collection of IV chains to LSR. | Andrew Trick | 2012-01-09 | 1 | -0/+242
    This collects a set of IV uses within the loop whose values can be computed relative to each other in a sequence. Following checkins will make use of this information.
    llvm-svn: 147797
* "Minor LSR debugging stuff" | Andrew Trick | 2012-01-09 | 1 | -1/+4
    llvm-svn: 147785
* Enable redundant phi elimination after LSR. | Andrew Trick | 2012-01-07 | 1 | -1/+3
    This will be more important as we extend the LSR pass in ways that don't rely on the formula solver. In particular, we need it for constructing IV chains.
    llvm-svn: 147724
* LSR: Don't optimize loops if an outer loop has no preheader. | Andrew Trick | 2012-01-07 | 1 | -1/+8
    LoopSimplify may not run on some outer loops, e.g. because of indirect branches. SCEVExpander simply cannot handle outer loops with no preheaders.
    Fixes rdar://10655343 SCEVExpander segfault.
    llvm-svn: 147718
* LSR: run DeleteDeadPhis before replaceCongruentPhis. | Andrew Trick | 2012-01-07 | 1 | -19/+15
    llvm-svn: 147711
* Extended replaceCongruentPhis to handle mixed phi types. | Andrew Trick | 2012-01-07 | 1 | -2/+2
    llvm-svn: 147707
* Turn cos(-x) into cos(x). Patch by Alexander Malyshev! | Nick Lewycky | 2011-12-27 | 1 | -5/+27
    llvm-svn: 147291
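The rewrite in this commit relies on cosine being an even function, cos(-x) = cos(x), so the negation can be dropped. A minimal sketch of that identity follows; the helper names are hypothetical and this is not the optimizer's code. The comparison uses a small tolerance rather than assuming the C library returns bit-identical results for both calls.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, not LLVM code: the call before and after the rewrite.
double cosOfNegated(double x) { return std::cos(-x); }
double cosSimplified(double x) { return std::cos(x); }

// Returns true when a and b differ by at most tol.
bool sameWithin(double a, double b, double tol) {
  assert(tol >= 0.0);
  return std::fabs(a - b) <= tol;
}
```

For example, `sameWithin(cosOfNegated(1.25), cosSimplified(1.25), 1e-12)` is expected to hold.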
* Fix warning. | Rafael Espindola | 2011-12-26 | 1 | -1/+2
    llvm-svn: 147284
* Fix typo "infinte". | Nick Lewycky | 2011-12-23 | 1 | -1/+2
    llvm-svn: 147226
* Add the actual code for r147175. | Chad Rosier | 2011-12-22 | 1 | -11/+82
    llvm-svn: 147176
* Speculatively revert r146578 to determine if it is the cause of a number of | Chad Rosier | 2011-12-22 | 1 | -82/+11
    performance regressions (both execution-time and compile-time) on our nightly testers.
    Original commit message:
    Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics.
    llvm-svn: 147131
* Fix a copy+pasto. No testcase, because the symptoms of dereferencing | Dan Gohman | 2011-12-21 | 1 | -1/+1
    an invalid iterator aren't reproducible. rdar://10614085.
    llvm-svn: 147098
* Move Instruction::isSafeToSpeculativelyExecute out of VMCore and | Dan Gohman | 2011-12-14 | 2 | -2/+4
    into Analysis as a standalone function, since there's no need for it to be in VMCore. Also, update it to use isKnownNonZero and other goodies available in Analysis, making it more precise, enabling more aggressive optimization.
    llvm-svn: 146610
* Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics. | Stepan Dyatkovskiy | 2011-12-14 | 1 | -11/+82
    llvm-svn: 146578
* It turns out that clang does use pointer-to-function types to | Dan Gohman | 2011-12-14 | 1 | -2/+6
    point to ARC-managed pointers sometimes. This fixes rdar://10551239.
    llvm-svn: 146577
* Cleanup. Clarify LSRInstance public methods. | Andrew Trick | 2011-12-13 | 1 | -1/+1
    llvm-svn: 146459
* Indvars: guard against exponential behavior in isHighCostExpansion. | Andrew Trick | 2011-12-12 | 1 | -2/+7
    This should always be done as a matter of principle. I don't have a case that exposes the problem. I just noticed this recently while scanning the code and realized I meant to fix it long ago.
    llvm-svn: 146438
* Only replace fwrite with fputc, if the return value is unused. | Joerg Sonnenberger | 2011-12-12 | 1 | -1/+2
    llvm-svn: 146411
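The restriction in this commit follows from the two C functions returning different things: `fwrite` returns the number of items written, while `fputc` returns the character written (or `EOF`). A minimal sketch of a one-byte write through each call follows; the wrapper names are hypothetical and this is not the SimplifyLibCalls code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical wrapper: writes one byte with fwrite and returns the
// number of items written (1 on success).
std::size_t writeByteWithFwrite(char c, std::FILE *out) {
  assert(out != nullptr);
  return std::fwrite(&c, 1, 1, out);
}

// Hypothetical wrapper: writes one byte with fputc and returns the byte
// written, converted to unsigned char and then int, or EOF on failure.
int writeByteWithFputc(char c, std::FILE *out) {
  assert(out != nullptr);
  return std::fputc(c, out);
}
```

Both calls put the same byte on the stream, but a caller that inspects the result sees 1 from the first and the character code from the second, so the calls are interchangeable only when the result is ignored.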
* LLVMBuild: Remove trailing newline, which irked me. | Daniel Dunbar | 2011-12-12 | 1 | -1/+0
    llvm-svn: 146409
* When computing reverse-CFG reverse-post-order, skip backedges, as | Dan Gohman | 2011-12-12 | 1 | -38/+94
    detected in the forward-CFG DFS. This prevents the reverse-CFG from visiting blocks inside loops after blocks that dominate them in the case where loops have multiple exits. No testcase, because this fixes a bug which in practice only shows up in a full optimizer run, due to the use-list order.
    This fixes rdar://10422791 and others.
    llvm-svn: 146408
* Add a TODO comment. | Dan Gohman | 2011-12-12 | 1 | -0/+1
    llvm-svn: 146389
* Fix a copy+pasto in a comment. | Dan Gohman | 2011-12-12 | 1 | -1/+1
    llvm-svn: 146385
* Use getArgOperand instead of getOperand on a call. | Dan Gohman | 2011-12-12 | 1 | -1/+1
    llvm-svn: 146384
* Inline SetSeqToRelease into its only caller, since it's more clear that way. | Dan Gohman | 2011-12-12 | 1 | -11/+4
    llvm-svn: 146383
* Fix omitted break statements in a switch. | Dan Gohman | 2011-12-12 | 1 | -0/+2
    llvm-svn: 146380
* Switch llvm.cttz and llvm.ctlz to accept a second i1 parameter which | Chandler Carruth | 2011-12-12 | 1 | -1/+1
    indicates whether the intrinsic has a defined result for a first argument equal to zero. This will eventually allow these intrinsics to accurately model the semantics of GCC's __builtin_ctz and __builtin_clz and the X86 instructions (prior to AVX) which implement them.
    This patch merely sets the stage by extending the signature of these intrinsics and establishing auto-upgrade logic so that the old spelling still works both in IR and in bitcode. The upgrade logic preserves the existing (inefficient) semantics. This patch should not change any behavior. CodeGen isn't updated because it can use the existing semantics regardless of the flag's value.
    Note that this will be followed by API updates to Clang and DragonEgg.
    Reviewed by Nick Lewycky!
    llvm-svn: 146357
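The role of the second parameter can be modeled in plain C++ as below. This is a sketch under stated assumptions, not the intrinsic's implementation: the function name and the `zeroIsUndefined` flag are hypothetical, the flag's polarity is an assumption (the commit message only says the parameter indicates whether the zero case is defined), and returning the bit width (32) for a defined zero input follows the semantics documented for these intrinsics in later LLVM releases.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 32-bit count-trailing-zeros operation.
// zeroIsUndefined == false: a zero input has a defined result (the bit width).
// zeroIsUndefined == true: the caller promises x != 0, which is what lets a
// backend use an instruction whose result for zero is unspecified.
unsigned countTrailingZeros32(std::uint32_t x, bool zeroIsUndefined) {
  if (x == 0) {
    assert(!zeroIsUndefined && "zero passed although the caller promised non-zero");
    return 32;
  }
  unsigned n = 0;
  while ((x & 1u) == 0) {  // terminates because x has at least one set bit
    x >>= 1;
    ++n;
  }
  return n;
}
```

The defined-at-zero form needs the extra comparison against zero, which is presumably the inefficiency the commit message refers to; a caller that can prove the input is non-zero can ask for the cheaper form.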
* LSR: ignore strides in outer loops. | Andrew Trick | 2011-12-10 | 1 | -1/+2
    Since we're not rewriting IVs in other loops, there's not much reason to consider their stride when generating formulae. This should reduce the number of useless formulas considered by LSR.
    llvm-svn: 146302
* SplitBlockPredecessors uses ArrayRef instead of Data and Size. | Jakub Staszak | 2011-12-09 | 2 | -8/+4
    llvm-svn: 146277
* Add -unroll-runtime for unrolling loops with run-time trip counts. | Andrew Trick | 2011-12-09 | 1 | -7/+28
    Patch by Brendon Cahoon!
    This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test suite with -unroll-runtime enabled.
    This implementation works by using the existing loop unrolling code to unroll the loop by a power-of-two (default 8). It generates an if-then-else sequence of code prior to the loop to execute the extra iterations before entering the unrolled loop.
    llvm-svn: 146245
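The shape of the transformed loop can be sketched at source level as follows. This is an illustration under assumptions, not the pass's output: the function name is hypothetical, the unroll factor is 4 rather than the default of 8 to keep the example short, and the chain of `if` statements only approximates the if-then-else sequence the commit describes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: sums n elements where n is only known at run time.
long long sumRuntimeUnrolled(const int *data, std::size_t n) {
  assert(data != nullptr || n == 0);
  constexpr std::size_t kUnroll = 4;  // power of two, so n % kUnroll is a mask
  long long sum = 0;
  std::size_t i = 0;

  // Extra iterations run first, before the unrolled loop is entered.
  const std::size_t extra = n & (kUnroll - 1);
  if (extra >= 1) sum += data[i++];
  if (extra >= 2) sum += data[i++];
  if (extra >= 3) sum += data[i++];

  // The remaining trip count is now a multiple of kUnroll.
  for (; i < n; i += kUnroll) {
    sum += data[i];
    sum += data[i + 1];
    sum += data[i + 2];
    sum += data[i + 3];
  }
  return sum;
}
```

Because the factor is a power of two, the leftover count is a single bitwise AND of the trip count, and the unrolled body never needs a bounds check between its copies.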
* Fix infinite loop in DSE when deleting a free in a reachable loop that's also | Nick Lewycky | 2011-12-08 | 1 | -1/+1
    trivially infinite.
    llvm-svn: 146197
* Push StringRefs through the metadata interface. | Benjamin Kramer | 2011-12-06 | 1 | -1/+1
    llvm-svn: 145934
* LSR: prune undesirable formulae early. | Andrew Trick | 2011-12-06 | 1 | -46/+85
    It's always good to prune early, but formulae that are unsatisfactory in their own right need to be removed before running any other pruning heuristics. We easily avoid generating such formulae, but we need them as an intermediate basis for forming other good formulae.
    llvm-svn: 145906
* Update comment. | Chad Rosier | 2011-12-05 | 1 | -1/+1
    llvm-svn: 145866
* Make the MemCpyOptimizer a bit more aggressive. I can't think of a scenario | Chad Rosier | 2011-12-05 | 1 | -1/+1
    where this would be bad as the backend shouldn't have a problem inlining small memcpys. rdar://10510150
    llvm-svn: 145865
* Add support for vectors of pointers. | Nadav Rotem | 2011-12-05 | 2 | -0/+7
    llvm-svn: 145801
* Fixed dead store elimination bug where negative indices were incorrectly causing the optimisation to occur | Pete Cooper | 2011-12-03 | 1 | -1/+1
    Turns out long long + unsigned long long is unsigned. Doh!
    Fixes http://llvm.org/bugs/show_bug.cgi?id=11455
    llvm-svn: 145731
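The language rule behind this fix is that, under the usual arithmetic conversions, `long long + unsigned long long` is computed as `unsigned long long`, so a negative signed operand wraps to a huge value. The sketch below shows the effect on a range check; the function names and the particular check are hypothetical and are not the DSE code.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// The sum of a signed and an unsigned 64-bit operand has the unsigned type.
static_assert(std::is_same<decltype(0LL + 0ULL), unsigned long long>::value,
              "long long + unsigned long long is unsigned long long");

// Hypothetical buggy check: a negative offset is converted to a very large
// unsigned value before the addition, so the result looks positive.
bool endIsPositiveBuggy(long long offset, unsigned long long size) {
  return offset + size > 0;
}

// Hypothetical fix: convert the size to a signed type first. Signed overflow
// of the sum itself is not handled in this sketch.
bool endIsPositiveFixed(long long offset, unsigned long long size) {
  assert(size <= static_cast<unsigned long long>(
                     std::numeric_limits<long long>::max()));
  return offset + static_cast<long long>(size) > 0;
}
```

With `offset = -8` and `size = 4`, the buggy check reports a positive end while the fixed check correctly reports that the end is still negative.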
* Fix a few more places where TargetData/TargetLibraryInfo is not being passed. | Chad Rosier | 2011-12-02 | 2 | -2/+17
    Add FIXMEs to places that are non-trivial to fix.
    llvm-svn: 145661
* Last bit of TargetLibraryInfo propagation. Also fixed a case for TargetData | Chad Rosier | 2011-12-01 | 2 | -8/+32
    where it appeared beneficial to pass. More of rdar://10500969
    llvm-svn: 145630
* Propagate TargetLibraryInfo throughout ConstantFolding.cpp and | Chad Rosier | 2011-12-01 | 5 | -7/+28
    InstructionSimplify.cpp. Other fixups as needed. Part of rdar://10500969
    llvm-svn: 145559
* Make GlobalMerge honor the preferred alignment on globals without an explicitly specified alignment. | Eli Friedman | 2011-11-30 | 1 | -1/+1
    <rdar://problem/10497732>.
    llvm-svn: 145523
* Potential bug in RewriteLoopBodyWithConditionConstant: use iterator should not be changed inside the uses enumeration loop. | Stepan Dyatkovskiy | 2011-11-29 | 1 | -1/+5
    llvm-svn: 145432
* build/CMake: Finish removal of add_llvm_library_dependencies. | Daniel Dunbar | 2011-11-29 | 1 | -9/+0
    llvm-svn: 145420
* SCEV fix. In general, Add/Mul expressions should not inherit NSW/NUW. | Andrew Trick | 2011-11-29 | 1 | -2/+6
    This reverts r139450, fixes r139453, and adds much needed comments and a unit test.
    llvm-svn: 145367
* Remove the temporary flag -disable-unroll-scev and dead code. | Andrew Trick | 2011-11-28 | 1 | -19/+7
    SCEV should now be used for trip count analysis, not LoopInfo.
    llvm-svn: 145262
* Move code into anonymous namespaces. | Benjamin Kramer | 2011-11-26 | 2 | -19/+15
    llvm-svn: 145154
* Refactor code to use new attribute getters on CallSite for NoCapture and ByVal. | Nick Lewycky | 2011-11-20 | 2 | -4/+3
    Suggested in code review by Eli. That code in InstCombine looks kinda suspicious.
    llvm-svn: 145013
* Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom names for fwrite and fputs. | Eli Friedman | 2011-11-17 | 1 | -7/+13
    Fixes <rdar://problem/9815881>.
    llvm-svn: 144876