path: root/llvm/lib/Transforms/Scalar
Commit message | Author | Age | Files | Lines
* Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. | Dan Gohman | 2012-05-10 | 1 | -3/+19
  llvm-svn: 156558
* teach DSE and isInstructionTriviallyDead() about calloc | Nuno Lopes | 2012-05-10 | 1 | -3/+16
  llvm-svn: 156553
* Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. | Dan Gohman | 2012-05-09 | 1 | -1/+4
  llvm-svn: 156520
* Remove unused variable to get rid of warning. | Craig Topper | 2012-05-09 | 1 | -1/+1
  llvm-svn: 156466
* Miscellaneous accumulated cleanups. | Dan Gohman | 2012-05-08 | 1 | -104/+78
  llvm-svn: 156445
* Fix objc_storeStrong pattern matching to catch a potential use of the old value after the store but before it is released. | Dan Gohman | 2012-05-08 | 1 | -9/+29
  This fixes rdar:/11116986.
  llvm-svn: 156442
* Calling ReassociateExpression recursively is extremely dangerous since it will replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. | Duncan Sands | 2012-05-08 | 1 | -7/+7
  Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW.
  There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead.
  Fixes PR12169.
  llvm-svn: 156379
* Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. | Owen Anderson | 2012-05-07 | 1 | -4/+28
  This allows for greater CSE opportunities.
  llvm-svn: 156323
* Switch the select to branch transformation on by default. | Benjamin Kramer | 2012-05-06 | 1 | -3/+4
  The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know!
  llvm-svn: 156258
* CodeGenPrepare: Add a transform to turn selects into branches in some cases. | Benjamin Kramer | 2012-05-05 | 1 | -0/+84
  This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%:
      ucomisd (%rdi), %xmm0
      cmovbel %edx, %esi
  cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice.
  This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit as well; those are included in the "predictableSelectIsExpensive" flag.
  It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform.
  Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator.
  Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :)
  llvm-svn: 156234
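The heuristic described in this commit message can be sketched as a small predicate. This is only an illustration of the stated rules, not the code from the patch: `SelectInfo`, `TargetInfo`, and `shouldTurnSelectIntoBranch` are hypothetical names, and the only conditions modelled are the ones the message mentions (one use, a direct memory operand, not -Os, and the target's "predictableSelectIsExpensive" flag).

```cpp
// Hypothetical, simplified description of a select candidate.
struct SelectInfo {
  bool hasOneUse;        // the message: "cmovs that have one use"
  bool hasMemoryOperand; // the message: "a direct memory operand"
};

// Hypothetical stand-in for the target flag named in the message.
struct TargetInfo {
  bool predictableSelectIsExpensive;
};

// Returns true when the toy heuristic would prefer a branch over a cmov.
constexpr bool shouldTurnSelectIntoBranch(const SelectInfo &sel,
                                          const TargetInfo &target,
                                          bool optimizeForSize) {
  // cmovs usually save code size, so the transform is off in -Os mode.
  if (optimizeForSize)
    return false;
  // Targets that do not set the flag (e.g. in-order cores) are skipped.
  if (!target.predictableSelectIsExpensive)
    return false;
  // The "dumb conservative" rule: one use plus a direct memory operand.
  return sel.hasOneUse && sel.hasMemoryOperand;
}
```

Keeping the size and target checks first mirrors the order of the reasoning in the message: the cheap global opt-outs are decided before looking at the individual select.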
* Add 'landingpad' instructions to the list of instructions to ignore. | Bill Wendling | 2012-05-04 | 1 | -7/+9
  Also combine the code in the 'assert' statement.
  llvm-svn: 156155
* A pile of long over-due refactorings here. | Chandler Carruth | 2012-05-04 | 1 | -1/+1
  There are some very, *very* minor behavior changes with this, but nothing I have seen evidence of in the wild or expect to be meaningful. The real goal is unifying our logic and simplifying the interfaces. A summary of the changes follows:
  - Make 'callIsSmall' actually accept a callsite so it can handle intrinsics, and simplify callers appropriately.
  - Nuke a completely bogus declaration of 'callIsSmall' that was still lurking in InlineCost.h... No idea how this got missed.
  - Teach the 'isInstructionFree' about the various more intelligent 'free' heuristics that got added to the inline cost analysis during review and testing. This mostly surrounds int->ptr and ptr->int casts.
  - Switch most of the interesting parts of the inline cost analysis that were essentially computing 'is this instruction free?' to use the code metrics routine instead. This way we won't keep duplicating logic.
  All of this is motivated by the desire to allow other passes to compute a roughly equivalent 'cost' metric for a particular basic block as the inline cost analysis. Sadly, re-using the same analysis for both is really messy because only the actual inline cost analysis is ever going to go to the contortions required for simplification, SROA analysis, etc.
  llvm-svn: 156140
* Whitespace cleanup. | Bill Wendling | 2012-05-02 | 1 | -87/+80
  llvm-svn: 156034
* The value held in the vector may be RAUW'ed by some of the canonicalization methods. | Bill Wendling | 2012-05-02 | 1 | -2/+3
  Use a weak value handle to keep up with this. PR12245
  llvm-svn: 155984
* An instruction in a loop is not guaranteed to be executed just because the loop has no exit blocks. | Nick Lewycky | 2012-05-01 | 1 | -0/+5
  Fixes PR12706!
  llvm-svn: 155884
* Second attempt at PR12573: | Bill Wendling | 2012-04-30 | 1 | -2/+2
  Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is *sure* that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify.
  llvm-svn: 155817
* Remove hack from r154987. | Bill Wendling | 2012-04-30 | 1 | -11/+1
  The problem persists even with it, so it's not even a good hack.
  llvm-svn: 155813
* Make sure HoistInsertPosition finds a position that is dominated by all inputs. | Rafael Espindola | 2012-04-30 | 1 | -1/+1
  llvm-svn: 155809
* Change recurse depth limit to uint32 to fix warning. | David Blaikie | 2012-04-27 | 1 | -1/+1
  llvm-svn: 155727
* Miscellaneous accumulated cleanups. | Dan Gohman | 2012-04-27 | 1 | -71/+57
  llvm-svn: 155725
* Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks. | Mon P Wang | 2012-04-27 | 1 | -3/+12
  The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>.
  llvm-svn: 155722
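A depth-limited recursive query of the kind this commit describes might look like the following. It is a toy model under stated assumptions, not GVN's IsValueFullyAvailableInBlock: `Block` and `isFullyAvailable` are hypothetical, the graph is assumed acyclic, and giving up is modelled as the conservative answer "not available".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy block: whether it defines the value, plus predecessor ids.
struct Block {
  bool definesValue = false;
  std::vector<int> preds;
};

// Returns true if the value is available on every path into `block`.
// Assumes an acyclic graph; bails out conservatively past `maxDepth`.
bool isFullyAvailable(const std::vector<Block> &cfg, int block,
                      std::uint32_t depth, std::uint32_t maxDepth) {
  assert(block >= 0 && static_cast<std::size_t>(block) < cfg.size());
  // Early bailout: answering "no" is always safe and bounds the stack use.
  if (depth > maxDepth)
    return false;
  const Block &b = cfg[block];
  if (b.definesValue)
    return true;
  // An entry block that does not define the value cannot provide it.
  if (b.preds.empty())
    return false;
  for (int pred : b.preds)
    if (!isFullyAvailable(cfg, pred, depth + 1, maxDepth))
      return false;
  return true;
}
```

With a limit of 1000, as in the commit, ordinary functions are unaffected; only pathologically deep predecessor chains lose the optimization in exchange for a bounded recursion depth.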
* Break up getProfitableChainIncrement(). | Jakob Stoklund Olesen | 2012-04-26 | 1 | -39/+47
  The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations.
  No functional change intended.
  llvm-svn: 155676
* Turn IVChain into a struct. | Jakob Stoklund Olesen | 2012-04-26 | 1 | -19/+42
  No functional change intended.
  llvm-svn: 155675
* Teach the reassociate pass to fold chains of multiplies with repeated elements to minimize the number of multiplies required to compute the final result. | Chandler Carruth | 2012-04-26 | 1 | -10/+247
  This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs.
  Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up!
  Credit to Richard Smith for the core algorithm, and helping code the patch itself.
  llvm-svn: 155616
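To illustrate what a "binary exponentiation-style multiply chain" buys, here is a minimal sketch for the single-factor case x^n. It is the textbook left-to-right square-and-multiply method, not the heuristic from the patch (which handles products of several repeated factors); `PowResult` and `powBySquaring` are names made up for this example.

```cpp
#include <cstdint>

// Result value plus the number of multiplications spent computing it.
struct PowResult {
  std::uint64_t value;
  unsigned multiplies;
};

// Computes x^n by scanning the bits of n from the top: square once per
// remaining bit and multiply by x when that bit is set. Overflow wraps
// modulo 2^64, as usual for unsigned arithmetic.
constexpr PowResult powBySquaring(std::uint64_t x, unsigned n) {
  if (n == 0)
    return PowResult{1, 0};
  // Index of the highest set bit of n.
  unsigned bit = 0;
  for (unsigned m = n; m > 1; m >>= 1)
    ++bit;
  std::uint64_t result = x; // accounts for the leading 1 bit
  unsigned multiplies = 0;
  while (bit > 0) {
    --bit;
    result *= result;
    ++multiplies;
    if ((n >> bit) & 1u) {
      result *= x;
      ++multiplies;
    }
  }
  return PowResult{result, multiplies};
}
```

For n = 13 this spends 5 multiplications where a naive chain x*x*...*x needs 12, which is the kind of saving the commit is after.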
* Print IV chain numbers while collecting them. | Jakob Stoklund Olesen | 2012-04-25 | 1 | -4/+5
  llvm-svn: 155567
* Simplify the known retain count tracking; use a boolean state instead of a precise count. | Dan Gohman | 2012-04-25 | 1 | -41/+34
  Also, move RRInfo's Partial field into PtrState, now that it won't increase the size.
  llvm-svn: 155513
* Build custom predecessor and successor lists for each basic block. | Dan Gohman | 2012-04-24 | 1 | -115/+101
  These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently.
  llvm-svn: 155500
* Put this expensive check below the less expensive ones. | Bill Wendling | 2012-04-19 | 1 | -9/+9
  llvm-svn: 155166
* Avoid a bug in the path count computation, preventing an infinite loop repeatedly making the same change. | Dan Gohman | 2012-04-19 | 1 | -1/+1
  This is for rdar://11256239.
  llvm-svn: 155160
* Don't crash on code where the user put __attribute__((constructor)) on a function with arguments. | Dan Gohman | 2012-04-18 | 1 | -1/+5
  This fixes rdar://11265785.
  llvm-svn: 155073
* Use a heavy hammer to fix PR12573. | Bill Wendling | 2012-04-18 | 1 | -0/+9
  If the loop contains invoke instructions, whose unwind edge escapes the loop, then don't try to unswitch the loop. Doing so may cause the unwind edge to be split, which not only is non-trivial but doesn't preserve loop simplify information.
  Fixes PR12573
  llvm-svn: 154987
* loop-reduce: Add an early bailout to catch extremely large loops. | Andrew Trick | 2012-04-18 | 1 | -0/+17
  This introduces a threshold of 200 IV Users, which is very conservative but should be sufficient to avoid serious compile time sink or stack overflow. The llvm test-suite with LTO never exceeds 190 users per loop.
  The bug doesn't relate to a specific type of loop. Checking in an arbitrary giant loop as a unit test would be silly.
  Fixes rdar://11262507.
  llvm-svn: 154983
* fix pr12559: mark unavailable win32 math libcalls | Joe Groff | 2012-04-17 | 1 | -15/+10
  also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint
  llvm-svn: 154960
* Add some comments, and fix a few places that missed setting Changed. | Dan Gohman | 2012-04-13 | 1 | -2/+24
  llvm-svn: 154687
* Consider ObjC runtime calls objc_storeWeak and others which make a copy of their argument as "escape" points for objc_retainBlock optimization. | Dan Gohman | 2012-04-13 | 1 | -14/+29
  This fixes rdar://11229925.
  llvm-svn: 154682
* Use the new Use-aware dominates method to apply the objc runtime library return value optimization for phi uses. | Dan Gohman | 2012-04-13 | 1 | -8/+5
  Even when the phi itself is not dominated, the specific use may be dominated.
  llvm-svn: 154647
* Don't move objc_autorelease calls past autorelease pool boundaries when optimizing autorelease calls on phi nodes with null operands. | Dan Gohman | 2012-04-13 | 1 | -3/+43
  This fixes rdar://11207070.
  llvm-svn: 154642
* Typo. | Chad Rosier | 2012-04-11 | 1 | -1/+1
  llvm-svn: 154522
* Fix 12513: Loop unrolling breaks with indirect branches. | Andrew Trick | 2012-04-10 | 1 | -29/+12
  Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying.
  llvm-svn: 154386
* whitespace | Andrew Trick | 2012-04-10 | 1 | -140/+140
  llvm-svn: 154385
* Make GVN's propagateEquality non-recursive. No intended functionality change. | Duncan Sands | 2012-04-06 | 1 | -98/+105
  The modifications are a lot more trivial than they appear to be in the diff!
  llvm-svn: 154174
* Fix accidentally inverted logic from r152803, and make the testcase slightly less trivial. | Dan Gohman | 2012-04-05 | 1 | -1/+1
  This fixes rdar://11171718.
  llvm-svn: 154118
* Pass the right sign to TLI->isLegalICmpImmediate. | Jakob Stoklund Olesen | 2012-04-05 | 1 | -2/+11
  LSR can fold three addressing modes into its ICmpZero node:
      ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset
      ICmpZero -1*ScaleReg + Offset => ICmp ScaleReg, Offset
      ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg
  The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric.
  <rdar://problem/11184260>
  llvm-svn: 154079
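The sign issue in the first two folds can be made concrete with a small model. Everything here is hypothetical: `ICmpZeroForm`, `icmpImmediate`, `isLegalImmediateToy`, and `canFold` are illustrative names, and the legal range [-255, 4095] is an invented asymmetric range, not the real ARM rule or the TLI interface.

```cpp
#include <cstdint>

// The two folds from the message that involve an immediate offset.
enum class ICmpZeroForm {
  BasePlusOffset,    // BaseReg + Offset == 0      =>  ICmp BaseReg, -Offset
  NegScalePlusOffset // -1*ScaleReg + Offset == 0  =>  ICmp ScaleReg, Offset
};

// The immediate that actually ends up in the compare instruction.
// (Negating INT64_MIN would overflow; this sketch ignores that case.)
constexpr std::int64_t icmpImmediate(ICmpZeroForm form, std::int64_t offset) {
  return form == ICmpZeroForm::BasePlusOffset ? -offset : offset;
}

// Invented, deliberately asymmetric legality rule for illustration only.
constexpr bool isLegalImmediateToy(std::int64_t imm) {
  return imm >= -255 && imm <= 4095;
}

// Legality must be asked about the immediate with the sign it will have
// in the emitted compare, not about the raw Offset.
constexpr bool canFold(ICmpZeroForm form, std::int64_t offset) {
  return isLegalImmediateToy(icmpImmediate(form, offset));
}
```

With a symmetric legality rule the sign would not matter; with an asymmetric one, an Offset of 1000 is foldable in the second form but not in the first, which is why passing the wrong sign produces wrong answers.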
* LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when reducing unroll count, otherwise the reduced unroll count is not taking the "OptimizeForSize" attribute into account. | Hongbin Zheng | 2012-04-04 | 1 | -2/+2
  llvm-svn: 154007
* Fast fix for PR12343: | Stepan Dyatkovskiy | 2012-04-02 | 1 | -4/+29
  http://llvm.org/bugs/show_bug.cgi?id=12343
  There is no trivial way to split edges that originate from an indirect branch. It can be done with some tricks, but that needs further discussion, and it remains dangerous because indirect branches are difficult to control. This fix forbids unswitching in that case.
  llvm-svn: 153879
* Don't PRE compares. | Jakob Stoklund Olesen | 2012-03-29 | 1 | -1/+8
  CodeGenPrepare sinks compare instructions down to their uses to prevent live flags and predicate registers across basic blocks.
  PRE of a compare instruction prevents that, forcing the i1 compare result into a general purpose register. That is usually more expensive than the redundant compare PRE was trying to eliminate in the first place.
  llvm-svn: 153657
* Fix 80-column violation. | Chad Rosier | 2012-03-28 | 1 | -2/+2
  llvm-svn: 153556
* LSR ivchain bug fix: corner case with ConstantExpr. | Andrew Trick | 2012-03-26 | 1 | -2/+3
  Fixes PR11950.
  llvm-svn: 153463
* comment typo | Andrew Trick | 2012-03-26 | 1 | -1/+1
  llvm-svn: 153462
* LSR cleanup: potential bug caught by PVS-Studio. | Andrew Trick | 2012-03-26 | 1 | -2/+3
  Thanks Andrey.
  llvm-svn: 153451