summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
* Drop spurious handle in comment.Benjamin Kramer2013-09-221-1/+1
| | | | llvm-svn: 191172
* SROA: Handle casts involving vectors of pointers and integer scalars.Benjamin Kramer2013-09-211-11/+47
| | | | | | | | | | SROA wants to convert any types of equivalent widths but it's not possible to convert vectors of pointers to an integer scalar with a single cast. As a workaround we add a bitcast to the corresponding int ptr type first. This type of cast used to be an edge case but has become common with SLP vectorization. Fixes PR17271. llvm-svn: 191143
* Resurrect r191017 " GVN proceeds in the presence of dead code" plus a fix to ↵Shuxin Yang2013-09-201-6/+168
| | | | | | | | | PR17307 & 17308. The problem of r191017 is that when GVN fabricate a val-number for a dead instruction (in order to make following expr-PRE happy), it forget to fabricate a leader-table entry for it as well. llvm-svn: 191118
* Revert r191017, it results in segmentation faults in Qt.Joerg Sonnenberger2013-09-201-164/+6
| | | | llvm-svn: 191104
* GVN proceeds in the presence of dead code.Shuxin Yang2013-09-191-6/+164
| | | | | | | | | | | | | | | | | | | | | | This is how it ignores the dead code: 1) When a dead branch target, say block B, is identified, all the blocks dominated by B is dead as well. 2) The PHIs of those blocks in dominance-frontier(B) is updated such that the operands corresponding to dead predecessors are replaced by "UndefVal". Using lattice's jargon, the "UndefVal" is the "Top" in essence. Phi node like this "phi(v1 bb1, undef xx)" will be optimized into "v1" if v1 is constant, or v1 is an instruction which dominate this PHI node. 3) When analyzing the availability of a load L, all dead mem-ops which L depends on disguise as a load which evaluate exactly same value as L. 4) The dead mem-ops will be materialized as "UndefVal" during code motion. llvm-svn: 191017
* MemCpyOptimizer: Use max legal int size instead of pointer sizeMatt Arsenault2013-09-161-5/+8
| | | | | | | | | | | | If there are no legal integers, assume 1 byte. This makes more sense than using the pointer size as a guess for the maximum GPR width. It is conceivable to want to use some 64-bit pointers on a target where 64-bit integers aren't legal. llvm-svn: 190817
* Remove the long, long defunct IR block placement pass.Chandler Carruth2013-09-143-154/+0
| | | | | | | | | | | | | | | | | This pass was based on the previous (essentially unused) profiling infrastructure and the assumption that by ordering the basic blocks at the IR level in a particular way, the correct layout would happen in the end. This sometimes worked, and mostly didn't. It also was a really naive implementation of the classical paper that dates from when branch predictors were primarily directional and when loop structure wasn't commonly available. It also didn't factor into the equation non-fallthrough branches and other machine level details. Anyways, for all of these reasons and more, I wrote MachineBlockPlacement, which completely supercedes this pass. It both uses modern profile information infrastructure, and actually works. =] llvm-svn: 190748
* Add getUnrollingPreferences to TTIHal Finkel2013-09-111-7/+25
| | | | | | | | | Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 190542
* Teach loop-idiom about address space pointer sizesMatt Arsenault2013-09-111-12/+21
| | | | llvm-svn: 190491
* Add bracesMatt Arsenault2013-09-111-6/+9
| | | | llvm-svn: 190490
* Get rid of unused isPodLike definitions.Eli Friedman2013-09-111-10/+0
| | | | llvm-svn: 190461
* Fix mistake in r190442.Eli Friedman2013-09-101-0/+7
| | | | llvm-svn: 190446
* Remove unused functions.Eli Friedman2013-09-101-5/+0
| | | | llvm-svn: 190442
* Teach ScalarEvolution about pointer address spacesMatt Arsenault2013-09-101-1/+1
| | | | llvm-svn: 190425
* Use type helper functions.Matt Arsenault2013-09-061-2/+1
| | | | llvm-svn: 190113
* Teach CodeGenPrepare about address spacesMatt Arsenault2013-09-061-4/+2
| | | | llvm-svn: 190112
* Revert: r189565 - Add getUnrollingPreferences to TTIHal Finkel2013-08-291-17/+5
| | | | | | | | | | | | | | | Revert unintentional commit (of an unreviewed change). Original commit message: Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189566
* Add getUnrollingPreferences to TTIHal Finkel2013-08-291-5/+17
| | | | | | | | | Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189565
* Turn MipsOptimizeMathLibCalls into a target-independent scalar transformRichard Sandiford2013-08-233-0/+162
| | | | | | | | | | ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097
* Revert r187191, which broke opt -mem2reg on the testcases included in PR16867.Nick Lewycky2013-08-132-83/+14
| | | | | | | | | | | | However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146 when SROA started sending more things directly down the PromoteMemToReg path. In order to revert r187191, I also revert dependent revisions r187296, r187322 and r188146. Fixes PR16867. Does not add the testcases from that PR, but both of them should get added for both mem2reg and sroa when this revert gets unreverted. llvm-svn: 188327
* Reapply r188119 now that the bug it exposed is fixed.Peter Collingbourne2013-08-121-160/+5
| | | | llvm-svn: 188217
* Re-instate r187323 which fast-tracks promotable allocas as soon as theChandler Carruth2013-08-111-12/+81
| | | | | | | | | | | | | | | | | | | | | | | | | SROA-based analysis has enough information. This should work now that both mem2reg *and* the SSAUpdater-based AllocaPromoter have been updated to be able to promote the types of allocas that the SROA analysis detects. I've included tests for the AllocaPromoter that were only possible to write once we fast-tracked promotable allocas without rewriting them. This includes a test both for r187347 and r188145. Original commit log for r187323: """ Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequneces I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. """ llvm-svn: 188146
* Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope withChandler Carruth2013-08-111-2/+23
| | | | | | | | | | | | | | | | | | | | the more general set of patterns that are now handled by mem2reg and that we can detect quickly while doing SROA's initial analysis. Notably, this allows it to promote through no-op bitcast and GEP sequences. A core part of the SSAUpdater approach is the ability to test whether a particular instruction is part of the set being promoted. Testing this becomes significantly more complex in the world where the operand to every load and store isn't the alloca itself. I ended up using the approach of walking up the def-chain until we find the alloca. I benchmarked this against keeping a set of pointer operands and keeping a set of the loads and stores we care about, and this one seemed faster although the difference was very small. No test case yet because currently the rewriting always "fixes" the inputs to not require this. The next patch which re-enables early promotion of easy cases in SROA will include a test case that specifically exercises this aspect of the alloca promoter. llvm-svn: 188145
* Reformat some bits of AllocaPromoter and simplify the name and type ofChandler Carruth2013-08-111-21/+20
| | | | | | | | | our visiting datastructures in the AllocaPromoter/SSAUpdater path of SROA. Also shift the order if clears around to be more consistent. No functionality changed here, this is just a cleanup. llvm-svn: 188144
* Revert r188119 "Kill some duplicated code for removing unreachable BBs."Arnold Schwaighofer2013-08-101-5/+160
| | | | | | | | | | | | | | It is breaking builbots with libgmalloc enabled on Mac OS X. $ cd llvm ; mkdir release ; cd release $ ../configure --enable-optimized —prefix=$PWD/install $ make $ make check $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \ gmalloc_path=/usr/lib/libgmalloc.dylib \ ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll llvm-svn: 188142
* Kill some duplicated code for removing unreachable BBs.Peter Collingbourne2013-08-091-160/+5
| | | | | | | | | | | This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp to Utils/Local.cpp and uses it to replace the implementation of llvm::removeUnreachableBlocks, which appears to do a strict subset of what removeUnreachableBlocksFromFn does. Differential Revision: http://llvm-reviews.chandlerc.com/D1334 llvm-svn: 188119
* JumpThreading: Turn a select instruction into branching if it allows to ↵Benjamin Kramer2013-08-071-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | thread one half of the select. This is a common pattern coming out of simplifycfg generating gross code. a: ; preds = %entry %sel = select i1 %cmp1, double %add, double 0.000000e+00 br label %b b: %cond5 = phi double [ %sel, %a ], [ %sub, %entry ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end becomes a: br i1 %cmp1, label %b, label %if.then b: %cond5 = phi double [ %sub, %entry ], [ %add, %a ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end Skipping block b completely if possible. llvm-svn: 187880
* Adjust file to the coding standard.Jakub Staszak2013-08-061-53/+49
| | | | llvm-svn: 187808
* Factor FlattenCFG out from SimplifyCFGTom Stellard2013-08-064-53/+94
| | | | | | Patch by: Mei Ye llvm-svn: 187764
* Teach the AllocaPromoter which is wrapped around the SSAUpdaterChandler Carruth2013-07-291-15/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | infrastructure to do promotion without a domtree the same smarts about looking through GEPs, bitcasts, etc., that I just taught mem2reg about. This way, if SROA chooses to promote an alloca which still has some noisy instructions this code can cope with them. I've not used as principled of an approach here for two reasons: 1) This code doesn't really need it as we were already set up to zip through the instructions used by the alloca. 2) I view the code here as more of a hack, and hopefully a temporary one. The SSAUpdater path in SROA is a real sore point for me. It doesn't make a lot of architectural sense for many reasons: - We're likely to end up needing the domtree anyways in a subsequent pass, so why not compute it earlier and use it. - In the future we'll likely end up needing the domtree for parts of the inliner itself. - If we need to we could teach the inliner to preserve the domtree. Part of the re-work of the pass manager will allow this to be very powerful even in large SCCs with many functions. - Ultimately, computing a domtree has gotten significantly faster since the original SSAUpdater-using code went into ScalarRepl. We no longer use domfrontiers, and much of domtree is lazily done based on queries rather than eagerly. - At this point keeping the SSAUpdater-based promotion saves a total of 0.7% on a build of the 'opt' tool for me. That's not a lot of performance given the complexity! So I'm leaving this a bit ugly in the hope that eventually we just remove all of this nonsense. I can't even readily test this because this code isn't reachable except through SROA. When I re-instate the patch that fast-tracks allocas already suitable for promotion, I'll add a testcase there that failed before this change. Before that, SROA will fix any test case I give it. llvm-svn: 187347
* Temporarily revert r187323 until I update SSAUpdater to match mem2reg.Chandler Carruth2013-07-281-81/+12
| | | | | | I forgot that we had two totally independent things here. :: sigh :: llvm-svn: 187327
* Now that mem2reg understands how to cope with a slightly wider set ofChandler Carruth2013-07-281-12/+81
| | | | | | | | | | | | uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequneces I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. llvm-svn: 187323
* Thread DataLayout through the callers and into mem2reg. This will beChandler Carruth2013-07-282-2/+2
| | | | | | | useful in a subsequent patch, but causes an unfortunate amount of noise, so I pulled it out into a separate patch. llvm-svn: 187322
* Don't use all the #ifdefs to hide the stats counters and instead rely onChandler Carruth2013-07-271-18/+0
| | | | | | | | | their being optimized out in debug mode. Realistically, this just isn't going to be the slow part anyways. This also fixes unused variable warnings that are breaking LLD build bots. =/ I didn't see these at first, and kept losing track of the fact that they were broken. llvm-svn: 187297
* Reimplement isPotentiallyReachable to make nocapture deduction much stronger.Nick Lewycky2013-07-272-2/+2
| | | | | | | | | | Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵Tom Stellard2013-07-272-23/+61
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* TRE: Move class into anonymous namespace.Benjamin Kramer2013-07-241-4/+6
| | | | | | While there shrink a dangerously large SmallPtrSet. llvm-svn: 187050
* Fix a problem I introduced in r187029 where we would over-eagerlyChandler Carruth2013-07-241-3/+9
| | | | | | | | schedule an alloca for another iteration in SROA. This only showed up with a mixture of promotable and unpromotable selects and phis. Added a test case for this. llvm-svn: 187031
* Fix PR16687 where we were incorrectly promoting an alloca that hadChandler Carruth2013-07-241-12/+29
| | | | | | | | | | | | | | | | | | | | | | pending speculation for a phi node. The problem here is that we were using growth of the specluation set as an indicator of whether speculation would occur, and if the phi node is already in the set we don't see it grow. This is a symptom of the fact that this signal is a total hack. Unfortunately, I couldn't really come up with a non-hacky way of signaling that promotion remains valid *after* speculation occurs, such that we only speculate when all else looks good for promotion. In the end, I went with at least a much more explicit approach of doing the work of queuing inside the phi and select processing and setting a preposterously named flag to convey that we're in the special state of requiring speculating before promotion. Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing a testcase for this from a pretty giant, nasty assert in a big application. =] The testcase was excellent. llvm-svn: 187029
* Remove extraneous null statement. No functionality change!Nick Lewycky2013-07-221-1/+1
| | | | llvm-svn: 186893
* Use switch instead of if. No functionality change.Jakub Staszak2013-07-221-14/+17
| | | | llvm-svn: 186892
* OldPtr is llvm::Instruction. Remove unneeded cast<>.Jakub Staszak2013-07-221-1/+1
| | | | llvm-svn: 186880
* Change tabs to spaces.Jakub Staszak2013-07-221-2/+2
| | | | llvm-svn: 186877
* Fix spelling and grammarMatt Arsenault2013-07-221-12/+12
| | | | llvm-svn: 186858
* SROA: Microoptimization: Remove dead entries first, then sort.Benjamin Kramer2013-07-201-9/+4
| | | | | | While there replace an explicit struct with std::mem_fun. llvm-svn: 186761
* Cleanup the stats counters for the new implementation. These actuallyChandler Carruth2013-07-191-12/+36
| | | | | | count the right things and have the right names. llvm-svn: 186667
* Fix another assert failure very similar to PR16651's test case. ThisChandler Carruth2013-07-191-0/+2
| | | | | | | test case came from Benjamin and found the parallel bug in the vector promotion code. llvm-svn: 186666
* Try to move to a more reasonable set of naming conventions given the newChandler Carruth2013-07-191-322/+305
| | | | | | | | | | | | | | | | implementation of the SROA algorithm. We were using the term 'partition' in many places that no longer ever represented an actual partition, but rather just an arbitrary slice of an alloca. No functionality change intended here. Mostly just renaming of types, functions, variables, and rewording of comments. Several comments were rewritten to make a lot more sense in the new structure of things. The stats are still weird and not reflective of how this really works. I'll fix those up in a separate patch as it is a touch more semantic of a change... llvm-svn: 186659
* A long overdue cleanup in SROA to use 'DL' instead of 'TD' for theChandler Carruth2013-07-191-123/+123
| | | | | | DataLayout variables. llvm-svn: 186656
* Fix PR16651, an assert introduced in my recent re-work of the innards ofChandler Carruth2013-07-191-9/+3
| | | | | | | | | | | | | | SROA. The crux of the issue is that now we track uses of a partition of the alloca in two places: the iterators over the partitioning uses and the previously collected split uses vector. We weren't accounting for the fact that the split uses might invalidate integer widening in ways other than due to their width (in this case due to being volatile). Further reduced testcase added to the tests. llvm-svn: 186655
OpenPOWER on IntegriCloud