summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Consistently use LoopAnalysisManagerSean Silva2016-08-091-1/+1
| | | | | | | | | | | | | | | | | One exception here is LoopInfo which must forward-declare it (because the typedef is in LoopPassManager.h which depends on LoopInfo). Also, some includes for LoopPassManager.h were needed since that file provides the typedef. Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278079
* [LoopUnroll] Include hotness of region in opt remarkAdam Nemet2016-07-291-22/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on. The BFI and indirectly BPI used in this pass is computed lazily so no overhead should be observed unless -pass-remarks-with-hotness is used. This is how the patch affects the O3 pipeline: Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ... llvm-svn: 277203
* [PM] Port LoopUnroll.Sean Silva2016-07-191-0/+33
| | | | | | | | | We just set PreserveLCSSA to always true since we don't have an analogous method `mustPreserveAnalysisID(LCSSA)`. Also port LoopInfo verifier pass to test LoopUnrollPass. llvm-svn: 276063
* [LoopUnroll] Don't crash trying to unroll loop with EH pad exitDavid Majnemer2016-06-151-0/+5
| | | | | | | | | | | | | | We do not support splitting cleanuppad or catchswitches. This is problematic for passes which assume that a loop is in loop simplify form (the loop would have a dedicated exit block instead of sharing it). While it isn't great that we don't support this for cleanups, we still cannot make loop-simplify form an assertable precondition because indirectbr will also disable these sorts of CFG cleanups. This fixes PR28132. llvm-svn: 272739
* The patch set unroll disable pragma when unrollEvgeny Stupachenko2016-06-081-11/+11
| | | | | | | | | | | | | | | | | with user specified count has been applied. Summary: Previously SetLoopAlreadyUnrolled() set the disable pragma only if there was some loop metadata. Now it set the pragma in all cases. This helps to prevent multiple unroll when -unroll-count=N is given. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D20765 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 272195
* [LoopUnroll] Set correct thresholds for new recently enabled unrolling ↵Michael Zolotukhin2016-06-031-2/+2
| | | | | | | | | | | heuristic. In r270478, where I enabled the new heuristic I posted testing results, which I got when explicitly passed the thresholds values via CL options. However, setting the CL options init-values is not enough to change the default values of thresholds, so I'm changing them in another place now. llvm-svn: 271615
* The patch fixes r271071Evgeny Stupachenko2016-05-281-3/+4
| | | | | | | | | | | Summary: unused variables in Release mode: BasicBlock *Header unsigned OrigCount put under DEBUG From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271076
* The patch refactors unroll pass.Evgeny Stupachenko2016-05-271-201/+244
| | | | | | | | | | | | | | | | Summary: Unroll factor (Count) calculations moved to a new function. Early exits on pragma and "-unroll-count" defined factor added. New type of unrolling "Force" introduced (previously used implicitly). New unroll preference "AllowRemainder" introduced and set "true" by default. (should be set to false for architectures that suffers from it). Reviewers: hfinkel, mzolotukhin, zzheng Differential Revision: http://reviews.llvm.org/D19553 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271071
* Apply clang-tidy's misc-move-constructor-init throughout LLVM.Benjamin Kramer2016-05-271-2/+4
| | | | | | No functionality change intended, maybe a tiny performance improvement. llvm-svn: 270997
* [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.Michael Zolotukhin2016-05-261-20/+16
| | | | | | | | | Condition might be simplified to a Constant, but it doesn't have to be ConstantInt, so we should dyn_cast, instead of cast. This fixes PR27886. llvm-svn: 270924
* Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one ↵Michael Zolotukhin2016-05-241-3/+3
| | | | | | | | more time. This reverts commit r270577. llvm-svn: 270630
* Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling ↵Hans Wennborg2016-05-241-3/+3
| | | | | | | | analysis by default. Chromium builds are still hitting the assert in PR27874. llvm-svn: 270577
* Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by ↵Michael Zolotukhin2016-05-241-3/+3
| | | | | | | | | default."" This reverts commit r270512 and reapplies r270478. Originally it caused PR27847, but it was fixed in r270517. llvm-svn: 270518
* Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."Hans Wennborg2016-05-231-3/+3
| | | | | | This caused PR27847. llvm-svn: 270512
* [LoopUnroll] Enable advanced unrolling analysis by default.Michael Zolotukhin2016-05-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch turns on LoopUnrollAnalyzer by default. To mitigate compile time regressions, I chose very conservative thresholds for now. Later we can make them more aggressive, but it might require being smarter in which loops we're optimizing. E.g. currently the biggest issue is that with more agressive thresholds we unroll many cold loops, which increases compile time for no performance benefit (performance of those loops is improved, but it doesn't matter since they are cold). Test results for compile time(using 4 samples to reduce noise): ``` MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19% SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19% MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39% MultiSource/Applications/JM/lencod/lencod 1.47% MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06% ``` I didn't see any performance changes in the testsuite, but it improves some internal tests. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20482 llvm-svn: 270478
* [LoopUnrollAnalyzer] Take into account cost of instructions controlling ↵Michael Zolotukhin2016-05-181-0/+1
| | | | | | | | | branches, along with their operands. Previously, we didn't add their and their operands cost, which could've resulted in unrolling loops for no actual benefit. llvm-svn: 269985
* Revert "Revert "[Unroll] Implement a conservative and monotonically ↵Michael Zolotukhin2016-05-131-13/+178
| | | | | | | | | | increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486
* Revert "[Unroll] Implement a conservative and monotonically increasing cost ↵Michael Zolotukhin2016-05-131-178/+13
| | | | | | | | | | | tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395
* [Unroll] Implement a conservative and monotonically increasing cost tracking ↵Michael Zolotukhin2016-05-131-13/+178
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is *much* greater working top-down. We can forward propagate very effectively, even across strange and interesting recurrances around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we *know* it is needed. Finally, a core property of this approach is its *monotonicity*. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a techinque similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388
* Loop unroller: set thresholds for optsize and minsize functions to zeroHans Wennborg2016-05-101-2/+2
| | | | | | | | | | | | | | | Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs with ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124
* clang-format some files in preparation of coming patch reviews.Dehao Chen2016-05-051-30/+31
| | | | llvm-svn: 268583
* Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor2016-04-221-1/+1
| | | | | | | | | | support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
* Revert "Initial implementation of optimization bisect support."Vedant Kumar2016-04-221-1/+1
| | | | | | | | This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
* Initial implementation of optimization bisect support.Andrew Kaylor2016-04-211-1/+1
| | | | | | | | | | | | This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
* Loop Unroll: add options and tweak to make Partial unrolling more usefulFiona Glaser2016-04-061-3/+20
| | | | | | | | | | | | | | | | 1. Add FullUnrollMaxCount option that works like MaxCount, but also limits the unroll count for fully unrolled loops. So if a loop has an iteration count over this, it won't fully unroll. 2. Add CLI options for MaxCount and the new option, so they can be tested (plus a test). 3. Make partial unrolling obey MaxCount. An example use-case (the out of tree one this is originally designed for) is a target’s TTI can analyze a loop and decide on a max unroll count separate from the size threshold, e.g. based on register pressure, then constrain LoopUnroll to not exceed that, regardless of the size of the unrolled loop. llvm-svn: 265562
* LoopUnroll: only allow non-modulo Partial unrolling when Runtime=trueFiona Glaser2016-04-061-2/+4
| | | | | | Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558
* Enable unroll for constant bound loops when TripCount is not modulo of ↵Zia Ansari2016-04-041-0/+10
| | | | | | | | | | unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337
* Enable non-power-of-2 #pragma unroll counts.David L Kreitzer2016-03-251-5/+4
| | | | | | | | Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407
* [LoopUnroll] Respect the convergent attribute.Justin Lebar2016-03-141-4/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509
* fix variable name; NFCSanjay Patel2016-03-081-3/+3
| | | | llvm-svn: 262953
* use range-based loop; NFCISanjay Patel2016-03-081-3/+2
| | | | llvm-svn: 262952
* [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're ↵Michael Zolotukhin2016-02-261-1/+1
| | | | | | | | | | | | | | simulating. Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect. Reviewers: chandlerc, hfinkel Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17632 llvm-svn: 261958
* [LPM] Factor all of the loop analysis usage updates into a common helperChandler Carruth2016-02-191-21/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers is down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output. Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316
* Factor out UnrollAnalyzer to Analysis, and add unit tests for it.Michael Zolotukhin2016-02-081-239/+1
| | | | | | | | | | | | | | | | Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169
* LoopUnroll: Move the actual unrolling logic to a standalone function. NFCJustin Bogner2016-01-121-86/+95
| | | | | | | This is pure code motion - break the actual work out of runOnLoop into a reusable standalone function. llvm-svn: 257445
* LoopUnroll: Make canUnrollCompletely static - it doesn't use any state. NFCJustin Bogner2016-01-121-11/+5
| | | | llvm-svn: 257427
* LoopUnroll: Clean up the maze of initialization for unroll parameters. NFCJustin Bogner2016-01-121-199/+141
| | | | | | | | | | The layering of where the various loop unroll parameters are initialized and overridden here was very confusing, making it pretty difficult to tell just how the various sources interacted. Instead, we put all of the initialization logic together in a single function so that it's obvious what overrides what. llvm-svn: 257426
* LoopUnroll: Use the optsize threshold for minsize as wellJustin Bogner2016-01-111-4/+1
| | | | | | | | | | | | Currently we're unrolling loops more in minsize than in optsize, which means -Oz will have a larger code size than -Os. That doesn't make any sense. This resolves the FIXME about this in LoopUnrollPass and extends the optsize test to make sure we use the smaller threshold for minsize as well. llvm-svn: 257402
* LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFCJustin Bogner2015-12-161-3/+3
| | | | | | | | | | | As of r255720, the loop pass manager will DTRT when passes update the loop info for removed loops, so they no longer need to reach into LPPassManager APIs to do this kind of transformation. This change very nearly removes the need for the LPPassManager to even be passed into loop passes - the only remaining pass that uses the LPM argument is LoopUnswitch. llvm-svn: 255797
* LPM: Stop threading `Pass *` through all of the loop utility APIs. NFCJustin Bogner2015-12-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669
* [ScalarOpts] Remove dead code.Benjamin Kramer2015-10-151-11/+3
| | | | | | Does not touch debug dumpers. NFC. llvm-svn: 250417
* [Unroll] Do not crash trying to propagate a value to vector load.Michael Zolotukhin2015-09-221-0/+6
| | | | llvm-svn: 248333
* [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-221-1/+1
| | | | | | | | Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327
* [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-161-1/+1
| | | | | | | | We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable *is* a constant, not just initialized with it. llvm-svn: 247769
* Add GlobalsAA as preserved to a bunch of transformsJames Molloy2015-09-101-0/+2
| | | | | | GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263
* Make helper functions static. NFC.Benjamin Kramer2015-08-201-1/+1
| | | | llvm-svn: 245549
* [PM] Port ScalarEvolution to the new pass manager.Chandler Carruth2015-08-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never *actually* invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there *is* actually a key that we don't update with value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that *don't* trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in *just* such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193
* Add new llvm.loop.unroll.enable metadata.Mark Heffernan2015-08-101-20/+40
| | | | | | | | | | | | | This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466
* Fix some comment typos.Benjamin Kramer2015-08-081-2/+2
| | | | llvm-svn: 244402
* [Unroll] Switch to using 'int' cost types in preparation for a somewhatChandler Carruth2015-08-051-6/+6
| | | | | | more involved change to the cost computation pattern. llvm-svn: 244095
OpenPOWER on IntegriCloud