path: root/llvm/lib/Transforms/Scalar
Commit message / Author / Age / Files / Lines
...
* [PM] The order of evaluation of these analyses is actually significant,Chandler Carruth2016-03-111-5/+10
| | | | | | | | | | | | | | | | | | | | much to my horror, so use variables to fix it in place. This terrifies me. Both basic-aa and memdep will provide more precise information when the domtree and/or the loop info is available. Because of this, if your pass (like GVN) requires domtree, and then queries memdep or basic-aa, it will get more precise results. If it does this in the other order, it gets less precise results. All of the ideas I have for fixing this are, essentially, terrible. Here I've just caused us to stop having unspecified behavior as different implementations evaluate the order of these arguments differently. I'm actually rather glad that they do, or the fragility of memdep and basic-aa would have gone on unnoticed. I've left comments so we don't immediately break this again. This should fix bots whose host compilers evaluate the order of arguments differently from Clang. llvm-svn: 263231
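The ordering hazard described above can be reproduced outside LLVM. The following is a minimal sketch, assuming two hypothetical stand-in functions (`computeDomTree`, `computeMemDep`) that merely record when they ran; it is not the GVN code. C++ leaves the evaluation order of function arguments unspecified, so the `fragile` form may run the two calls in either order, while binding each result to a named variable pins the order.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for analyses: each call records when it ran.
static std::vector<std::string> Log;

int computeDomTree() { Log.push_back("domtree"); return 1; }
int computeMemDep()  { Log.push_back("memdep");  return 2; }

// Hypothetical consumer that needs both results.
int runPass(int DT, int MD) { return DT + MD; }

// Fragile: the order in which the two arguments are evaluated is
// unspecified, so different host compilers may run memdep first.
int fragile() { return runPass(computeDomTree(), computeMemDep()); }

// Pinned: each declaration is a full expression, so domtree is always
// computed before memdep regardless of the host compiler.
int pinned() {
  int DT = computeDomTree();
  int MD = computeMemDep();
  return runPass(DT, MD);
}
```

Calling `pinned()` always logs `domtree` before `memdep`; `fragile()` returns the same sum but gives no guarantee about the log order.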
* [PM] Make the AnalysisManager parameter to run methods a reference.Chandler Carruth2016-03-114-17/+17
| | | | | | | | | | | | This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, *many* parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219
* [PM] Port GVN to the new pass manager, wire it up, and teach a couple ofChandler Carruth2016-03-112-350/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | tests to run GVN in both modes. This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction. Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here. I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentation. Differential Revision: http://reviews.llvm.org/D18019 llvm-svn: 263208
* [LLE] Add missed LoopSimplify dependenceAdam Nemet2016-03-101-0/+3
| | | | | | | | | The code assumed that we always had a preheader without making the pass dependent on LoopSimplify. Thanks to Mattias Eriksson V for reporting this. llvm-svn: 263173
* [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights outChandler Carruth2016-03-101-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of, and I misdiagnosed for months and months. Andrea has had a patch for this forever, but I just couldn't see how it was fixing the root cause of the problem. It didn't make sense to me, even though the patch was perfectly good and the analysis of the actual failure event was *fantastic*. Well, I came back to it today because the patch has sat for *far* too long and needs attention and decided I wouldn't let it go until I really understood what was going on. After quite some time in the debugger, I finally realized that in fact I had just missed an important case with my previous attempt to fix PR22093 in r225149. Not only do we need to handle loads that won't be split, but stores-of-loads that we won't split. We *do* actually have enough logic in the presplitting to form new slices for split stores.... *unless* we decided not to split them! I'm so sorry that it took me this long to come to the realization that this is the issue. It seems so obvious in hindsight (of course). Anyways, the fix becomes *much* smaller and more focused. The fact that we're left doing integer smashing is related to the FIXME in my original commit: fundamentally, we're not aggressive about pre-splitting for loads and stores to the same alloca. If we want to get aggressive about this, it'll need both what Andrea had put into the proposed fix, but also a *lot* more logic to essentially iteratively pre-split the alloca until we can't do any more. As I said in that commit log, it's really unclear that this is the right call. Instead, the integer blending and letting targets lower this to narrower stores seems slightly better. But we definitely shouldn't really go down that path just to fix this bug. Again, tons of thanks are owed to Andrea and others at Sony for working on this bug. I really should have seen what was going on here and re-directed them sooner. =//// llvm-svn: 263121
* [SROA] Clean up some really weird code, no functionality changed.Chandler Carruth2016-03-101-3/+3
| | | | | | | | | | | We already have the instruction extracted into 'I', just cast that to a store the way we do for loads. Also, we don't enter the if unless SI is non-null, so don't test it again for null. I'm pretty sure the entire test there can be nuked, but this is just the trivial cleanup. llvm-svn: 263112
* [gvn] Fix more indenting and formatting in regions of code that willChandler Carruth2016-03-101-64/+62
| | | | | | | | | | | need to be changed for porting to the new pass manager. Also sink the comment on the ValueTable class back to that class instead of it dangling on an anonymous namespace. No functionality changed. llvm-svn: 263084
* [gvn] Reformat a chunk of the GVN code that is strangely indented priorChandler Carruth2016-03-101-241/+240
| | | | | | | | to restructuring it for porting to the new pass manager. No functionality changed. llvm-svn: 263083
* [PM] Port memdep to the new pass manager.Chandler Carruth2016-03-104-24/+25
| | | | | | | | | | | | | | | | | | | | | | | This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082
* Fix the buildPhilip Reames2016-03-091-0/+1
| | | | | | I screwed up rebasing 263072. This change fixes the build and passes all make check. llvm-svn: 263073
* [LICM] Store promotion when memory is thread localPhilip Reames2016-03-091-11/+56
| | | | | | | | | | | | This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provably thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us to perform store promotion even in cases where the store is not guaranteed to execute in the loop. Two key assumptions worth drawing out are that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped. In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem. Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072
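As a source-level illustration of the store promotion described above, here is a hand-written before/after pair. It is a sketch only: the function names are hypothetical, and the second function shows by hand the shape of code the transform aims for, not actual LICM output. The key point is that the promoted version stores to the heap cell unconditionally after the loop, which is only acceptable because the freshly allocated, never-escaping cell cannot be observed by another thread.

```cpp
#include <cstddef>
#include <memory>

// Before: the store to *Acc is conditional, so it is not guaranteed to
// execute on any given run of the loop.
int sumPositiveInLoop(const int *Data, std::size_t N) {
  std::unique_ptr<int> Acc(new int(0)); // fresh allocation, never escapes
  for (std::size_t I = 0; I < N; ++I)
    if (Data[I] > 0)
      *Acc += Data[I]; // load and store through memory each time
  return *Acc;
}

// After (written by hand): the memory location is kept in a local while
// the loop runs and written back once, unconditionally, after the loop.
int sumPositivePromoted(const int *Data, std::size_t N) {
  std::unique_ptr<int> Acc(new int(0)); // fresh allocation, never escapes
  int Tmp = *Acc;                       // single load before the loop
  for (std::size_t I = 0; I < N; ++I)
    if (Data[I] > 0)
      Tmp += Data[I];                   // register-only update
  *Acc = Tmp; // store happens even if the loop never stored anything
  return *Acc;
}
```

Both functions return the same value for any input; the difference is only in where the memory traffic happens.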
* [LLE] Add missing check for unit strideAdam Nemet2016-03-091-5/+13
| | | | | | | | I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two dimensional array. Fixes PR26885. llvm-svn: 263058
* [LoopDataPrefetch] Add stats and debug outputAdam Nemet2016-03-091-0/+9
| | | | llvm-svn: 262998
* Return StringRef instead of a naked char*; NFCSanjoy Das2016-03-091-2/+2
| | | | llvm-svn: 262989
* [IRCE] Reflow comments; NFCSanjoy Das2016-03-091-4/+2
| | | | llvm-svn: 262988
* fix variable name; NFCSanjay Patel2016-03-081-3/+3
| | | | llvm-svn: 262953
* use range-based loop; NFCISanjay Patel2016-03-081-3/+2
| | | | llvm-svn: 262952
* [LoopDataPrefetch] If prefetch distance is not set, skip passAdam Nemet2016-03-071-2/+5
| | | | | | | | | | | | This lets select sub-targets enable this pass. The patch implements the idea from the recent llvm-dev thread: http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925 The goal is to enable the LoopDataPrefetch pass for the Cyclone sub-target only within AArch64. Positive and negative tests will be included in an upcoming patch that enables selective prefetching of large-strided accesses on Cyclone. llvm-svn: 262844
* [LLE] Fix a commentAdam Nemet2016-02-291-3/+3
| | | | llvm-svn: 262270
* [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with PollyAdam Nemet2016-02-291-0/+5
| | | | | | | | | We can actually have dependences between accesses with different underlying types. Bail in this case. A test will follow shortly. llvm-svn: 262267
* [LICM] Teach LICM how to handle cases where the alias set tracker wasChandler Carruth2016-02-271-20/+32
| | | | | | | | | | | | | | | | | | | | merged into a loop that was subsequently unrolled (or otherwise nuked). In this case it can't merge in the ASTs for any remaining nested loops, it needs to re-add their instructions directly. The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before. This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequence. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form. llvm-svn: 262108
* [JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()Haicheng Wu2016-02-261-20/+35
| | | | | | This change tries to find more opportunities to thread over basic blocks. llvm-svn: 261981
* [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating.Michael Zolotukhin2016-02-261-1/+1
| | | | | | | | | | | | | | Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect. Reviewers: chandlerc, hfinkel Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17632 llvm-svn: 261958
* [LoopDataPrefetch] Make it testable with optAdam Nemet2016-02-221-0/+1
| | | | | | | | | | | | | | | Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17464 llvm-svn: 261578
* [RS4GC] "Constant fold" the rs4gc-split-vector-values flagPhilip Reames2016-02-221-156/+0
| | | | | | This flag was part of a migration to a new means of handling vectors-of-pointers which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so it is time to clean up. llvm-svn: 261569
* [RS4GC] Revert optimization attempt due to memory corruptionPhilip Reames2016-02-221-63/+3
| | | | | | | | This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575. As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer. llvm-svn: 261565
* Allow setting MaxRerollIterations above 16Elena Demikhovsky2016-02-221-5/+4
| | | | | | | | By Ayal Zaks. Differential Revision http://reviews.llvm.org/D17258 llvm-svn: 261517
* ADT: Remove == and != comparisons between ilist iterators and pointersDuncan P. N. Exon Smith2016-02-211-1/+1
| | | | | | | | | | | | | | I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* [LoopDeletion] Add an assert that verifies LCSSASanjoy Das2016-02-211-1/+3
| | | | | | | This is inspired by PR24804 -- had this assert been there before, isolating the root cause for PR24804 would have been far easier. llvm-svn: 261481
* [LPM] Factor all of the loop analysis usage updates into a common helperChandler Carruth2016-02-1911-171/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers are down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output. 
Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316
* Bug fix: use dyn_cast_or_null instead of dyn_castLawrence Hu2016-02-191-2/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D17154 llvm-svn: 261299
* Remove uses of builtin comma operator.Richard Trieu2016-02-181-2/+4
| | | | | | Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
* [PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFCAdam Nemet2016-02-182-0/+227
| | | | | | | | | | | This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass is still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265
* Minor code cleanup. NFC.Junmo Park2016-02-181-1/+1
| | | | llvm-svn: 261200
* [LIR] Avoid turning non-temporal stores into memsetHaicheng Wu2016-02-171-0/+4
| | | | | | This is to fix PR26645. llvm-svn: 261149
* Tweak the LICM code to reuse the first sub-loop instead of creating a new oneRoman Gareev2016-02-151-14/+32
| | | | | | | | | | | | | LICM starts with an *empty* AST, and then merges in each sub-loop. While the add code is appropriate for sub-loop 2 and up, it's utterly unnecessary for sub-loop 1. If the AST starts off empty, we can just clone/move the contents of the subloop into the containing AST. Reviewed-by: Philip Reames <listmail@philipreames.com> Differential Revision: http://reviews.llvm.org/D16753 llvm-svn: 260892
* [LIR] Allow merging of memsets in negatively strided loops.Chad Rosier2016-02-121-5/+7
| | | | | | Last part of PR25166. llvm-svn: 260732
* [LoopRotate] Don't perform loop rotation if the loop header calls a convergent function.Justin Lebar2016-02-121-0/+5
| | | | | | | | | | | | | | | | | Summary: Calls to convergent functions can be duplicated, but only if the duplicates are not control-flow dependent on any additional values. Loop rotation doesn't meet the bar. Reviewers: jingyue Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune Differential Revision: http://reviews.llvm.org/D17127 llvm-svn: 260729
* Remove unused variableDavid Majnemer2016-02-121-1/+0
| | | | llvm-svn: 260722
* [GVN] Common code for local and non-local load availability [NFCI]Philip Reames2016-02-121-248/+148
| | | | | | | | | | | | The attached patch removes all of the block local code for performing X-load forwarding by reusing the code used in the non-local case. The motivation here is to remove duplication and in the process increase our test coverage of some fairly tricky code. I have some upcoming changes I'll be proposing in this area and wanted to have the code cleaned up a bit first. Note: The review for this mostly happened in email which didn't make it to phabricator on the 258882 commit thread. Differential Revision: http://reviews.llvm.org/D16608 llvm-svn: 260711
* [LIR] Partially revert r252926(NFC), which introduced a very subtle change.Chad Rosier2016-02-121-8/+8
| | | | | | | | | | | In short, before r252926 we were comparing an unsigned (StoreSize) against an APInt (Stride), which is fine and well. After we were zero extending the Stride and then converting to an unsigned, which is not the same thing. Obviously, Strides can also be negative. This commit just restores the original behavior. AFAICT, it's not possible to write a test case to expose the issue because the code already has checks to make sure the StoreSize can't overflow an unsigned (which prevents the Stride from overflowing an unsigned as well). llvm-svn: 260706
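The arithmetic behind "not the same thing" can be shown with fixed-width integers. This is a sketch under stated assumptions: it uses plain `int32_t`/`int64_t` rather than APInt, and `matchesStride` is a hypothetical helper, not the LoopIdiomRecognize code. Zero-extending a negative stride yields a large positive value, so any later comparison against a small store size can no longer recognize a negative stride of matching magnitude.

```cpp
#include <cstdint>

// Zero extension: reinterpret the 32-bit pattern as unsigned, then widen.
// A negative input becomes a large positive value.
std::uint64_t zeroExtend(std::int32_t V) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(V));
}

// Sign extension: widen while preserving the numeric value.
std::int64_t signExtend(std::int32_t V) {
  return static_cast<std::int64_t>(V);
}

// Hypothetical check: does the store size equal the stride's magnitude,
// for either a forward or a backward (negative) stride?
bool matchesStride(unsigned StoreSize, std::int32_t Stride) {
  const std::int64_t Size = static_cast<std::int64_t>(StoreSize);
  const std::int64_t S = signExtend(Stride);
  return S == Size || -S == Size;
}
```

For a stride of -4, `signExtend` gives -4 while `zeroExtend` gives 4294967292, which is why the two conversions cannot be used interchangeably once negative strides are allowed.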
* Fix MSVC 2013 build after rL260504Tamas Berghammer2016-02-111-1/+1
| | | | llvm-svn: 260511
* Fixed typo in comment & coding style for LoopVersioningLICM.Ashutosh Nema2016-02-111-12/+11
| | | | llvm-svn: 260504
* Follow up to 260439, Speculative fix to clang buildersPhilip Reames2016-02-101-1/+4
| | | | | | It looks like clang has a couple of test cases which caught the fact LVI was now slightly more precise after 260439. When looking at the failures, it struck me as wasteful to be querying nullness of a constant via LVI, so instead of tweaking the clang tests, let's just stop querying constants from this source. llvm-svn: 260451
* StructurizeCFG: Initialize SkipUniformRegions in the default constructorTom Stellard2016-02-101-1/+1
| | | | | | This should fix some random bot failures caused by r260336. llvm-svn: 260342
* StructurizeCFG: Add an option for skipping regions with only uniform branchesTom Stellard2016-02-101-3/+38
| | | | | | | | | | | | | | Summary: Tests for this will be added once the AMDGPU backend enables this option. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16602 llvm-svn: 260336
* Factor out UnrollAnalyzer to Analysis, and add unit tests for it.Michael Zolotukhin2016-02-081-239/+1
| | | | | | | | | | | | | | Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows the use of unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169
* [JumpThreading] Change a return of ComputeValueKnownInPredecessors()Haicheng Wu2016-02-081-1/+1
| | | | | | | | | Change a return statement of ComputeValueKnownInPredecessors() to be the same as the rest of the return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110
* New Loop Versioning LICM PassAshutosh Nema2016-02-063-0/+622
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986
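The versioning scheme in the summary above can be pictured at source level. The sketch below is written by hand under stated assumptions: `noOverlap` and `scaleAll` are hypothetical names, the bounds check is a simplified address-range comparison, and nothing here is the pass's actual output. A runtime check guards a version of the loop that assumes no aliasing (so the load of `*Scale` can be hoisted), and the original loop is kept as the fallback.

```cpp
#include <cstddef>
#include <cstdint>

// Runtime check: true when the N ints at Dst and the single int at Scale
// occupy disjoint address ranges.
bool noOverlap(const int *Dst, std::size_t N, const int *Scale) {
  const auto D = reinterpret_cast<std::uintptr_t>(Dst);
  const auto S = reinterpret_cast<std::uintptr_t>(Scale);
  return S + sizeof(int) <= D || D + N * sizeof(int) <= S;
}

void scaleAll(int *Dst, std::size_t N, const int *Scale) {
  if (noOverlap(Dst, N, Scale)) {
    // Aggressive version: no store in the loop can change *Scale, so the
    // load is loop invariant and is hoisted out.
    const int K = *Scale;
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] *= K;
  } else {
    // Original version: *Scale may be modified by a store to Dst[I], so it
    // is reloaded on every iteration.
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] *= *Scale;
  }
}
```

When `Scale` points into `Dst`, the check fails and the conservative loop runs, so the result matches what the unversioned loop would have produced.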
* [RS4GC] Pass DenseMap by reference, NFCJoseph Tremoulet2016-02-051-5/+4
| | | | | | | | | | | | | | | Summary: Passing the rematerialized values map to insertRematerializationStores by value looks to be a simple oversight; update it to pass by reference. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16911 llvm-svn: 259867