path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
...
* [LoopUnswitch] Require DominatorTree info.Michael Zolotukhin2015-09-221-11/+7
| Summary:
| We should either require the DT info to be available, or check if it's available in every place we use DT (and we already miss such a check in one place, which causes failures in some cases). As other loop passes preserve DT and it's usually available, it makes sense to just require it here.
|
| There is no regression test, because the bug only shows up if the pass manager decides to clean DT info right before LoopUnswitch. If loop-unswitch is run separately, DT is available, so the bug isn't exposed.
|
| Reviewers: chandlerc, hfinkel
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D13036
| llvm-svn: 248230
* [LICM] Hoist calls to readonly argmemonly functions even with stores in the loopPhilip Reames2015-09-211-0/+11
| We know that an argmemonly function can only access memory pointed to by its pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we need only consider the aliasing of the pointer arguments.
|
| Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change.
|
| FYI, argmemonly disallows accessing memory through non-pointer typed arguments.
|
| Differential Revision: http://reviews.llvm.org/D12771
| llvm-svn: 248220
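The hoisting rule described in this commit can be sketched outside LLVM. The following is a toy model, not the LICM implementation: memory locations are plain integers, two locations alias only when they are equal, and `Call`, `Loc`, and `canHoistCall` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model (assumption): a memory location is an integer id, and two
// locations alias only when their ids are equal.
using Loc = int;

struct Call {
  bool readOnly;            // the callee never writes memory
  bool argMemOnly;          // the callee only touches memory behind its pointer args
  std::vector<Loc> ptrArgs; // locations its pointer arguments point to
};

// Hypothetical helper: may a loop-invariant call be hoisted out of a loop
// whose body stores to the given locations?
bool canHoistCall(const Call &C, const std::vector<Loc> &storesInLoop) {
  if (!C.readOnly)
    return false; // calls that write memory are out of scope for this sketch
  if (storesInLoop.empty())
    return true;  // nothing in the loop can change what the call reads
  if (!C.argMemOnly)
    return false; // plain readonly: any store might alias what it reads
  // readonly + argmemonly: only stores that alias a pointer argument matter.
  for (Loc Arg : C.ptrArgs)
    if (std::find(storesInLoop.begin(), storesInLoop.end(), Arg) !=
        storesInLoop.end())
      return false;
  return true;
}
```

In this model a readonly call is blocked by any store in the loop, while a readonly argmemonly call is blocked only by a store to one of its argument locations.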
* Fix UB: can't bind a reference to nullptr (NFC)Mehdi Amini2015-09-211-1/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 248213
* [IndVars] Use C++11 style field initialization; NFCI.Sanjoy Das2015-09-201-14/+7
| | | | llvm-svn: 248131
* [IndVars] Don't add a level of indentation for namespace {. NFC.Sanjoy Das2015-09-201-77/+77
| | | | | | Whitespace-only change. llvm-svn: 248130
* [IndVars] Don't repeat function names in comment; NFC.Sanjoy Das2015-09-201-65/+62
| | | | | | Only changes comments. llvm-svn: 248112
* [IndVars] Fix a bug in r248045.Sanjoy Das2015-09-201-14/+19
| | | | | | | | | | | | | | Because -indvars widens induction variables through arithmetic, `NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV` manages information for all transitive uses of an IV being widened, including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse` which manages information for a specific def-use edge in the transitive use list of an induction variable. This change also adds a test case that demonstrates the problem with r248045. llvm-svn: 248107
* [IndVars] Widen more comparisons for non-negative induction varsSanjoy Das2015-09-181-3/+26
| Summary:
| If an induction variable is provably non-negative, its sign extension is equal to its zero extension. This means narrow uses like
|
|   icmp slt iNarrow %indvar, %rhs
|
| can be widened into
|
|   icmp slt iWide zext(%indvar), sext(%rhs)
|
| Reviewers: atrick, mcrosier, hfinkel
| Subscribers: hfinkel, reames, llvm-commits
| Differential Revision: http://reviews.llvm.org/D12745
| llvm-svn: 248045
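The arithmetic fact behind this widening can be checked with ordinary fixed-width integers. This is a minimal sketch assuming a 32-bit narrow type and a 64-bit wide type; the helper names `zext32`, `sext32`, `narrowCompare`, and `widenedCompare` are invented here and are not part of IndVars.

```cpp
#include <cassert>
#include <cstdint>

// zext: reinterpret the 32-bit pattern as unsigned, then widen to 64 bits.
int64_t zext32(int32_t V) {
  return static_cast<int64_t>(static_cast<uint32_t>(V));
}

// sext: widen to 64 bits while preserving the sign.
int64_t sext32(int32_t V) { return static_cast<int64_t>(V); }

// The original narrow signed comparison.
bool narrowCompare(int32_t IndVar, int32_t Rhs) { return IndVar < Rhs; }

// The widened comparison; only valid when IndVar is known non-negative,
// because that is when zext32(IndVar) == sext32(IndVar).
bool widenedCompare(int32_t IndVar, int32_t Rhs) {
  assert(IndVar >= 0 && "widening relies on a non-negative induction variable");
  return zext32(IndVar) < sext32(Rhs);
}
```

For a negative value the two extensions differ (for example, `zext32(-1)` is 4294967295 while `sext32(-1)` is -1), which is why the non-negativity proof is required.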
* Clean up: Refactoring the hardcoded value of 6 for ↵Larisse Voufo2015-09-181-3/+4
| | | | | | FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886) llvm-svn: 248022
* gvn small fixPiotr Padlewski2015-09-171-3/+1
| | | | | | http://reviews.llvm.org/D12928 llvm-svn: 247935
* Test commit: Fixed a few typos in the comments.David L Kreitzer2015-09-161-6/+6
| | | | llvm-svn: 247793
* [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-161-1/+1
| We only checked that a global is initialized with constants, which is incorrect. We should be checking that the GlobalVariable *is* a constant, not just that it is initialized with one.
|
| llvm-svn: 247769
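The distinction this fix draws can be sketched with a small stand-in type. This is an illustrative model only: `Global` and `foldLoad` are hypothetical names, not the `UnrolledInstAnalyzer` code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a global variable (hypothetical type for illustration).
struct Global {
  bool isConstant;       // contents can never change at run time
  std::vector<int> init; // values of its constant initializer
};

// Tries to fold a load of G[Index] to a known value. Returns true and
// writes Out only when folding is safe.
bool foldLoad(const Global &G, std::size_t Index, int &Out) {
  // A constant initializer only describes the starting contents. Unless
  // the global itself is constant, a store may have changed the value.
  if (!G.isConstant)
    return false;
  if (Index >= G.init.size())
    return false;
  Out = G.init[Index];
  return true;
}
```

A mutable global with the same initializer is rejected, which is the case the original check let through.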
* [IndVars] Fix PR24783.Sanjoy Das2015-09-151-1/+2
| In `IndVarSimplify::ExpandSCEVIfNeeded`, `SCEVExpander::findExistingExpansion` may return an `llvm::Value` that differs in type from the SCEV it was asked to find an expansion for (but computes the same value). In such cases, we fall back on `expandCodeFor`, and rely on LLVM to CSE the two equivalent expressions (different only by a no-op cast) into a single computation.
|
| I tried a few other approaches to fixing PR24783, all of which turned out to be more complex than this current version:
|
| 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got problematic because currently we do not pass in the `Loop *` into `expandCodeFor`. Changing the interface to do this is a more invasive change, and really does not make much semantic sense unless the SCEV being passed in is an add recurrence. There is also the problem of `expandCodeFor` being used in places other than `indvars` -- there may be performance / correctness issues elsewhere if `expandCodeFor` is moved from always generating IR from scratch to a cache-like model.
|
| 2. Have `findExistingExpansion` only return expressions with the correct type. This would make `isHighCostExpansionHelper` and thus `isHighCostExpansion` more conservative than necessary.
|
| 3. Insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo`. This is complicated because `InsertNoopCastOfTo` depends on internal state of its `SCEVExpander` (specifically `Builder.GetInsertPoint()`), and this may not be set up when `ExpandSCEVIfNeeded` is called.
|
| 4. Manually insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo` via `CastInst::Create`. This is probably workable, but figuring out the location where the cast instruction needs to be inserted has enough edge cases (arguments, constants, invokes, LCSSA must be preserved) that what I have right now feels like the simplest solution.
|
| llvm-svn: 247749
* [IndVars] Rename variable; NFC.Sanjoy Das2015-09-151-2/+2
| | | | llvm-svn: 247748
* Revert "Clean up: Refactoring the hardcoded value of 6 for ↵Larisse Voufo2015-09-151-4/+3
| | | | | | FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886) llvm-svn: 247716
* [CorrelatedValuePropagation] Infer nonnull attributesIgor Laevsky2015-09-151-0/+31
| LazyValueInfo can prove that a value is nonnull based on the context information. Make use of this ability to infer nonnull attributes for the call arguments.
|
| Differential Revision: http://reviews.llvm.org/D12836
| llvm-svn: 247707
* [NaryReassociate] Add support for Mul instructionsMarcello Maggioni2015-09-151-24/+77
| | | | | | | | | This patch extends the current pass by handling Mul instructions as well. Patch by: Volkan Keles (vkeles@apple.com) llvm-svn: 247705
* [PlaceSafepoints] Make the width of a counted loop settable.Sanjoy Das2015-09-151-18/+11
| Summary:
| This change lets a `PlaceSafepoints` client change how wide the trip count of a loop has to be for the loop to be considered "counted", via `CountedLoopTripWidth`. It also removes the boolean `SkipCounted` flag and the `upperTripBound` constant -- we can get the old behavior of `SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^ 13 == 8192).
|
| Reviewers: reames
| Subscribers: llvm-commits, sanjoy
| Differential Revision: http://reviews.llvm.org/D12789
| llvm-svn: 247656
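The relationship between a bit width and the old numeric bound can be sketched as a range check. This is an assumption-laden illustration: `fitsInWidth` is a hypothetical helper, and treating "counted" as "the maximum trip count fits in an unsigned integer of the given width" is an assumed reading of the commit message, not a quote of the pass.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does MaxTripCount fit in an unsigned integer that
// is Width bits wide, i.e. is it strictly below 2^Width?
bool fitsInWidth(uint64_t MaxTripCount, unsigned Width) {
  assert(Width > 0 && Width < 64 && "sketch only handles widths 1..63");
  return MaxTripCount < (uint64_t{1} << Width);
}
```

With a width of 13 the cutoff is 2^13 = 8192, which matches the value the commit message gives for recovering the old behavior.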
* [PM] Port SROA to the new pass manager.Chandler Carruth2015-09-122-408/+343
| In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities.
|
| However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns:
|
| - A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing.
|
| - I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate.
|
| The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure.
|
| Differential Revision: http://reviews.llvm.org/D12773
| llvm-svn: 247501
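The second pattern above, splitting the run method so that an old-style wrapper can reuse the new class, might look roughly like the following. Everything here is a stand-in: `DomTree`, `Function`, `SplitPass`, and `LegacyWrapper` are hypothetical types invented for the sketch and do not reproduce the SROA code or the LLVM pass manager interfaces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for an analysis result and a unit of IR (hypothetical).
struct DomTree { int Id; };
struct Function { std::vector<std::string> Allocas; };

class SplitPass {
  // Shared worker: receives already-computed analyses and knows nothing
  // about which pass manager produced them.
  bool runImpl(Function &F, DomTree &DT) {
    (void)DT; // a real pass would consult the analysis here
    bool Changed = !F.Allocas.empty();
    F.Allocas.clear(); // pretend every alloca was rewritten
    return Changed;
  }
  friend class LegacyWrapper; // lets the old-style wrapper reach runImpl

public:
  // New-style entry point: analyses would come from an analysis manager.
  bool run(Function &F, DomTree &FromManager) {
    return runImpl(F, FromManager);
  }
};

// Old-style wrapper: fetches analyses its own way, then delegates.
class LegacyWrapper {
  SplitPass Impl;

public:
  bool runOnFunction(Function &F) {
    DomTree Legacy{0}; // the old pass manager would supply this
    return Impl.runImpl(F, Legacy);
  }
};
```

The only duplicated code is the small amount needed to obtain the analyses on each side of the boundary, which is the boilerplate the message says it accepts.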
* Clean up: Refactoring the hardcoded value of 6 for ↵Larisse Voufo2015-09-121-3/+4
| | | | | | FindAvailableLoadedValue()'s parameter MaxInstsToScan. llvm-svn: 247497
* Add GlobalsAA as preserved to a bunch of transformsJames Molloy2015-09-1014-0/+28
| | | | | | GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263
* [RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC]Philip Reames2015-09-101-8/+1
| | | | llvm-svn: 247223
* [RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC]Philip Reames2015-09-101-3/+3
| The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point iteration. That's not true. The tighter form of the assert is useful documentation.
|
| llvm-svn: 247221
* [RewriteStatepointsForGC] One last bit of naming [NFCI]Philip Reames2015-09-101-7/+7
| | | | llvm-svn: 247220
* [RewriteStatepointsForGC] Further style/naming fixup [NFCI]Philip Reames2015-09-101-26/+26
| | | | llvm-svn: 247217
* [RewriteStatepointsForGC] More naming cleanup [NFCI]Philip Reames2015-09-101-6/+6
| | | | llvm-svn: 247213
* [RewriteStatepointsForGC] Code cleanup [NFC]Philip Reames2015-09-091-25/+26
| | | | | | Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211
* [RewriteStatepointsForGC] Extend base pointer inference to handle insertelementPhilip Reames2015-09-091-58/+61
| This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done.
|
| Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time.
|
| Unlike previous extensions of the algorithm to handle more cases, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered *all* insert element instructions, so if we had a value directly based on an insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well.
|
| Differential Revision: http://reviews.llvm.org/D12583
| llvm-svn: 247210
* [RewriteStatepointsForGC] Make base pointer inference deterministicPhilip Reames2015-09-091-44/+35
| Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem.
|
| Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't matter provided it's deterministic for a given input.
|
| (Q: It is safe to rely on a deterministic order of operands right?)
|
| Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :)
|
| Differential Revision: http://reviews.llvm.org/D12640
| llvm-svn: 247208
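The idea of taking the processing order from the traversal itself, rather than from the iteration order of a pointer-keyed container, can be sketched as follows. This is a generic illustration with integer node ids; `Graph` and `visitOrder` are hypothetical names and this is not the RewriteStatepointsForGC code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Def/use graph: Operands[N] lists the operands of node N in operand order.
using Graph = std::vector<std::vector<int>>;

// Returns the nodes reachable from Root in first-visit order, using an
// iterative depth-first walk. The result depends only on the graph and
// its operand order, never on addresses or hashing.
std::vector<int> visitOrder(const Graph &Operands, int Root) {
  std::vector<int> Order;  // deterministic sequence handed to later steps
  std::set<int> Seen;      // membership only; never iterated for ordering
  std::vector<int> Worklist{Root};
  while (!Worklist.empty()) {
    int N = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(N).second)
      continue; // already visited
    Order.push_back(N);
    // Push operands in reverse so the first operand is visited first.
    for (auto It = Operands[N].rbegin(); It != Operands[N].rend(); ++It)
      Worklist.push_back(*It);
  }
  return Order;
}
```

Later steps that insert or optimize nodes can then walk `Order`, so two runs on the same input do the same work in the same sequence.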
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth2015-09-0915-49/+90
| with the new pass manager, and no longer relying on analysis groups.
|
| This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows:
|
| - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function.
|
| - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure.
|
| - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager.
|
| - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function.
|
| All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly.
|
| The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation.
|
| This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass.
|
| Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself.
|
| One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loopholes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state.
|
| Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller.
|
| Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first.
|
| Differential Revision: http://reviews.llvm.org/D12080
| llvm-svn: 247167
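The "type erased polymorphism and direct composition" idea behind the aggregation interface can be illustrated in a few lines. This is a simplified sketch under stated assumptions: `AAAggregate`, its `addResult` and `alias` methods, and the integer "pointers" are all hypothetical, and `std::function` stands in for whatever erasure mechanism the real `FunctionAAResults` uses.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

enum class AliasResult { NoAlias, MayAlias, MustAlias };

// Aggregates results from several unrelated analyses. Each analysis is
// stored behind a type-erased callable, so the analyses need no common
// base class.
class AAAggregate {
  std::vector<std::function<AliasResult(int, int)>> Results;

public:
  // Any callable taking two locations and returning AliasResult works.
  template <typename AA> void addResult(AA Analysis) {
    Results.emplace_back(std::move(Analysis));
  }

  // Walk a single query across every registered analysis; the first
  // definitive answer (anything other than MayAlias) wins.
  AliasResult alias(int A, int B) const {
    for (const auto &Query : Results) {
      AliasResult R = Query(A, B);
      if (R != AliasResult::MayAlias)
        return R;
    }
    return AliasResult::MayAlias; // nobody could say anything stronger
  }
};
```

Adding another analysis to the pipeline then amounts to one more `addResult` call, which mirrors the message's point that newly available results simply get folded into the aggregation.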
* Fix a typo I spotted when hacking on SROA. Somewhat alarming thatChandler Carruth2015-09-091-1/+1
| | | | | | nothing broke. llvm-svn: 247127
* [IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations.Sanjoy Das2015-09-091-2/+9
| | | | | | IRCE was just using INITIALIZE_PASS(), which is incorrect. llvm-svn: 247122
* [RewriteStatepointsForGC] Extract common code, comment, and fix a build ↵Philip Reames2015-09-031-55/+48
| | | | | | warning [NFC] llvm-svn: 246810
* [RewriteStatepointsForGC] Strengthen invariants around BDVsPhilip Reames2015-09-031-29/+65
| As a first step towards a new implementation of the base pointer inference algorithm, introduce an abstraction for BDVs, strengthen the assertions around them, and rewrite the BDV relation code in terms of the abstraction which includes an explicit notion of whether the BDV is also a base. The latter is motivated by the fact we had a bug where insertelement was always assumed to be a base pointer even though the BDV code knew it wasn't. The strengthened assertions in this patch would have caught that bug.
|
| The next step will be to separate the DefiningValueMap into a BDV use list cache (entirely within findBasePointers) and a base pointer cache. Having the former will allow me to use a deterministic visit order when visiting BDVs in the inference algorithm and remove a bunch of ordering related hacks.
|
| Before actually doing the last step, I'm likely going to extend the lattice with a 'BaseN' (seen only base inputs) state so that I can kill the post process optimization step.
|
| Phabricator Revision: http://reviews.llvm.org/D12608
| llvm-svn: 246809
* [RewriteStatepointsForGC] Workaround a lack of determinism in visit orderPhilip Reames2015-09-031-4/+9
| The visit order being used in the base pointer inference algorithm is currently non-deterministic. When working on http://reviews.llvm.org/D12583, I discovered that we were relying on a peephole optimization to get deterministic ordering in one of the test cases. This change is intended to let me test and land http://reviews.llvm.org/D12583. The current code will not be long lived.
|
| I'm starting to investigate a rewrite of the algorithm which will combine the post-process step into the initial algorithm and make the visit order deterministic. Before doing that, I wanted to make sure the existing code was complete and the tests were stable. Hopefully, patches should be up for review for the new algorithm this week or early next.
|
| llvm-svn: 246801
* [RewriteStatepointsForGC] Delete stale comment [NFC]Philip Reames2015-09-021-3/+0
| | | | llvm-svn: 246722
* [RewriteStatepointsForGC] Pull a function out of anon namespace [NFC]Philip Reames2015-09-021-1/+5
| | | | | | Thanks to David Blaikie for noticing in previous commit. llvm-svn: 246721
* [RewriteStatepointsForGC] Bugfix for change 246133Philip Reames2015-09-021-16/+16
| | | | | | | | Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed. Differential Revision: http://reviews.llvm.org/D12575 llvm-svn: 246718
* Fix release build warning for unused functionPhilip Reames2015-09-021-1/+2
| | | | llvm-svn: 246717
* [RewriteStatepointsForGC] Improve debug output [NFC]Philip Reames2015-09-021-30/+36
| | | | llvm-svn: 246713
* assume(X) handling in GVN bugfixPiotr Padlewski2015-09-021-1/+20
| There was an infinite loop because it was trying to change assume(true) into assume(true).
| Also added handling for when assume(false) appears.
|
| http://reviews.llvm.org/D12516
| llvm-svn: 246697
* Constant propagation after hitting assume(cmp) bugfixPiotr Padlewski2015-09-021-9/+26
| Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200.
|
| http://reviews.llvm.org/D12170
| llvm-svn: 246696
* Constant propagation after hitting llvm.assumePiotr Padlewski2015-09-021-3/+68
| After hitting @llvm.assume(X) we can:
| - propagate the equality X == true
| - if X is an icmp/fcmp (with an eq operation) and one of its operands is a constant, replace the variable with that constant within the same BasicBlock
|
| http://reviews.llvm.org/D11918
| llvm-svn: 246695
* [JumpThreading] make jump threading respect convergent annotation.Jingyue Wu2015-08-311-1/+1
| Summary:
| JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example,
|
|   if (cond) {
|     ...
|   } else {
|     ...
|   }
|   convergent_call();
|   if (cond) {
|     ...
|   } else {
|     ...
|   }
|
| should not be optimized to
|
|   if (cond) {
|     ...
|     convergent_call();
|     ...
|   } else {
|     ...
|     convergent_call();
|     ...
|   }
|
| Test Plan: test/Transforms/JumpThreading/basic.ll
|
| Patch by Xuetian Weng.
|
| Reviewers: resistor, arsenm, jingyue
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D12484
| llvm-svn: 246415
* [SROA] Fix PR24463, a crash I introduced in SROA by allowing it toChandler Carruth2015-08-281-3/+13
| | | | | | | | | | | | handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289
* Revert r246244 and r246243Steven Wu2015-08-281-93/+11
| | | | | | These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279
* Constant propagation after hitting assume(cmp) bugfixPiotr Padlewski2015-08-281-9/+26
| Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200.
|
| http://reviews.llvm.org/D12170
| llvm-svn: 246244
* Constant propagation after hitting llvm.assumePiotr Padlewski2015-08-281-3/+68
| After hitting @llvm.assume(X) we can:
| - propagate the equality X == true
| - if X is an icmp/fcmp (with an eq operation) and one of its operands is a constant, replace the variable with that constant within the same BasicBlock
|
| http://reviews.llvm.org/D11918
| llvm-svn: 246243
* [LoopVectorize] Extract InductionInfo into a helper class...James Molloy2015-08-271-2/+2
| | | | | | | | ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145
* Allow value forwarding past release fences in EarlyCSEPhilip Reames2015-08-271-0/+11
| A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence.
|
| We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence.
|
| Note: While more aggressive than what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences.
|
| Differential Revision: http://reviews.llvm.org/D11434
| llvm-svn: 246134
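A source-level analogue of the store-to-load forwarding described here can be written with the standard C++ fence API. This is only an illustration: the commit concerns LLVM IR fences inside EarlyCSE, and mapping it onto `std::atomic_thread_fence` is an assumption made for the example; `Data`, `Ready`, and `publishAndReload` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

int Data = 0;                   // ordinary, non-atomic payload
std::atomic<bool> Ready{false}; // flag other threads might poll

int publishAndReload() {
  Data = 42;                                           // store before the fence
  std::atomic_thread_fence(std::memory_order_release); // publication barrier
  Ready.store(true, std::memory_order_relaxed);
  // The release fence publishes this thread's earlier stores. It does not
  // oblige this thread to observe stores made by other threads, so a
  // compiler may forward the value 42 to this load instead of reloading.
  return Data;
}
```

The store to `Data` itself must stay in place before the fence, since another thread that observes `Ready` is entitled to see it; that corresponds to the restriction on eliminating stores described in the second paragraph of the message.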