summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Value number stores and memory states so we can detect when memory states ↵Daniel Berlin2016-12-251-20/+139
| | | | | | | | | | | | are equivalent (IE store of same value to memory). Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28084 llvm-svn: 290525
* Rename GVNExpression *ops_ members to *op_* to match conventions in the rest ↵Daniel Berlin2016-12-251-11/+11
| | | | | | of LLVM llvm-svn: 290524
* [NewGVN] Prefer `auto` to explicit type when the latter is obvious.Davide Italiano2016-12-241-1/+1
| | | | llvm-svn: 290499
* Mark isOnlyReachableViaThisEdge as constDaniel Berlin2016-12-241-2/+2
| | | | llvm-svn: 290468
* [PM] Teach the always inlining test case to be much more strict aboutChandler Carruth2016-12-231-0/+15
| | | | | | | | | | | | | | | | | | | | whether functions are removed, and fix the new PM's always inliner to actually pass this test. Without this, the new PM's always inliner leaves all the functions kicking around which won't work out very well given the semantics of always inline. Doing this really highlights how frustrating the current alwaysinline semantic contract is though -- why can we put it on *external* functions, etc? Also I've added a number of tricky and interesting test cases for removing functions with the always inliner. There is one remaining case not handled -- fully removing comdats -- and I've left a FIXME about this. llvm-svn: 290457
* Function-import: Disable IRVerifier on lazy-loaded modules: the ODR ↵Mehdi Amini2016-12-231-8/+0
| | | | | | TypeUniquing generates invalid debug info. llvm-svn: 290442
* Fix build after r290437 (missing include)Mehdi Amini2016-12-231-0/+1
| | | | llvm-svn: 290438
* FunctionImport: fix typo '#ifndef NDEBUG' instead of '#ifndef DEBUG'Mehdi Amini2016-12-231-1/+1
| | | | llvm-svn: 290437
* [LICM] Plug a leak freeing the ASTs before clearing the map.Davide Italiano2016-12-231-0/+2
| | | | llvm-svn: 290433
* [LICM] Work around LICM needs to maintain state across loops.Davide Italiano2016-12-231-1/+6
| | | | | | | | | | | | | | | The pass creates some state which expects to be cleaned up by a later instance of the same pass. opt-bisect happens to expose this not ideal design because calling skipLoop() will result in this state not being cleaned up at times and an assertion firing in `doFinalization()`. Chandler tells me the new pass manager will give us options to avoid these design traps, but until it's not ready, we need a workaround for the current pass infrastructure. Fix provided by Andy Kaylor, see the review for a complete discussion. Differential Revision: https://reviews.llvm.org/D25848 llvm-svn: 290427
* [NewGVN] Remove (for now) unused code. NFCI.Davide Italiano2016-12-231-12/+0
| | | | llvm-svn: 290420
* [ThinLTO] Verify lazy-loaded source module for function importing when ↵Mehdi Amini2016-12-231-0/+8
| | | | | | assertions are enabled (NFC) llvm-svn: 290416
* Enable '-Wstring-conversion' and fix some bad asserts that it helpedChandler Carruth2016-12-231-1/+1
| | | | | | | | find. Notable is the assert in NewGVN which had no effect because of the bug. llvm-svn: 290400
* [cfi] Emit jump tables as a function-level inline asm.Evgeniy Stepanov2016-12-221-132/+90
| | | | | | | | | | | | | | | | | | Use a dummy private function with inline asm calls instead of module level asm blocks for CFI jumptables. The main advantage is that now jumptable codegen can be affected by the function attributes (like target_cpu on ARM). Module level asm gets the default subtarget based on the target triple, which is often not good enough. This change also uses asm constraints/arguments to reference jumptable targets and aliases directly. We no longer do asm name mangling in an IR pass. Differential Revision: https://reviews.llvm.org/D28012 llvm-svn: 290384
* [GVN] Initial check-in of a new global value numbering algorithm.Davide Italiano2016-12-223-0/+1859
| | | | | | | | | | | | | | | | | | | | | The code have been developed by Daniel Berlin over the years, and the new implementation goal is that of addressing shortcomings of the current GVN infrastructure, i.e. long compile time for large testcases, lack of phi predication, no load/store value numbering etc... The current code just implements the "core" GVN algorithm, although other pieces (load coercion, phi handling, predicate system) are already implemented in a branch out of tree. Once the core is stable, we'll start adding pieces on top of the base framework. The test currently living in test/Transform/NewGVN are a copy of the ones in GVN, with proper `XFAIL` (missing features in NewGVN). A flag will be added in a future commit to enable NewGVN, so that interested parties can exercise this code easily. Differential Revision: https://reviews.llvm.org/D26224 llvm-svn: 290346
* [PM] Introduce a reasonable port of the main per-module pass pipelineChandler Carruth2016-12-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the old pass manager in the new one. I'm not trying to support (initially) the numerous options that are currently available to customize the pass pipeline. If we end up really wanting them, we can add them later, but I suspect many are no longer interesting. The simplicity of omitting them will help a lot as we sort out what the pipeline should look like in the new PM. I've also documented to the best of my ability *why* each pass or group of passes is used so that reading the pipeline is more helpful. In many cases I think we have some questionable choices of ordering and I've left FIXME comments in place so we know what to come back and revisit going forward. But for now, I've left it as similar to the current pipeline as I could. Lastly, I've had to comment out several places where passes are not ported to the new pass manager or where the loop pass infrastructure is not yet ready. I did at least fix a few bugs in the loop pass infrastructure uncovered by running the full pipeline, but I didn't want to go too far in this patch -- I'll come back and re-enable these as the infrastructure comes online. But I'd like to keep the comments in place because I don't want to lose track of which passes need to be enabled and where they go. One thing that seemed like a significant API improvement was to require that we don't build pipelines for O0. It seems to have no real benefit. I've also switched back to returning pass managers by value as at this API layer it feels much more natural to me for composition. But if others disagree, I'm happy to go back to an output parameter. I'm not 100% happy with the testing strategy currently, but it seems at least OK. I may come back and try to refactor or otherwise improve this in subsequent patches but I wanted to at least get a good starting point in place. Differential Revision: https://reviews.llvm.org/D28042 llvm-svn: 290325
* Refactor the DIExpression fragment query interface (NFC)Adrian Prantl2016-12-222-6/+7
| | | | | | ... so it becomes available to DIExpressionCursor. llvm-svn: 290322
* Pass GetAssumptionCache to InlineFunctionInfo constructorEaswaran Raman2016-12-221-1/+1
| | | | | | Differential revision: https://reviews.llvm.org/D28038 llvm-svn: 290295
* Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp"David Majnemer2016-12-212-98/+2
| | | | | | This reverts commit r289813, it caused PR31449. llvm-svn: 290266
* [LDist] Match behavior between invoking via optimization pipeline or opt ↵Adam Nemet2016-12-212-32/+9
| | | | | | | | | | | | | | | | | | | | | | | | -loop-distribute In r267672, where the loop distribution pragma was introduced, I tried it hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when ran via the optimization pipeline). As MichaelZ has discovered this has the unintended consequence of breaking a very common developer work-flow to reproduce compilations using opt: First you print the pass pipeline of clang with -debug-pass=Arguments and then invoking opt with the returned arguments. clang -debug-pass will include -loop-distribute but the pass is invoked with default=off so nothing happens unless the loop carries the pragma. While through opt (default=on) we will try to distribute all loops. This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation. llvm-svn: 290235
* IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI.Peter Collingbourne2016-12-212-34/+15
| | | | | | | | | No existing client is passing a non-null value here. This will come back in a slightly different form as part of the type identifier summary work. Differential Revision: https://reviews.llvm.org/D28006 llvm-svn: 290222
* [Analysis] Centralize objectsize lowering logic.George Burgess IV2016-12-202-16/+8
| | | | | | | | | We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214
* [LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC.Haicheng Wu2016-12-201-8/+8
| | | | | | | | | Make it clear that TripCount is the upper bound of the iteration on which control exits LatchBlock. Differential Revision: https://reviews.llvm.org/D26675 llvm-svn: 290199
* [PM] Provide an initial, minimal port of the inliner to the new pass manager.Chandler Carruth2016-12-204-20/+202
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-202-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable, that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153
* [LV] Sink tripcount query to where it's actually used. NFC.Michael Kuperstein2016-12-191-5/+4
| | | | llvm-svn: 290142
* [InstCombine] use commutative matcher for pattern with commutative operatorsSanjay Patel2016-12-191-3/+5
| | | | | | | | This is a case that was missed in: https://reviews.llvm.org/rL290067 ...and it would regress if we fix operand complexity (PR28296). llvm-svn: 290127
* [InstCombine] add folds for icmp (umin|umax X, Y), XSanjay Patel2016-12-191-19/+54
| | | | | | | | This is a follow-up to: https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531) https://reviews.llvm.org/rL290111 llvm-svn: 290118
* [LoopVersioning] Require loop-simplify form for loop versioning.Florian Hahn2016-12-194-14/+17
| | | | | | | | | | | | | | | | Summary: Requiring loop-simplify form for loop versioning ensures that the runtime check block always dominates the exit block. This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958). Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D27469 llvm-svn: 290116
* [InstCombine] add folds for icmp (smax X, Y), XSanjay Patel2016-12-191-17/+35
| | | | | | | This is a follow-up to: https://reviews.llvm.org/rL289855 (D27531) llvm-svn: 290111
* Revert @llvm.assume with operator bundles (r289755-r289757)Daniel Jasper2016-12-1948-327/+505
| | | | | | | This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* [InstCombine] use commutative matchers for patterns with commutative operatorsSanjay Patel2016-12-181-20/+35
| | | | | | | | | | | | | | | | | | | | | | | | Background/motivation - I was circling back around to: https://llvm.org/bugs/show_bug.cgi?id=28296 I made a simple patch for that and noticed some regressions, so added test cases for those with rL281055, and this is hopefully the minimal fix for just those cases. But as you can see from the surrounding untouched folds, we are missing commuted patterns all over the place, and of course there are no regression tests to cover any of those cases. We could sprinkle "m_c_" dust all over this file and catch most of the missing folds, but then we still wouldn't have test coverage, and we'd still miss some fraction of commuted patterns because they require adjustments to the match order. I'm aware of the concern about the potential compile-time performance impact of adding matches like this (currently being discussed on llvm-dev), but I don't think there's any evidence yet to suggest that handling commutative pattern matching more thoroughly is not a worthwhile goal of InstCombine. Differential Revision: https://reviews.llvm.org/D24419 llvm-svn: 290067
* [InstCombine] Simplify code slightly. NFCCraig Topper2016-12-171-1/+1
| | | | llvm-svn: 290046
* Revert "[GVNHoist] Move GVNHoist to function simplification part of pipeline."Evgeniy Stepanov2016-12-171-2/+2
| | | | | | | | This reverts r289696, which caused TSan perf regression. See PR31382. llvm-svn: 290030
* Preserve loop metadata when folding branches to a common destination.Michael Kuperstein2016-12-161-0/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D27830 llvm-svn: 289992
* Revert "[IR] Remove the DIExpression field from DIGlobalVariable."Adrian Prantl2016-12-162-11/+8
| | | | | | | | | | | | | | | | | This reverts commit 289920 (again). I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable has not DIExpression. Unfortunately it is not possible to safely upgrade these variables without adding a flag to the bitcode record indicating which version they are. My plan of record is to roll the planned follow-up patch that adds a unit: field to DIGlobalVariable into this patch before recomitting. This way we only need one Bitcode upgrade for both changes (with a version flag in the bitcode record to safely distinguish the record formats). Sorry for the churn! llvm-svn: 289982
* Reapply "[LV] Enable vectorization of loops with conditional stores by default"Matthew Simpson2016-12-161-1/+1
| | | | | | | | This patch reapplies r289863. The original patch was reverted because it exposed a bug causing the loop vectorizer to crash in the Python runtime on PPC. The underlying issue was fixed with r289958. llvm-svn: 289975
* [LV] Don't attempt to type-shrink scalarized instructionsMatthew Simpson2016-12-161-5/+21
| | | | | | | | | | After r288909, instructions feeding predicated instructions may be scalarized if profitable. Since these instructions will remain scalar, we shouldn't attempt to type-shrink them. We should only truncate vector types to their minimal bit widths. This bug was exposed by enabling the vectorization of loops containing conditional stores by default. llvm-svn: 289958
* Revert r289863: [LV] Enable vectorization of loops with conditionalChandler Carruth2016-12-161-1/+1
| | | | | | | | | | stores by default This uncovers a crasher in the loop vectorizer on PPC when building the Python runtime. I'll send the testcase to the review thread for the original commit. llvm-svn: 289934
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-162-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
* [ThinLTO] Thin link efficiency: More efficient export list computationTeresa Johnson2016-12-161-32/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Instead of checking whether a global referenced by a function being imported is defined in the same module, speculatively always add the referenced globals to the module's export list. After all imports are computed, for each module prune any not in its defined set from its export list. For a huge C++ app with aggressive importing thresholds, even with D27687 we spent a lot of time invoking modulePath() from exportGlobalInModule (modulePath() was still the 2nd hottest routine in profile). The reason is that with comdat/linkonce the summary lists for each GUID can be long. For the app in question, for example, we were invoking exportGlobalInModule almost 2 million times, and we traversed an average of 63 entries in the summary list each time. This patch reduced the thin link time for the app by about 10% (on top of D27687) when using aggressive importing thresholds, and about 3.5% on average with default importing thresholds. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27755 llvm-svn: 289918
* [SimplifyLibCalls] Use a lambda. NFCI.Davide Italiano2016-12-161-6/+6
| | | | llvm-svn: 289911
* Revert "[IR] Remove the DIExpression field from DIGlobalVariable."Adrian Prantl2016-12-162-11/+8
| | | | | | This reverts commit 289902 while investigating bot berakage. llvm-svn: 289906
* Add missing library dep.Peter Collingbourne2016-12-161-1/+1
| | | | llvm-svn: 289903
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-162-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
* IPO: Introduce ThinLTOBitcodeWriter pass.Peter Collingbourne2016-12-162-0/+345
| | | | | | | | | | | | | | This pass prepares a module containing type metadata for ThinLTO by splitting it into regular and thin LTO parts if possible, and writing both parts to a multi-module bitcode file. Modules that do not contain type metadata are written unmodified as a single module. All globals with type metadata are added to the regular LTO module, and the rest are added to the thin LTO module. Differential Revision: https://reviews.llvm.org/D27324 llvm-svn: 289899
* [ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC)Teresa Johnson2016-12-151-9/+13
| | | | | | | | | | | | | | | | | Summary: We were reinvoking exportGlobalInModule numerous times redundantly. No need to re-export globals referenced by a global that was already imported from its module. This resulted in a large speedup in the thin link for a big application, particularly when importing aggressiveness was cranked up. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27687 llvm-svn: 289896
* [SimplifyLibCalls] Lower fls() to llvm.ctlz().Davide Italiano2016-12-151-0/+16
| | | | | | Differential Revision: https://reviews.llvm.org/D14590 llvm-svn: 289894
* [SimplifyLibCalls] Remove redundant folding logic for ffs().Davide Italiano2016-12-151-13/+3
| | | | | | | | Lowering to llvm.cttz() will result in constant folding anyway if the argument to ffs is a constant. Pointed out by Eli for fls() in D14590. llvm-svn: 289888
* [ThinLTO] Revert part of r289843 that belonged to another patch.Teresa Johnson2016-12-151-13/+9
| | | | | | | | The code change for D27687 accidentally got committed along with the main change in r289843. Revert it temporarily, so that I can recommit it along with its test as intended. llvm-svn: 289875
OpenPOWER on IntegriCloud