summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
* [ValueTracking] Improve isImpliedCondition when the dominating cond is false.Chad Rosier2016-04-251-2/+5
| | | | llvm-svn: 267430
* Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor2016-04-2237-36/+45
| | | | | | | | | | support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
* PM: Port SinkingPass to the new pass managerJustin Bogner2016-04-222-67/+78
| | | | llvm-svn: 267199
* PM: Reorder the functions used for SinkingPass. NFCJustin Bogner2016-04-221-60/+60
| | | | | | This will make the port to the new PM easier to follow. llvm-svn: 267198
* [DeadStoreElimination] Shorten beginning of memset overwritten by later storesJun Bum Lim2016-04-221-26/+71
| | | | | | | | | | | | Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197
* PM: Port DCE to the new pass managerJustin Bogner2016-04-222-33/+37
| | | | | | | Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196
* [EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes.Chad Rosier2016-04-221-4/+9
| | | | llvm-svn: 267187
* [EarlyCSE] Don't add the overflow flags to the hashDavid Majnemer2016-04-221-9/+0
| | | | | | | | We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153
* Revert "Initial implementation of optimization bisect support."Vedant Kumar2016-04-2237-65/+36
| | | | | | | | This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
* [GVN] Respect fast-math-flags on fcmpsDavid Majnemer2016-04-221-22/+21
| | | | | | | We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113
* [EarlyCSE] Take the intersection of flags on instructionsDavid Majnemer2016-04-221-10/+3
| | | | | | | | | | | | | EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111
* Initial implementation of optimization bisect support.Andrew Kaylor2016-04-2137-36/+65
| | | | | | | | | | | | This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
* [LoopUtils] Move def of findStringMetadataForLoop to LoopUtils.cpp. NFCAdam Nemet2016-04-211-22/+0
| | | | | | | The decl is in LoopUtils.h. I think that this was added to LoopVersioningLICM.cpp by mistake. llvm-svn: 267014
* [LoopUtils] Rename {check->find}StringMetadata{Into->For}Loop. NFCAdam Nemet2016-04-211-4/+4
| | | | | | | | "Into" was misleading. I am also planning to use this helper to look for loop metadata and return the argument, so find seems like a better name. llvm-svn: 267013
* Typo.Chad Rosier2016-04-201-1/+1
| | | | llvm-svn: 266905
* [ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC.Chad Rosier2016-04-201-4/+5
| | | | | | Phabricator Revision: http://reviews.llvm.org/D19277 llvm-svn: 266904
* [ValueTracking] Improve isImpliedCondition for conditions with matching ↵Chad Rosier2016-04-191-3/+4
| | | | | | | | | | | | | | | operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767
* Port DemandedBits to the new pass manager.Michael Kuperstein2016-04-181-3/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D18679 llvm-svn: 266699
* [NFC] Header cleanupMehdi Amini2016-04-188-16/+3
| | | | | | | | | | | | | | Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
* Transforms: Fix bootstrap after r266565Duncan P. N. Exon Smith2016-04-171-1/+1
| | | | | | | | | | | | Apparently there isn't test coverage for all of these. I'd appreciate if someone with could reproduce and send me something to reduce, but for now I've just looked for users of RemapInstruction and MapValue and ensured they don't accidentally insert nullptr. Here is one of the bootstraps that caught: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11494 llvm-svn: 266567
* [Speculation] Add a SpeculativeExecution mode where the pass does nothing ↵Justin Lebar2016-04-151-5/+43
| | | | | | | | | | | | | | | | unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398
* [StructurizeCFG] Annotate branches that were treated as uniformNicolai Haehnle2016-04-141-0/+15
| | | | | | | | | | | | | | | | | | | Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346
* ARM: override cost function to re-enable ConstantHoisting (& fix it).Tim Northover2016-04-131-0/+12
| | | | | | | | | | | | | | | | At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260
* [PGO] Remove redundant VP instrumentationBetul Buyukkurt2016-04-131-0/+16
| | | | | | | | LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229
* Don't IPO over functions that can be de-refinedSanjoy Das2016-04-082-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
* [GVN] Address review comments for D18662Ulrich Weigand2016-04-071-9/+10
| | | | | | | | | | | As suggested by Chandler in his review comments for D18662, this follow-on patch renames some variables in GetLoadValueForLoad and CoerceAvailableValueToLoadType to hopefully make it more obvious which variables hold value sizes and which hold load/store sizes. No functional change intended. llvm-svn: 265687
* [GVN] Fix handling of sub-byte types in big-endian modeUlrich Weigand2016-04-071-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | When GVN wants to re-interpret an already available value in a smaller type, it needs to right-shift the value on big-endian systems to ensure the correct bytes are accessed. The shift value is the difference of the sizes of the two types. This is correct as long as both types occupy multiples of full bytes. However, when one of them is a sub-byte type like i1, this no longer holds true: we still need to shift, but only to access the correct *byte*. Accessing bits within the byte requires no shift in either endianness; e.g. an i1 resides in the least-significant bit of its containing byte on both big- and little-endian systems. Therefore, the appropriate shift value to be used is the difference of the *storage* sizes of the two types. This is already handled correctly in one place where such a shift takes place (GetStoreValueForLoad), but is incorrect in two other places: GetLoadValueForLoad and CoerceAvailableValueToLoadType. This patch changes both places to use the storage size as well. Differential Revision: http://reviews.llvm.org/D18662 llvm-svn: 265684
* IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFCDuncan P. N. Exon Smith2016-04-073-3/+3
| | | | | | | | | | | | | | | | | | | | | | | Clarify what this RemapFlag actually means. - Change the flag name to match its intended behaviour. - Clearly document that it's not supposed to affect globals. - Add a host of FIXMEs to indicate how to fix the behaviour to match the intent of the flag. RF_IgnoreMissingLocals should only affect the behaviour of RemapInstruction for function-local operands; namely, for operands of type Argument, Instruction, and BasicBlock. Currently, it is *only* passed into RemapInstruction calls (and the transitive MapValue calls that it makes). When I split Metadata from Value I didn't understand the flag, and I used it in a bunch of places for "global" metadata. This commit doesn't have any functionality change, but prepares to cleanup MapMetadata and MapValue. llvm-svn: 265628
* NFC: make AtomicOrdering an enum classJF Bastien2016-04-062-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602
* Loop Unroll: add options and tweak to make Partial unrolling more usefulFiona Glaser2016-04-061-3/+20
| | | | | | | | | | | | | | | | 1. Add FullUnrollMaxCount option that works like MaxCount, but also limits the unroll count for fully unrolled loops. So if a loop has an iteration count over this, it won't fully unroll. 2. Add CLI options for MaxCount and the new option, so they can be tested (plus a test). 3. Make partial unrolling obey MaxCount. An example use-case (the out of tree one this is originally designed for) is a target’s TTI can analyze a loop and decide on a max unroll count separate from the size threshold, e.g. based on register pressure, then constrain LoopUnroll to not exceed that, regardless of the size of the unrolled loop. llvm-svn: 265562
* LoopUnroll: only allow non-modulo Partial unrolling when Runtime=trueFiona Glaser2016-04-061-2/+4
| | | | | | Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558
* Simplify logic. NFC.Chad Rosier2016-04-061-7/+5
| | | | llvm-svn: 265537
* Add parentheses to silence warning.Richard Trieu2016-04-061-1/+2
| | | | llvm-svn: 265516
* [RS4GC] Add a commentSanjoy Das2016-04-061-0/+4
| | | | llvm-svn: 265503
* [RS4GC] NFC cleanup of the DeferredReplacement classSanjoy Das2016-04-051-5/+18
| | | | | | Instead of constructors use clearly named factory methods. llvm-svn: 265486
* [RS4GC] Better codegen for deoptimize callsSanjoy Das2016-04-051-16/+52
| | | | | | | | | Don't emit a gc.result for a statepoint lowered from @llvm.experimental.deoptimize since the call into __llvm_deoptimize is effectively noreturn. Instead follow the corresponding gc.statepoint with an "unreachable". llvm-svn: 265485
* use range loop; NFCISanjay Patel2016-04-041-3/+3
| | | | llvm-svn: 265360
* Enable unroll for constant bound loops when TripCount is not modulo of ↵Zia Ansari2016-04-041-0/+10
| | | | | | | | | | unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337
* Introduce a @llvm.experimental.guard intrinsicSanjoy Das2016-03-313-0/+110
| | | | | | | | | | | | | | | | | | | | | | | Summary: As discussed on llvm-dev[1]. This change adds the basic boilerplate code around having this intrinsic in LLVM: - Changes in Intrinsics.td, and the IR Verifier - A lowering pass to lower @llvm.experimental.guard to normal control flow - Inliner support [1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18527 llvm-svn: 264976
* [IndVarSimplify] Don't insert after a catchswitchDavid Majnemer2016-03-301-0/+6
| | | | | | | | | | Widening a PHI requires us to insert a trunc. The logical place for this trunc is in the same BB as the PHI. This is not possible if the BB is terminated by a catchswitch. This fixes PR27133. llvm-svn: 264926
* [LoopDataPrefetch] Centralize the tuning cl::opts under the passAdam Nemet2016-03-291-4/+35
| | | | | | | | | This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806
* ADCE: Remove debug info intrinsics in dead scopesDuncan P. N. Exon Smith2016-03-291-6/+60
| | | | | | | | | | | | | | | | | During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location in an SSA register, in case it helps some trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800
* [LoopDataPrefetch] Make more member functions private, NFC.Adam Nemet2016-03-291-1/+2
| | | | llvm-svn: 264798
* [SimlifyCFG] Prevent passes from destroying canonical loop structure, ↵Hyojin Sung2016-03-292-2/+13
| | | | | | | | | | | | | | | | | especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264697
* Revert "[SimlifyCFG] Prevent passes from destroying canonical loop ↵Reid Kleckner2016-03-282-14/+3
| | | | | | | | | | structure, especially for nested loops" This reverts commit r264596. It does not compile. llvm-svn: 264604
* [SimlifyCFG] Prevent passes from destroying canonical loop structure, ↵Hyojin Sung2016-03-282-3/+14
| | | | | | | | | | | | | | | | especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264596
* [SROA] Fix typo in commentHal Finkel2016-03-281-1/+1
| | | | llvm-svn: 264573
* C++11 is required, remove some preprocessor checks for itHal Finkel2016-03-281-3/+3
| | | | | | | We require C++11 to build, so remove a few remaining preprocessor checks for '__cplusplus >= 201103L'. This should always be true. llvm-svn: 264572
* [RS4GC] Lower calls to @llvm.experimental.deoptimizeSanjoy Das2016-03-251-1/+21
| | | | | | | | | | | | | | This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize`` to gc.statepoints wrapping ``__llvm_deoptimize``, and changes ``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize`` as a non GC leaf function. I've had to hard code the ``"__llvm_deoptimize"`` name in RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only during codegen. This isn't without precedent in the codebase, so I'm not overtly concerned. llvm-svn: 264456
* Enable non-power-of-2 #pragma unroll counts.David L Kreitzer2016-03-251-5/+4
| | | | | | | | Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407
OpenPOWER on IntegriCloud