summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Don't miscompile select to poisonDavid Majnemer2015-06-061-0/+13
| | | | | | | | | | | | | | | | If we have (select a, b, c), it is sometimes valid to simplify this to a single select operand. However, doing so is only valid if the computation doesn't inject poison into the computation. It might be helpful to consider the following example: (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN) The select is equivalent to (add %i, 1) but not (add nsw %i, 1). Self hosting on x86_64 revealed that this occurs very, very rarely so bailing out is hopefully pretty reasonable. llvm-svn: 239215
* Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"Renato Golin2015-06-051-22/+4
| | | | | | | | | This reverts commit r239141. This commit was an attempt to reintroduce a previous patch that broke many self-hosting bots with clang timeouts, but it still has slowdown issues, at least on ARM, increasing the compilation time (stage 2, clang's) by 5x. llvm-svn: 239175
* [InstCombine][NFC] Add a ``break;`` statement.Sanjoy Das2015-06-051-0/+1
| | | | | | | | This change is NFC because both the ``break;`` and the fall through end up returning immediately. However, this helps clarify intent and also ensures correctness in case more ``case`` blocks are added later. llvm-svn: 239172
* [InstCombine] Fix PR23751.Sanjoy Das2015-06-051-0/+1
| | | | | | PR23751 was caused by a missing ``break;`` in r234388. llvm-svn: 239171
* [Unroll] Rework the naming and structure of the new unroll heuristics.Chandler Carruth2015-06-051-95/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new naming is (to me) much easier to understand. Here is a summary of the new state of the world: - '*Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop. - '*PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost. - '*DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high. When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration. While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the *dynamic* cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information. This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heursistics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold. The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placehold numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a tow-tier threshold model. We need to tune all these values once the logic is ready to be enabled. Differential Revision: http://reviews.llvm.org/D9966 llvm-svn: 239164
* [LoopVectorize] Don't crash on zero-sized types in isInductionPHIDavid Majnemer2015-06-051-0/+3
| | | | | | | | | isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized. This fixes PR23763. llvm-svn: 239143
* [InstCombine] Rephrase fix to SimplifyWithOpReplacedDavid Majnemer2015-06-051-4/+22
| | | | | | | | | | | | | | | | | | I don't have the IR which is causing the build bot breakage but I can postulate as to why they are timing out: 1. SimplifyWithOpReplaced was stripping flags from the simplified value. 2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because it's simplification wasn't correct. 3. InstCombine would revisit the add instruction and note that it can rederive the flags. 4. By modifying the value, we chose to revisit instructions which reuse the value. One of the instructions is the original select, causing LLVM to never reach fixpoint. Instead, strip the flags only when we are sure we are going to perform the simplification. llvm-svn: 239141
* Revert "[InstCombine] Don't miscompile safe increment idiom"Daniel Jasper2015-06-051-21/+3
| | | | | | | | | This is breaking a lot of build bots and is causing very long-running compiles (infinite loops)? Likely, we shouldn't return nullptr? llvm-svn: 239139
* [InstCombine] Don't miscompile safe increment idiomDavid Majnemer2015-06-041-3/+21
| | | | | | | | | | | We cleverly handle cases where computation done in one argument of a select instruction is suitable for the other operand, thus obviating the need of the select and the comparison. However, the other operand cannot have flags. This fixes PR23757. llvm-svn: 239115
* Tidy code in InstrProfiling.cpp. NFC.Diego Novillo2015-06-041-8/+8
| | | | | | | | | Removed the redundant "llvm::" from class names in InstrProfiling.cpp clang-format is ran on the changes. Patch from Betul Buyukkurt. llvm-svn: 239034
* [PM/AA] Start refactoring AliasAnalysis to remove the analysis group andChandler Carruth2015-06-048-31/+31
| | | | | | | | | | | | | | | | | | | | | port it to the new pass manager. All this does is extract the inner "location" class used by AA into its own full fledged type. This seems *much* cleaner as MemoryDependence and soon MemorySSA also use this heavily, and it doesn't make much sense being inside the AA infrastructure. This will also make it much easier to break apart the AA infrastructure into something that stands on its own rather than using the analysis group design. There are a few places where this makes APIs not make sense -- they were taking an AliasAnalysis pointer just to build locations. I'll try to clean those up in follow-up commits. Differential Revision: http://reviews.llvm.org/D10228 llvm-svn: 239003
* Remove stray semicolon. NFC.Vasileios Kalintiris2015-06-031-1/+1
| | | | llvm-svn: 238908
* [RewriteStatepointsForGC] Strip deref info after rewriting.Sanjoy Das2015-06-021-0/+102
| | | | | | | | | | | | | | | | | | | | | Summary: Once a gc.statepoint has been rewritten to relocate live references, the SSA values represent physical pointers instead of logical references. Logical dereferencability does not imply physical dereferencability and after RewriteStatepointsForGC has run any attributes that imply dereferencability of the logical references need to be stripped. This current approach is conservative, and can be made more precise later if needed. For starters, we need to strip dereferencable attributes only from pointers that live in the GC address space. Reviewers: reames, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10105 llvm-svn: 238883
* [NFCI] Change RewriteStatepointsForGC to a ModulePass.Sanjoy Das2015-06-021-13/+16
| | | | | | | | | | | | | | Summary: A later change that has RewriteStatepointsForGC change function attributes throughout the module depends on this. Reviewers: reames, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10104 llvm-svn: 238882
* Teach the IR Sink pass to (conservatively) respect convergent annotations.Owen Anderson2015-06-011-0/+6
| | | | llvm-svn: 238762
* [opaque pointer type] Explicitly store the pointee type of the result of a GEPDavid Blaikie2015-06-011-1/+4
| | | | | | | | Alternatively, this type could be derived on-demand whenever getResultElementType is called - if someone thinks that's the better choice (simple time/space tradeoff), I'm happy to give it a go. llvm-svn: 238716
* Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial typesBenjamin Kramer2015-05-298-24/+24
| | | | | | | | | | | | | | | | | | | | If the type isn't trivially moveable emplace can skip a potentially expensive move. It also saves a couple of characters. Call sites were found with the ASTMatcher + some semi-automated cleanup. memberCallExpr( argumentCountIs(1), callee(methodDecl(hasName("push_back"))), on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))), hasArgument(0, bindTemporaryExpr( hasType(recordDecl(hasNonTrivialDestructor())), has(constructExpr()))), unless(isInTemplateInstantiation())) No functional change intended. llvm-svn: 238602
* Enable exitValue rewrite only when the cost of expansion is low.Wei Mi2015-05-281-15/+129
| | | | | | | | The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost. Differential Revision: http://reviews.llvm.org/D9800 llvm-svn: 238507
* [InstCombine] Fold IntToPtr and PtrToInt into preceding loads.David Majnemer2015-05-281-5/+10
| | | | | | | | | | | Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast. Differential Revision: http://reviews.llvm.org/D9152 llvm-svn: 238452
* Don't call utostr in Twine/raw_ostream contexts.Benjamin Kramer2015-05-281-1/+1
| | | | | | Creating temporary std::strings there is unnecessary. llvm-svn: 238412
* [ASan] Fix previous commit. Patch by Max Ostapenko!Yury Gribov2015-05-281-4/+4
| | | | llvm-svn: 238403
* [ASan] New approach to dynamic allocas unpoisoning. Patch by Max Ostapenko!Yury Gribov2015-05-281-163/+76
| | | | | | Differential Revision: http://reviews.llvm.org/D7098 llvm-svn: 238402
* [Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a winDavid Majnemer2015-05-281-35/+21
| | | | | | | | | | | | | Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't *know* we will open up CSE opportunities. If the multiply was 'nsw', then negating 'y' requires us to clear the 'nsw' flag. If this is actually worth pursuing, it is probably more appropriate to do so in GVN or EarlyCSE. This fixes PR23675. llvm-svn: 238397
* [NaryReassociate] Run EarlyCSE after NaryReassociateJingyue Wu2015-05-281-1/+23
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch made two improvements to NaryReassociate and the NVPTX pipeline 1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common expressions. 2. When adding an instruction to SeenExprs, maps both the SCEV before and after reassociation to that instruction. Test Plan: updated @reassociate_gep_nsw in nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: dberlin, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9947 llvm-svn: 238396
* Final fix for PR 23499 and IR test case.Diego Novillo2015-05-271-5/+5
| | | | | | | | | | | This fixes a bit I forgot in r238335. In addition to the data record and the counter, we can also move the name of the counter to the comdat for the associated function. I'm also adding an IR test case to check that these three elements are placed in the proper comdat. llvm-svn: 238351
* Fix PR 23499 - Avoid multiple profile counters for functions in comdat sections.Diego Novillo2015-05-271-0/+6
| | | | | | | | Counter symbols created for linkonce functions are not discarded by ELF linkers unless the symbols are placed in the same comdat section as its associated function. llvm-svn: 238335
* [PlaceSafepoints] Entry safepoint location doesn't need to be a terminatorPhilip Reames2015-05-261-17/+1
| | | | | | | | | | Long ago, the poll insertion code assumed that the insertion site was a terminator. As a result, the entry selection code would split a basic block to ensure it could pass a terminator. The insertion code was updated quite a while ago - possibly before it ever landed upstream - but the now redundant work was never removed. While I'm at it, remove a comment which doesn't apply to the upstreamed code. NFC intended. llvm-svn: 238254
* [PlaceSafepoints] Cleanup InsertSafepointPoll functionPhilip Reames2015-05-261-20/+17
| | | | | | While working on another change, I noticed that the naming in this function was mildly deceptive. While fixing that, I took the oppurtunity to modernize some of the code. NFC intended. llvm-svn: 238252
* Use range-based for loops. NFC.Craig Topper2015-05-251-130/+84
| | | | llvm-svn: 238154
* Remove conflicting attributes before adding deduced readonly/readnoneBjorn Steinbrink2015-05-251-1/+5
| | | | | | | | | | | | | | | | Summary: In case of functions that have a pointer argument and only pass it to each other, the function attributes pass deduces that the pointer should get the readnone attribute, but fails to remove a readonly attribute that may already have been present. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9995 llvm-svn: 238152
* Reformat.NAKAMURA Takumi2015-05-252-4/+3
| | | | llvm-svn: 238126
* Prune CRLFs.NAKAMURA Takumi2015-05-252-23/+23
| | | | llvm-svn: 238125
* [Unroll] Switch from an eagerly populated SCEV cache to one that isChandler Carruth2015-05-251-89/+116
| | | | | | | | | | lazily built. Also, make it a much more generic SCEV cache, which today exposes only a reduced GEP model description but could be extended in the future to do other profitable caching of SCEV information. llvm-svn: 238124
* Give more meaningful names than I and J to some for loop variables after ↵Craig Topper2015-05-231-10/+10
| | | | | | converting to range-based loops. llvm-svn: 238095
* Fix an unused variable warning in release builds.Craig Topper2015-05-231-0/+1
| | | | llvm-svn: 238094
* Use range-based for loops. NFC.Craig Topper2015-05-231-76/+36
| | | | llvm-svn: 238093
* Extend EarlyCSE to handle basic cases from JumpThreading and CVPPhilip Reames2015-05-223-21/+46
| | | | | | | | | | | | | | This patch extends EarlyCSE to take advantage of the information that a controlling branch gives us about the value of a Value within this and dominated basic blocks. If the current block has a single predecessor with a controlling branch, we can infer what the branch condition must have been to execute this block. The actual change to support this is downright simple because EarlyCSE's existing scoped hash table logic deals with most of the complexity around merging. The patch actually implements two optimizations. 1) The first is analogous to JumpThreading in that it enables EarlyCSE's CSE handling to fold branches which are exactly redundant due to a previous branch to branches on constants. (It doesn't actually replace the branch or change the CFG.) This is pretty clearly a win since it enables substantial CFG simplification before we start trying to inline. 2) The second is analogous to CVP in that it exploits the knowledge gained to replace dominated *uses* of the original value. EarlyCSE does not otherwise reason about specific uses, so this is the more arguable one. It does enable further simplication and constant folding within the rest of the visit by EarlyCSE. In both cases, the added code only handles the easy dominance based case of each optimization. The general case is deferred to the existing passes. Differential Revision: http://reviews.llvm.org/D9763 llvm-svn: 238071
* [InstCombine] Don't eagerly propagate nsw for A*B+A*C => A*(B+C)David Majnemer2015-05-221-3/+16
| | | | | | | | | | | | | | | | InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C). This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then nothing in the LHS overflows, but the multiplication in RHS overflows. We need to first make sure that we won't multiple by INT_SMAX + 1. Test case `add_of_mul` contributed by Sanjoy Das. This fixes PR23635. Differential Revision: http://reviews.llvm.org/D9629 llvm-svn: 238066
* [Unroll] Separate the logic for testing each iteration of the loop,Chandler Carruth2015-05-221-106/+111
| | | | | | | | | | | | | | | | | | accumulating estimated cost, and other loop-centric logic from the logic used to analyze instructions in a particular iteration. This makes the visitor very narrow in scope -- all it does is visit instructions, update a map of simplified values, and return whether it is able to optimize away a particular instruction. The two cost metrics are now returned as an optional struct. When the optional is left unengaged, there is no information about the unrolled cost of the loop, when it is engaged the cost metrics are available to run against the thresholds. No functionality changed. llvm-svn: 238033
* [InstSimplify] Handle some overflow intrinsics in InstSimplifyDavid Majnemer2015-05-222-12/+6
| | | | | | | | | This change does a few things: - Move some InstCombine transforms to InstSimplify - Run SimplifyCall from within InstCombine::visitCallInst - Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0. llvm-svn: 237995
* [Unroll] Replace a hand-wavy FIXME with a FIXME that explains the actualChandler Carruth2015-05-221-1/+6
| | | | | | | problem instead of suggesting doing something that is trivial to do but incorrect given the current design of the libraries. llvm-svn: 237994
* [Unroll] Extract the logic for caching SCEV-modeled GEPs with theirChandler Carruth2015-05-221-67/+81
| | | | | | | | | | | | | | | | | | | simplified model for use simulating each iteration into a separate helper function that just returns the cache. Building this cache had nothing to do with the rest of the unroll analysis and so this removes an unnecessary coupling, etc. It should also make it easier to think about the concept of providing fast cached access to basic SCEV models as an orthogonal concept to the overall unroll simulation. I'd really like to see this kind of caching logic folded into SCEV itself, it seems weird for us to provide it at this layer rather than making repeated queries into SCEV fast all on their own. No functionality changed. llvm-svn: 237993
* [Unroll] Refactor the accumulation of optimized instruction costs intoChandler Carruth2015-05-221-9/+10
| | | | | | | | | | | | a single location. This reduces code duplication a bit and will also pave the way for a better separation between the visitation algorithm and the unroll analysis. No functionality changed. llvm-svn: 237990
* [LICM] Sinking doesn't involve the preheaderPhilip Reames2015-05-221-5/+11
| | | | | | PR23608 pointed out that using the preheader to gain a context instruction isn't always legal because a loop might not have a preheader. When looking into that, I realized that using the preheader to determine legality for sinking is questionable at best. Given no test covers that case and the original commit didn't seem to intend it, I restructured the code to only ask context sensative queries for hoising of loads and stores. This is effectively a partial revert of 237593. llvm-svn: 237985
* MergedLoadStoreMotion preserves MemoryDependenceAnalysis, it does not ↵Daniel Berlin2015-05-221-2/+2
| | | | | | | | require it. (It already was coded assuming it can sometimes be null, so no other changes are necessary) llvm-svn: 237978
* [NaryReassoc] reassociate GEP for CSEJingyue Wu2015-05-211-21/+245
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: x = &a[i]; y = &a[i + j]; => y = x + j; along with some refactoring work such as extracting method findClosestMatchingDominator. Depends on D9786 which provides the ScalarEvolution::getGEPExpr interface. Test Plan: nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9802 llvm-svn: 237971
* [InstCombine] X - 0 is equal to X, not undefDavid Majnemer2015-05-211-27/+21
| | | | | | | | | A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform into undef instead of %X. This fixes PR23624. llvm-svn: 237968
* [LoopDistribute] Remove a layer of pointer indirection.Benjamin Kramer2015-05-211-41/+32
| | | | | | | Just store InstPartitions directly into the std::list. No functional change intended. llvm-svn: 237930
* [RewriteStatepointsForGC] Fix debug assertion during derivable pointer ↵Igor Laevsky2015-05-211-6/+6
| | | | | | | | | | rematerialization Correct assertion would be that there is no other uses from chain we are currently cloning. It is ok to have other uses of values not from this chain. Differential Revision: http://reviews.llvm.org/D9882 llvm-svn: 237899
* [MemCpyOpt] Do move the memset, but look at its dest's dependencies.Ahmed Bougacha2015-05-211-1/+8
| | | | | | | | | In effect a partial revert of r237858, which was a dumb shortcut. Looking at the dependencies of the destination should be the proper fix: if the new memset would depend on anything other than itself, the transformation isn't correct. llvm-svn: 237874
OpenPOWER on IntegriCloud