summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [MemorySSA] LCSSA preserves MemorySSA.Alina Sbirlea2019-04-231-1/+4
| | | | | | | | | | | | | | | | | | | | | Summary: Enabling MemorySSA in the old pass manager leads to MemorySSA being run twice due to the fact that LCSSA and LoopSimplify do not preserve MemorySSA. This is the first step to address that: target LCSSA. LCSSA does not make any changes that invalidate MemorySSA, so it preserves it by design. It must preserve AA as well, for this to hold. After this patch, MemorySSA is still run twice in the old pass manager. Step two follows: target LoopSimplify. Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60832 llvm-svn: 359032
* [ObjC][ARC] Check the basic block size before callingAkira Hatanaka2019-04-231-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DominatorTree::dominate. ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example: ; Before optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %1, i8** @g0, align 8 ; After optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %2, i8** @g0, align 8 // %1 is replaced with %2 Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size". rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027
* [InstCombine] Convert a masked.load of a dereferenceable address to an ↵Philip Reames2019-04-231-4/+14
| | | | | | | | | | unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000
* Use llvm::stable_sortFangrui Song2019-04-2314-45/+35
| | | | | | While touching the code, simplify if feasible. llvm-svn: 358996
* [CallSite removal] move InlineCost to CallBase usageFedor Sergeev2019-04-234-13/+16
| | | | | | | | | | | Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982
* [LSR] Limit the recursion for setup costDavid Green2019-04-231-11/+14
| | | | | | | | | | | | | | In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958
* [InstCombine] Eliminate stores to constant memoryPhilip Reames2019-04-222-0/+24
| | | | | | | | | | | | If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919
* [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify ↵Philip Reames2019-04-221-5/+0
| | | | | | | | from InstCombine In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask. llvm-svn: 358913
* [IPSCCP] Add missing `AssumptionCacheTracker` dependencyJustin Bogner2019-04-221-0/+1
| | | | | | | | | Back in August, r340525 introduced a dependency on the assumption cache tracker in the ipsccp pass, but that commit missed a call to INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache improperly registered if SCCP is the only thing that pulls it in. llvm-svn: 358903
* [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, ↵Philip Reames2019-04-222-0/+13
| | | | | | | | | | | | | | LoopSimplify) Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers. This patch avoids an invalidation between the two requires in the following trivial pass pipeline: opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>" (when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops) Differential Revision: https://reviews.llvm.org/D60790 llvm-svn: 358901
* Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"Nikita Popov2019-04-221-3/+3
| | | | | | | | | | This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445. After thinking about this more, this isn't right, the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function. llvm-svn: 358876
* [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFCNikita Popov2019-04-221-3/+3
| | | | | | | | Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion(). llvm-svn: 358875
* [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.Luqman Aden2019-04-201-25/+26
| | | | | | | | | | | | | | | | | | | Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816
* [GVN+LICM] Use line 0 locations for better crash attributionVedant Kumar2019-04-192-10/+8
| | | | | | | | | | | | This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791
* Remove the EnableEarlyCSEMemSSA set of options from the legacyEric Christopher2019-04-191-5/+1
| | | | | | | | | and new pass managers. They were default to true and not being used. Differential Revision: https://reviews.llvm.org/D60747 llvm-svn: 358789
* [LICM & MemorySSA] Make limit flags pass tuning options.Alina Sbirlea2019-04-193-40/+55
| | | | | | | | | | | | | | Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers. Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60490 llvm-svn: 358772
* [NewPassManager] Adding pass tuning options: loop vectorize.Alina Sbirlea2019-04-192-5/+14
| | | | | | | | | | | | | | | | Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763
* [MergeFunc] Delete unused FunctionNode::release()Fangrui Song2019-04-191-2/+0
| | | | llvm-svn: 358742
* [MergeFunc] removeUsers: call remove() only on direct usersFangrui Song2019-04-191-21/+3
| | | | | | | | | | | | | | | removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741
* [CallSite removal] Move the legacy PM, call graph, and some inlinerChandler Carruth2019-04-195-17/+19
| | | | | | | | | | | | code to `CallBase`. This patch focuses on the legacy PM, call graph, and some of inliner and legacy passes interacting with those APIs from `CallSite` to the new `CallBase` class. No interesting changes. Differential Revision: https://reviews.llvm.org/D60412 llvm-svn: 358739
* [MergeFunc] Use less_first() as the comparator of Schwartzian transformFangrui Song2019-04-191-6/+1
| | | | llvm-svn: 358738
* MergeFunc: preserve COMDAT information when creating a thunkSaleem Abdulrasool2019-04-191-0/+1
| | | | | | | | | | We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728
* [LoopUnroll] Move list of params into a struct [NFCI].Alina Sbirlea2019-04-183-66/+70
| | | | | | | | | | | | | | Summary: Cleanup suggested in review of r358304. Reviewers: sanjoy, efriedma Subscribers: jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60638 llvm-svn: 358723
* [GuardWidening] Wire up a NPM version of the LoopGuardWidening passPhilip Reames2019-04-181-0/+25
| | | | llvm-svn: 358704
* [BlockExtractor] Extend the file format to support the grouping of basic blocksQuentin Colombet2019-04-181-28/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, each basic block listed in the extrack-blocks-file would be extracted to a different function. This patch adds the support for comma separated list of basic blocks to form group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701
* [LoopPred] Fix a blatantly obvious bug in r358684Philip Reames2019-04-181-1/+1
| | | | | | The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688
* [LoopPredication] Allow predication of loop invariant computations (within ↵Philip Reames2019-04-181-15/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling oppurtunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meets the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684
* Elaborate why we have an option on by default for enabling chr.Eric Christopher2019-04-181-0/+2
| | | | llvm-svn: 358641
* Fix bad compare function over FusionCandidate.Richard Trieu2019-04-181-6/+8
| | | | | | | Reverse the checking of the domiance order so that when a self compare happens, it returns false. This makes compare function have strict weak ordering. llvm-svn: 358636
* Fix formatting. NFCAkira Hatanaka2019-04-171-90/+88
| | | | llvm-svn: 358623
* Test commit by Denis BakhvalovDenis Bakhvalov2019-04-171-1/+1
| | | | | Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619
* Add basic loop fusion pass.Kit Barton2019-04-173-0/+1217
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
* [InstCombine] Factor out unreachable inst idiom creation [NFC]Philip Reames2019-04-173-13/+15
| | | | | | | | In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later. This just factors out that idiom creation so we don't duplicate the same mostly undocument idiom creation in multiple places. llvm-svn: 358600
* [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.Florian Hahn2019-04-171-2/+13
| | | | | | | | | | | | | | | | | | | | Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586
* [CVP] processOverflowIntrinsic(): don't crash if constant-holding happenedRoman Lebedev2019-04-171-4/+7
| | | | | | | As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559
* Revert "Add basic loop fusion pass." Per request.Eric Christopher2019-04-173-1214/+0
| | | | | | This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-173-0/+1214
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Remove the run-slp-after-loop-vectorization option.Eric Christopher2019-04-171-12/+3
| | | | | | | It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-173-1214/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Add basic loop fusion pass.Kit Barton2019-04-173-0/+1214
| | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543
* Fix a typo in comments. [NFC]Ali Tamur2019-04-161-1/+1
| | | | llvm-svn: 358531
* [EarlyCSE] detect equivalence of selects with inverse conditions and ↵Sanjay Patel2019-04-161-2/+59
| | | | | | | | | | | | | | | | | | commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523
* [CVP] Simplify umulo and smulo that cannot overflowNikita Popov2019-04-161-6/+1
| | | | | | | | | | | If a umul.with.overflow or smul.with.overflow operation cannot overflow, simplify it to a simple mul nuw / mul nsw. After the refactoring in D60668 this is just a matter of removing an explicit check against multiplications. Differential Revision: https://reviews.llvm.org/D60791 llvm-svn: 358521
* [SLP] Refactoring of the operand reordering code.Simon Pilgrim2019-04-161-171/+463
| | | | | | | | | | | | | | This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519
* [InstCombine] Prune fshl/fshr with masked operandsNikita Popov2019-04-161-0/+4
| | | | | | | | | | | | | | | | If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that, we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60660 llvm-svn: 358515
* [IR] Add WithOverflowInst classNikita Popov2019-04-164-161/+68
| | | | | | | | | | | | | | This adds a WithOverflowInst class with a few helper methods to get the underlying binop, signedness and nowrap type and makes use of it where sensible. There will be two more uses in D60650/D60656. The refactorings are all NFC, though I left some TODOs where things could be improved. In particular we have two places where add/sub are handled but mul isn't. Differential Revision: https://reviews.llvm.org/D60668 llvm-svn: 358512
* Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink ↵Hans Wennborg2019-04-161-14/+15
| | | | | | | | | | | | | | | | | | | function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 358483
* Asan use-after-scope: don't poison allocas if there were untraced lifetime ↵Hans Wennborg2019-04-161-1/+14
| | | | | | | | | | | | | | intrinsics in the function (PR41481) If there are any intrinsics that cannot be traced back to an alloca, we might have missed the start of a variable's scope, leading to false error reports if the variable is poisoned at function entry. Instead, if there are some intrinsics that can't be traced, fail safe and don't poison the variables in that function. Differential revision: https://reviews.llvm.org/D60686 llvm-svn: 358478
* [CodeExtractor] Add a few debug lines to understand why a region is not ↵Quentin Colombet2019-04-161-3/+8
| | | | | | | | | | | | | | | | | | extracted The CodeExtractor is not smart enough to compute which basic block is the entry of a region. Instead it relies on the order of the list of basic blocks that is handed to it and assumes that the entry is the first block in the list. Without the additional debug information, it is hard to understand why a valid region does not get extracted, because we would miss that the order of in the list just doesn't match what the CodeExtractor wants. NFC llvm-svn: 358471
* [LSR] Rewrite misses some fixup locations if it splits critical edgeQuentin Colombet2019-04-151-1/+42
| | | | | | | | | | | | | | | If LSR split critical edge during rewriting phi operands and phi node has other pending fixup operands, we need to update those pending fixups. Otherwise formulae will not be implemented completely and some instructions will not be eliminated. llvm.org/PR41445 Differential Revision: https://reviews.llvm.org/D60645 Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com> llvm-svn: 358457
OpenPOWER on IntegriCloud