path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [InstCombine][X86] Tweak generic expansion of PACKSS/PACKUS to shuffle then ↵Simon Pilgrim2019-04-251-7/+4
| | | | | | | | truncate. NFCI. This has no effect on constant folding but will be useful when we expand non-saturating PACKSS/PACKUS intrinsics. llvm-svn: 359191
* Fix include order. NFCI.Simon Pilgrim2019-04-251-1/+1
| | | | llvm-svn: 359177
* Enable LoopVectorization by default.Alina Sbirlea2019-04-251-1/+1
| | | | | | | | | | | | | | | | | Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167
* Consolidate existing utilities for interpreting vector predicate masks [NFC]Philip Reames2019-04-251-31/+1
| | | | llvm-svn: 359163
* Fix unused variable warning in LoopFusion pass.Kit Barton2019-04-251-7/+5
| | | | | | | | | | Do not wrap the contents of printFusionCandidates in the LLVM_DEBUG macro. This fixes an unused variable warning generated when compiling without asserts but with -DENABLE_LLVM_DUMP. Differential Revision: https://reviews.llvm.org/D61035 llvm-svn: 359161
* [InstCombine] Be consistent w/handling of masked intrinsics style wise [NFC]Philip Reames2019-04-252-5/+6
| | | | llvm-svn: 359160
* [SLP] Fix crash after r358519, by V. Porpodas.Alexey Bataev2019-04-241-1/+2
| | | | | | | | | | | | | | | | Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136
* [InstCombine][X86] Use generic expansion of PACKSS/PACKUS for constant ↵Simon Pilgrim2019-04-241-51/+45
| | | | | | | | | | folding. NFCI. This patch rewrites the existing PACKSS/PACKUS constant folding code to expand as a generic expansion. This is a first NFCI step toward expanding PACKSS/PACKUS intrinsics which are acting as non-saturating truncations (although technically the expansion could be used in all cases - but we'll probably want to be conservative). llvm-svn: 359111
* Add "const" in GetUnderlyingObjects. NFCBjorn Pettersson2019-04-243-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with having those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072
* [CommandLine] Provide parser<unsigned long> instantiation to allow ↵Fangrui Song2019-04-243-20/+20
| | | | | | | | | | | | | cl::opt<uint64_t> on LP64 platforms Summary: And migrate opt<unsigned long long> to opt<uint64_t> Fixes PR19665 Differential Revision: https://reviews.llvm.org/D60933 llvm-svn: 359068
* The error message for mismatched value sites is very cryptic.Dmitry Mikulin2019-04-231-2/+10
| | | | | | | | Make it more readable for an average user. Differential Revision: https://reviews.llvm.org/D60896 llvm-svn: 359043
* [MemorySSA] LCSSA preserves MemorySSA.Alina Sbirlea2019-04-231-1/+4
| | | | | | | | | | | | | | | | | | | | | Summary: Enabling MemorySSA in the old pass manager leads to MemorySSA being run twice due to the fact that LCSSA and LoopSimplify do not preserve MemorySSA. This is the first step to address that: target LCSSA. LCSSA does not make any changes that invalidate MemorySSA, so it preserves it by design. It must preserve AA as well, for this to hold. After this patch, MemorySSA is still run twice in the old pass manager. Step two follows: target LoopSimplify. Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60832 llvm-svn: 359032
* [ObjC][ARC] Check the basic block size before callingAkira Hatanaka2019-04-231-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DominatorTree::dominate.

ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example:

    ; Before optimization
    %1 = tail call i8* @foo1()
    %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
    store i8* %1, i8** @g0, align 8

    ; After optimization
    %1 = tail call i8* @foo1()
    %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
    store i8* %2, i8** @g0, align 8 // %1 is replaced with %2

Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size".

rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027
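The bail-out described above can be sketched outside LLVM. This is a minimal model, not the pass's code: a basic block is assumed to be a plain list of instruction ids, a linear scan stands in for the expensive same-block dominance query, and `orderInBlock`, `Order`, and `MaxBlockSize` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Outcome of a guarded same-block ordering query.
enum class Order { DefFirst, UserFirst, Skipped };

// Hypothetical model: within one block, "Def dominates User" means that Def
// appears earlier in the instruction list. MaxBlockSize plays the role of the
// limit described in the commit message.
Order orderInBlock(const std::vector<int> &Block, int Def, int User,
                   std::size_t MaxBlockSize) {
  assert(Def != User && "query needs two distinct instructions");
  if (Block.size() > MaxBlockSize)
    return Order::Skipped; // block too large: bail out, leave the use alone
  for (int Inst : Block) { // linear scan, the cost the limit is bounding
    if (Inst == Def)
      return Order::DefFirst;
    if (Inst == User)
      return Order::UserFirst;
  }
  return Order::Skipped; // neither instruction is in this block
}
```

A caller would rewrite the use only on `Order::DefFirst`; `Skipped` trades a missed optimization for bounded compile time.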
* [InstCombine] Convert a masked.load of a dereferenceable address to an ↵Philip Reames2019-04-231-4/+14
| | | | | | | | | | unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000
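As a rough scalar model of why this rewrite is sound, the two forms can be compared lane by lane. The sketch assumes a fixed width of four `int` lanes and that all four elements at `Ptr` are readable (the "dereferenceable" precondition); `maskedLoad` and `loadThenSelect` are illustrative names, not LLVM APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t Lanes = 4;
using Vec = std::array<int, Lanes>;
using Mask = std::array<bool, Lanes>;

// Models masked.load: only lanes whose mask bit is set touch memory.
Vec maskedLoad(const int *Ptr, const Mask &M, const Vec &PassThru) {
  assert(Ptr && "pointer must be valid");
  Vec Result = PassThru;
  for (std::size_t I = 0; I < Lanes; ++I)
    if (M[I])
      Result[I] = Ptr[I];
  return Result;
}

// Models the rewritten form: an unconditional load of every lane followed by
// select(mask, loaded, passthru). Only valid if all Lanes elements at Ptr are
// dereferenceable, because masked-off lanes are read too.
Vec loadThenSelect(const int *Ptr, const Mask &M, const Vec &PassThru) {
  assert(Ptr && "all Lanes elements must be dereferenceable");
  Vec Loaded{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Loaded[I] = Ptr[I]; // speculative read of every lane
  Vec Result{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Result[I] = M[I] ? Loaded[I] : PassThru[I];
  return Result;
}
```

Both functions return the same vector for any mask; the difference is only in which addresses are read, which is why the dereferenceability condition matters.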
* Use llvm::stable_sortFangrui Song2019-04-2314-45/+35
| | | | | | While touching the code, simplify if feasible. llvm-svn: 358996
* [CallSite removal] move InlineCost to CallBase usageFedor Sergeev2019-04-234-13/+16
| | | | | | | | | | | Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982
* [LSR] Limit the recursion for setup costDavid Green2019-04-231-11/+14
| | | | | | | | | | | | | | In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958
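The general idea of a depth-limited cost walk can be illustrated with a toy expression graph. This is a sketch under assumptions, not the LSR code: `Expr`, `setupCost`, and `makeSharedChain` are hypothetical, and the cost rule (1 per leaf, sum over operands) is chosen only to show how shared subexpressions make an unbounded walk exponential.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical expression node; an empty operand list marks a leaf.
struct Expr {
  std::vector<std::shared_ptr<Expr>> Operands;
};

// Cost walk with a recursion budget. Once DepthLeft reaches zero, deeper
// structure is ignored instead of being explored.
unsigned setupCost(const Expr &E, unsigned DepthLeft) {
  if (E.Operands.empty())
    return 1;
  if (DepthLeft == 0)
    return 0;
  unsigned Cost = 0;
  for (const std::shared_ptr<Expr> &Op : E.Operands) {
    assert(Op && "operand must be non-null");
    Cost += setupCost(*Op, DepthLeft - 1);
  }
  return Cost;
}

// Builds N levels in which every node uses the previous node twice. The graph
// has N + 1 nodes, but a tree-style walk reaches 2^N leaves.
std::shared_ptr<Expr> makeSharedChain(unsigned N) {
  std::shared_ptr<Expr> Node = std::make_shared<Expr>();
  for (unsigned I = 0; I < N; ++I) {
    std::shared_ptr<Expr> Parent = std::make_shared<Expr>();
    Parent->Operands = {Node, Node};
    Node = Parent;
  }
  return Node;
}
```

With a small budget, `setupCost(*makeSharedChain(40), 4)` returns after a handful of calls, whereas an unbounded walk over the same graph would visit 2^40 leaves.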
* [InstCombine] Eliminate stores to constant memoryPhilip Reames2019-04-222-0/+24
| | | | | | | | | | | | If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919
* [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify ↵Philip Reames2019-04-221-5/+0
| | | | | | | | from InstCombine In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask. llvm-svn: 358913
* [IPSCCP] Add missing `AssumptionCacheTracker` dependencyJustin Bogner2019-04-221-0/+1
| | | | | | | | | Back in August, r340525 introduced a dependency on the assumption cache tracker in the ipsccp pass, but that commit missed a call to INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache improperly registered if SCCP is the only thing that pulls it in. llvm-svn: 358903
* [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, ↵Philip Reames2019-04-222-0/+13
| | | | | | | | | | | | | | LoopSimplify) Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers. This patch avoids an invalidation between the two requires in the following trivial pass pipeline: opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>" (when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops) Differential Revision: https://reviews.llvm.org/D60790 llvm-svn: 358901
* Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"Nikita Popov2019-04-221-3/+3
| | | | | | | | | | This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445. After thinking about this more, this isn't right, the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function. llvm-svn: 358876
* [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFCNikita Popov2019-04-221-3/+3
| | | | | | | | Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion(). llvm-svn: 358875
* [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.Luqman Aden2019-04-201-25/+26
| | | | | | | | | | | | | | | | | | | Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816
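The range reasoning behind marking a sub as nuw/nsw can be sketched with simple inclusive intervals. This is a simplification: the `URange` and `SRange` types below are hypothetical and cannot represent wrapped ranges, unlike LLVM's ConstantRange. It does show why the operand order matters for a non-commutative operation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Inclusive, non-wrapping value ranges (an assumption of this sketch).
struct URange {
  std::uint32_t Min, Max;
};
struct SRange {
  std::int32_t Min, Max;
};

// L - R never wraps as unsigned iff the smallest L is at least the largest R.
bool subNeverWrapsUnsigned(URange L, URange R) {
  assert(L.Min <= L.Max && R.Min <= R.Max);
  return L.Min >= R.Max;
}

// L - R never overflows as signed iff both extreme differences fit in 32
// bits. The differences are computed in 64 bits so the check cannot overflow.
bool subNeverWrapsSigned(SRange L, SRange R) {
  assert(L.Min <= L.Max && R.Min <= R.Max);
  const std::int64_t Lo = std::int64_t(L.Min) - std::int64_t(R.Max);
  const std::int64_t Hi = std::int64_t(L.Max) - std::int64_t(R.Min);
  return Lo >= std::numeric_limits<std::int32_t>::min() &&
         Hi <= std::numeric_limits<std::int32_t>::max();
}
```

Swapping the two arguments generally changes the answer, which mirrors the commit's remark about which range is passed as "Other".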
* [GVN+LICM] Use line 0 locations for better crash attributionVedant Kumar2019-04-192-10/+8
| | | | | | | | | | | | This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791
* Remove the EnableEarlyCSEMemSSA set of options from the legacyEric Christopher2019-04-191-5/+1
| | | | | | | and new pass managers. They defaulted to true and were not being used. Differential Revision: https://reviews.llvm.org/D60747 llvm-svn: 358789
* [LICM & MemorySSA] Make limit flags pass tuning options.Alina Sbirlea2019-04-193-40/+55
| | | | | | | | | | | | | | Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers. Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60490 llvm-svn: 358772
* [NewPassManager] Adding pass tuning options: loop vectorize.Alina Sbirlea2019-04-192-5/+14
| | | | | | | | | | | | | | | | Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763
* [MergeFunc] Delete unused FunctionNode::release()Fangrui Song2019-04-191-2/+0
| | | | llvm-svn: 358742
* [MergeFunc] removeUsers: call remove() only on direct usersFangrui Song2019-04-191-21/+3
| | | | | | | | | | | | | | | removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741
* [CallSite removal] Move the legacy PM, call graph, and some inlinerChandler Carruth2019-04-195-17/+19
| | | | | | | | | | | | code to `CallBase`. This patch focuses on the legacy PM, call graph, and some of inliner and legacy passes interacting with those APIs from `CallSite` to the new `CallBase` class. No interesting changes. Differential Revision: https://reviews.llvm.org/D60412 llvm-svn: 358739
* [MergeFunc] Use less_first() as the comparator of Schwartzian transformFangrui Song2019-04-191-6/+1
| | | | llvm-svn: 358738
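For readers unfamiliar with the term, a Schwartzian transform computes each sort key once, sorts (key, value) pairs on the key alone, then drops the keys. The sketch below is standalone and makes several assumptions: `LessFirst` is a local equivalent of a compare-by-first functor (LLVM provides `less_first` in STLExtras.h), and `expensiveKey` is a hypothetical key function standing in for whatever MergeFunc actually computes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Local stand-in for a "compare pairs by .first" functor.
struct LessFirst {
  template <typename T> bool operator()(const T &L, const T &R) const {
    return L.first < R.first;
  }
};

// Hypothetical expensive key; here it is just the string length.
static std::size_t expensiveKey(const std::string &S) { return S.size(); }

std::vector<std::string> sortByKey(const std::vector<std::string> &In) {
  // Decorate: compute each key exactly once.
  std::vector<std::pair<std::size_t, std::string>> Decorated;
  Decorated.reserve(In.size());
  for (const std::string &S : In)
    Decorated.emplace_back(expensiveKey(S), S);

  // Sort on the precomputed key only; stable_sort keeps ties in input order.
  std::stable_sort(Decorated.begin(), Decorated.end(), LessFirst());

  // Undecorate: drop the keys.
  std::vector<std::string> Out;
  Out.reserve(Decorated.size());
  for (std::pair<std::size_t, std::string> &P : Decorated)
    Out.push_back(std::move(P.second));
  assert(Out.size() == In.size() && "sorting must not add or drop elements");
  return Out;
}
```

Comparing only `.first` avoids comparing the payload at all, which matters when the payload has no natural ordering.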
* MergeFunc: preserve COMDAT information when creating a thunkSaleem Abdulrasool2019-04-191-0/+1
| | | | | | | | | | We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728
* [LoopUnroll] Move list of params into a struct [NFCI].Alina Sbirlea2019-04-183-66/+70
| | | | | | | | | | | | | | Summary: Cleanup suggested in review of r358304. Reviewers: sanjoy, efriedma Subscribers: jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60638 llvm-svn: 358723
* [GuardWidening] Wire up a NPM version of the LoopGuardWidening passPhilip Reames2019-04-181-0/+25
| | | | llvm-svn: 358704
* [BlockExtractor] Extend the file format to support the grouping of basic blocksQuentin Colombet2019-04-181-28/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, each basic block listed in the extract-blocks-file would be extracted to a different function. This patch adds support for a comma-separated list of basic blocks that form a group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched.

Let us see this new format in action (comments are not part of the file format):

    ;; funcName bbName[,bbName...]
    foo bb1      ;; Extract bb1 in its own function
    foo bb2,bb3  ;; Extract bb2,bb3 in their own function
    bar bb1,bb4  ;; Extract bb1,bb4 in their own function
    bar bb2      ;; Extract bb2 in its own function

Assuming all regions are extractable, this will create one function and thus one call per region.

Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701
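To make the line format concrete, here is a small standalone parser for it. It is only a sketch of the `funcName bbName[,bbName...]` shape described above, not the BlockExtractor implementation; `BlockGroup` and `parseGroups` are hypothetical names, and the input is assumed to contain no comments, since comments are not part of the file format.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One group: the function name and the basic blocks to extract together.
using BlockGroup = std::pair<std::string, std::vector<std::string>>;

std::vector<BlockGroup> parseGroups(const std::string &Text) {
  std::vector<BlockGroup> Groups;
  std::istringstream Lines(Text);
  std::string Line;
  while (std::getline(Lines, Line)) {
    std::istringstream Fields(Line);
    std::string Func, BBList;
    if (!(Fields >> Func >> BBList))
      continue; // skip blank or incomplete lines
    // operator>> only succeeds with non-empty tokens.
    assert(!Func.empty() && !BBList.empty());

    // Split the second field on commas to get the group members.
    std::vector<std::string> BBs;
    std::istringstream Names(BBList);
    std::string BB;
    while (std::getline(Names, BB, ','))
      if (!BB.empty())
        BBs.push_back(BB);
    Groups.emplace_back(Func, std::move(BBs));
  }
  return Groups;
}
```

Each returned entry corresponds to one region, so the four-line example in the commit message yields four groups, two of them with two blocks each.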
* [LoopPred] Fix a blatantly obvious bug in r358684Philip Reames2019-04-181-1/+1
| | | | | | The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688
* [LoopPredication] Allow predication of loop invariant computations (within ↵Philip Reames2019-04-181-15/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i':

    A = _a.length;
    guard (i < A)
    a = _a[i]
    B = _b.length;
    guard (i < B);
    b = _b[i];
    ...
    Z = _z.length;
    guard (i < Z)
    z = _z[i]
    accum += a + b + ... + z;

Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form.

Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later.

As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before.

A couple of subtleties in the implementation:
- SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point.
- SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops.
- invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but which meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit.

Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684
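The effect of forming loop-invariant guards can be modelled in ordinary C++. This is a simplified sketch with two arrays instead of twenty-six, and it assumes a failed guard simply abandons the computation (modelled as `{false, 0}`); the function names and the `Result` alias are hypothetical and say nothing about how LoopPredication is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// {false, 0} models a failed guard; {true, Sum} models normal completion.
using Result = std::pair<bool, long>;

// Guards are checked on every iteration, interleaved with the loads.
Result sumWithLoopGuards(const std::vector<int> &A, const std::vector<int> &B,
                         std::size_t N) {
  long Accum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    if (!(I < A.size())) // guard (i < A)
      return {false, 0};
    const int a = A[I];
    if (!(I < B.size())) // guard (i < B)
      return {false, 0};
    const int b = B[I];
    Accum += a + b;
  }
  return {true, Accum};
}

// The same guards in loop-invariant form, all checked once before the loop.
// "I < Len for every I < N" holds exactly when N <= Len.
Result sumWithHoistedGuards(const std::vector<int> &A,
                            const std::vector<int> &B, std::size_t N) {
  if (!(N <= A.size() && N <= B.size()))
    return {false, 0};
  assert(A.size() >= N && B.size() >= N && "hoisted guard implies the rest");
  long Accum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Accum += A[I] + B[I];
  return {true, Accum};
}
```

Because every invariant condition is formed in one step, no interleaving of hoisting and unswitching is needed to reach the second form in this model.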
* Elaborate why we have an option on by default for enabling chr.Eric Christopher2019-04-181-0/+2
| | | | llvm-svn: 358641
* Fix bad compare function over FusionCandidate.Richard Trieu2019-04-181-6/+8
| | | | | Reverse the checking of the dominance order so that when a self compare happens, it returns false. This makes the compare function a strict weak ordering. llvm-svn: 358636
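Ordered containers such as `std::set` require a comparator that returns false when an element is compared with itself. The sketch below shows the difference between a non-strict and a strict comparator; `Candidate`, `DomOrder`, and both comparator structs are hypothetical stand-ins, not the FusionCandidate code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a fusion candidate. DomOrder models the position
// of the candidate in dominance order (smaller means "dominates").
struct Candidate {
  int DomOrder;
};

// Broken: "<=" makes Compare(A, A) true, which violates irreflexivity.
struct NonStrictCompare {
  bool operator()(const Candidate &L, const Candidate &R) const {
    return L.DomOrder <= R.DomOrder;
  }
};

// Fixed: "<" makes Compare(A, A) false, as a strict weak ordering requires.
struct StrictCompare {
  bool operator()(const Candidate &L, const Candidate &R) const {
    return L.DomOrder < R.DomOrder;
  }
};

// True if Cmp reports that C is not ordered before itself.
template <typename Cmp> bool isIrreflexive(const Candidate &C) {
  return !Cmp()(C, C);
}

// Counts distinct candidates using an ordered set keyed on StrictCompare.
std::size_t countDistinct(const std::vector<Candidate> &Cs) {
  std::set<Candidate, StrictCompare> Ordered;
  for (const Candidate &C : Cs) {
    assert(isIrreflexive<StrictCompare>(C) && "self-compare must be false");
    Ordered.insert(C);
  }
  return Ordered.size();
}
```

Using the non-strict comparator with `std::set` is undefined behaviour, so the sketch only probes it through `isIrreflexive` rather than inserting with it.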
* Fix formatting. NFCAkira Hatanaka2019-04-171-90/+88
| | | | llvm-svn: 358623
* Test commit by Denis BakhvalovDenis Bakhvalov2019-04-171-1/+1
| | | | | Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619
* Add basic loop fusion pass.Kit Barton2019-04-173-0/+1217
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions:
1. Adjacent (no code between them)
2. Control flow equivalent (if one loop executes, the other loop executes)
3. Identical bounds (both loops iterate the same number of iterations)
4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
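At source level, the four conditions describe a pair of loops like the one below. This is an illustrative sketch only (the pass works on LLVM IR, and the function names are hypothetical): the two loops are adjacent, always execute together, share the bound `A.size()`, and the second reads `B[I]` at the same index the first wrote, so the dependence distance is zero rather than negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two separate loops over the same range.
std::vector<int> unfused(const std::vector<int> &A) {
  std::vector<int> B(A.size()), C(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    B[I] = A[I] + 1;
  for (std::size_t I = 0; I < A.size(); ++I)
    C[I] = B[I] * 2; // reads what the first loop wrote at the same index
  return C;
}

// The fused form: one loop, both bodies in their original order.
std::vector<int> fused(const std::vector<int> &A) {
  std::vector<int> B(A.size()), C(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    B[I] = A[I] + 1;
    C[I] = B[I] * 2;
  }
  return C;
}

// Debug helper: fusion is only legal if it preserves the result.
std::vector<int> fusedChecked(const std::vector<int> &A) {
  std::vector<int> C = fused(A);
  assert(C == unfused(A) && "fusion must not change observable results");
  return C;
}
```

If the second loop instead read `B[I + 1]`, the fused loop would read an element before the first body had written it; that is the kind of negative distance dependence condition 4 rules out.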
* [InstCombine] Factor out unreachable inst idiom creation [NFC]Philip Reames2019-04-173-13/+15
| | | | | | In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later. This just factors out that idiom creation so we don't duplicate the same mostly undocumented idiom creation in multiple places. llvm-svn: 358600
* [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.Florian Hahn2019-04-171-2/+13
| | | | | | | | | | | | | | | | | | | | Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586
* [CVP] processOverflowIntrinsic(): don't crash if constant-folding happenedRoman Lebedev2019-04-171-4/+7
| | | | | | | As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559
* Revert "Add basic loop fusion pass." Per request.Eric Christopher2019-04-173-1214/+0
| | | | | | This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-173-0/+1214
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Remove the run-slp-after-loop-vectorization option.Eric Christopher2019-04-171-12/+3
| | | | | | | It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-173-1214/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546