path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it | Roman Lebedev | 2019-05-05 | 2 | -23/+4
    Summary: There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()` and `PHINode::getNumOperands()`, but no function to replace every specified `BasicBlock*` predecessor with some other specified `BasicBlock*`. Clearly, there are a lot of places that could use that functionality.
    Reviewers: chandlerc, craig.topper, spatel, danielcdh
    Reviewed By: craig.topper
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61011
    llvm-svn: 359995
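The helper described above can be illustrated with a small standalone model. This is only a sketch: `Block` and `Phi` are hypothetical stand-ins rather than the LLVM classes, and returning the number of rewritten entries is an addition made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
  std::string Name;
};

// Hypothetical stand-in for a PHI node: one (value, predecessor) pair per
// incoming edge. The same predecessor may appear more than once.
struct Phi {
  std::vector<std::pair<int, Block *>> Incoming;

  // Rewrite every incoming-block entry equal to Old so that it names New.
  // Returns the number of rewritten entries (illustrative only).
  std::size_t replaceIncomingBlockWith(const Block *Old, Block *New) {
    assert(New && "replacement block must not be null");
    std::size_t Replaced = 0;
    for (auto &Edge : Incoming) {
      if (Edge.second == Old) {
        Edge.second = New;
        ++Replaced;
      }
    }
    return Replaced;
  }
};
```

The point of a single helper is that callers no longer need an index lookup plus a per-index update loop at each call site.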
* [NFC] Instruction: introduce replaceSuccessorWith() function, use it | Roman Lebedev | 2019-05-05 | 1 | -3/+1
    Summary: There is `Instruction::getNumSuccessors()`, `Instruction::getSuccessor()` and `Instruction::setSuccessor()`, but no function to replace every specified `BasicBlock*` successor with some other specified `BasicBlock*`. I've found one place where it should clearly be used.
    Reviewers: chandlerc, craig.topper, spatel, danielcdh
    Reviewed By: craig.topper
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61010
    llvm-svn: 359994
* [NFC][Utils] deleteDeadLoop(): add an assert that exit block has some non-PHI instruction | Roman Lebedev | 2019-05-05 | 1 | -3/+7
    Summary: If `deleteDeadLoop()` is called on a loop that has a "bad" exit block, e.g. one that has no terminator instruction, `DIBuilder::insertDbgValueIntrinsic()` will be told to insert the Dbg Value Intrinsic after `nullptr` (since there is no first non-PHI instruction), which will cause it to not insert those instructions into any basic block. The instructions will be parent-less, and the IR verifier will complain. It is rather obvious to track down the root cause when that happens, so let's just assert it never happens.
    Reviewers: sanjoy, davide, vsk
    Reviewed By: vsk
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61008
    llvm-svn: 359993
* [SLPVectorizer] Prefer pre-increments. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -3/+3
    llvm-svn: 359989
* [SLPVectorizer] Make getSpillCost() const. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -2/+9
    Ideally getTreeCost() should be const as well but non-const Type creation would need to be addressed first.
    llvm-svn: 359975
* Move Value *RHSCIOp def into the scope where it's actually used. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -2/+1
    llvm-svn: 359973
* remove inalloca parameters in globalopt and simplify argpromotion | Bob Haarman | 2019-05-02 | 2 | -27/+31
    Summary: Inalloca parameters require special handling in some optimizations. This change causes globalopt to strip the inalloca attribute from function parameters when it is safe to do so, removes the special handling for inallocas from argpromotion, and replaces it with a simple check that causes argpromotion to skip functions that receive inallocas (for when the pass is invoked on code that didn't run through globalopt first). This also avoids a case where argpromotion would incorrectly try to pass an inalloca in a register.
    Fixes PR41658.
    Reviewers: rnk, efriedma
    Reviewed By: rnk
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61286
    llvm-svn: 359743
* [PGO][CHR] A bug fix. | Hiroshi Yamauchi | 2019-05-01 | 1 | -6/+21
    Summary: Fix a transformation bug where two scopes share a common instruction to hoist.
    Reviewers: davidxl
    Reviewed By: davidxl
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61405
    llvm-svn: 359736
* [InstCombine] Limit a vector demanded elts rule which was producing invalid IR. | Philip Reames | 2019-04-30 | 1 | -0/+12
    The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs.
    The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.
    This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.
    llvm-svn: 359633
* [PassManagerBuilder] Add option for interleaved loops, for loop vectorize. | Alina Sbirlea | 2019-04-30 | 1 | -4/+2
    Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults.
    Reviewers: chandlerc
    Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61030
    llvm-svn: 359615
* [SimplifyLibCalls] Clean up code (NFC) | Evandro Menezes | 2019-04-30 | 1 | -6/+8
    Fix pointer check after dereferencing (PR41665).
    llvm-svn: 359595
* MSan: handle llvm.lifetime.start intrinsic | Alexander Potapenko | 2019-04-30 | 1 | -8/+54
    Summary: When a variable goes into scope several times within a single function or when two variables from different scopes share a stack slot it may be incorrect to poison such scoped locals at the beginning of the function. In the former case it may lead to false negatives (see https://github.com/google/sanitizers/issues/590), in the latter - to incorrect reports (because only one origin remains on the stack).
    If Clang emits lifetime intrinsics for such scoped variables we insert code poisoning them after each call to llvm.lifetime.start(). If for a certain intrinsic we fail to find a corresponding alloca, we fall back to poisoning allocas for the whole function, as it's now impossible to tell which alloca was missed.
    The new instrumentation may slow down hot loops containing local variables with lifetime intrinsics, so we allow disabling it with -mllvm -msan-handle-lifetime-intrinsics=false.
    Reviewers: eugenis, pcc
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60617
    llvm-svn: 359536
* [InstCombine] reduce code duplication; NFC | Sanjay Patel | 2019-04-29 | 1 | -5/+7
    Follow-up to: rL359482
    Avoid this potential problem throughout by giving the type a name and verifying the assumption that both operands are the same type.
    llvm-svn: 359485
* [InstCombine] visitFCmpInst - appease copy+paste pattern warning. NFCI. | Simon Pilgrim | 2019-04-29 | 1 | -1/+1
    PVS Studio's copy+paste recognizer was seeing this as a typo. Technically Op0/Op1 in a fcmp should always be the same type, but we might as well avoid the issue.
    Reported in https://www.viva64.com/en/b/0629/
    llvm-svn: 359482
* [BlockExtractor] Expose a constructor for the group extraction | Quentin Colombet | 2019-04-29 | 1 | -3/+29
    NFC
    Differential Revision: https://reviews.llvm.org/D60971
    llvm-svn: 359463
* [BlockExtractor] Change the basic block separator from ',' to ';' | Quentin Colombet | 2019-04-29 | 1 | -1/+1
    This change aims at making the file format be compatible with the way LLVM handles command line options.
    Differential Revision: https://reviews.llvm.org/D60970
    llvm-svn: 359462
* [LoopSimplifyCFG] Suppress expensive DomTree verification | Yevgeny Rouban | 2019-04-29 | 1 | -1/+7
    This patch makes verification level lower for builds with inexpensive checks.
    Differential Revision: https://reviews.llvm.org/D61055
    llvm-svn: 359446
* [InferAddressSpaces] Add AS parameter to the pass factory | Sven van Haastregt | 2019-04-26 | 1 | -6/+11
    This enables the pass to be used in the absence of TargetTransformInfo. When the argument isn't passed, the factory defaults to UninitializedAddressSpace and the flat address space is obtained from the TargetTransformInfo as before this change. Existing users won't have to change.
    Patch by Kevin Petit.
    Differential Revision: https://reviews.llvm.org/D60602
    llvm-svn: 359290
* [GlobalOpt] Swap the expensive check for cold calls with the cheap TTI check | Justin Bogner | 2019-04-26 | 1 | -2/+2
    isValidCandidateForColdCC is much more expensive than TTI.useColdCCForColdCall, which by default just returns false. Avoid doing this work if we're not going to look at the answer anyway.
    This change is NFC, but I see significant compile time improvements on some code with pathologically many functions.
    llvm-svn: 359253
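The reordering described above relies on `&&` short-circuiting. Below is a minimal standalone sketch of the idea; all names in it are hypothetical stand-ins, not the GlobalOpt code, and the call counters exist only to make the effect observable.

```cpp
#include <cassert>

// Call counters, used only to make the evaluation order observable.
struct Counters {
  int CheapCalls = 0;
  int ExpensiveCalls = 0;
};

// Stand-in for a cheap target query that usually answers "no".
bool cheapTargetQuery(Counters &C, bool TargetWantsColdCC) {
  ++C.CheapCalls;
  return TargetWantsColdCC;
}

// Stand-in for an expensive per-function analysis.
bool expensiveCandidateAnalysis(Counters &C) {
  ++C.ExpensiveCalls;
  return true;
}

// Cheap check first: && evaluates left to right and stops at the first
// false operand, so the expensive analysis is skipped when the target
// says no.
bool shouldUseColdCC(Counters &C, bool TargetWantsColdCC) {
  const bool Result =
      cheapTargetQuery(C, TargetWantsColdCC) && expensiveCandidateAnalysis(C);
  assert(C.ExpensiveCalls <= C.CheapCalls &&
         "expensive analysis must only run after the cheap query");
  return Result;
}
```

Because both operands are side-effect free in terms of program results, swapping them changes only how much work is done, which is why such a change can be NFC and still improve compile time.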
* Assigning to a local object in a return statement prevents copy elision. NFC. | David Blaikie | 2019-04-25 | 1 | -4/+6
    I added a diagnostic along the lines of `-Wpessimizing-move` to detect `return x = y` suppressing copy elision, but I don't know if the diagnostic is really worth it. Anyway, here are the places where my diagnostic reported that copy elision would have been possible if not for the assignment.
    P1155R1 in the post-San-Diego WG21 (C++ committee) mailing discusses whether WG21 should fix this pitfall by just changing the core language to permit copy elision in cases like these.
    (Kona update: The bulk of P1155 is proceeding to CWG review, but specifically *not* the parts that explored the notion of permitting copy-elision in these specific cases.)
    Reviewed By: dblaikie
    Author: Arthur O'Dwyer
    Differential Revision: https://reviews.llvm.org/D54885
    llvm-svn: 359236
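The `return x = y` pitfall can be made visible with a type that counts copy constructions. This is a standalone sketch; `Tracked` and the three functions are hypothetical names introduced for illustration, not code from the patch.

```cpp
#include <cassert>

// Hypothetical type that counts copy-constructor calls.
struct Tracked {
  static int CopyCtorCalls;
  Tracked() = default;
  Tracked(const Tracked &) { ++CopyCtorCalls; }
  Tracked(Tracked &&) noexcept {}
  Tracked &operator=(const Tracked &) { return *this; }
  Tracked &operator=(Tracked &&) noexcept { return *this; }
};
int Tracked::CopyCtorCalls = 0;

// The operand of return is the lvalue `Tracked &` produced by operator=,
// not the name of a local, so the return value is copy-constructed.
Tracked assignInReturn(const Tracked &Src) {
  Tracked Result;
  return Result = Src;
}

// The operand of return names a local object, so the compiler may elide
// the copy (NRVO) or, failing that, move from Result.
Tracked assignThenReturn(const Tracked &Src) {
  Tracked Result;
  Result = Src;
  return Result;
}

// Count the copy constructions triggered by one call to Fn.
int copyCtorCallsDuring(Tracked (*Fn)(const Tracked &)) {
  Tracked Src;
  const int Before = Tracked::CopyCtorCalls;
  Tracked Out = Fn(Src);
  (void)Out;
  const int After = Tracked::CopyCtorCalls;
  assert(After >= Before && "counter only ever increases");
  return After - Before;
}
```

Splitting the assignment from the `return` is the shape of fix the commit title points at: the returned expression becomes a plain local name again.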
* [ObjC][ARC] Let ARC optimizer bail out if the number of pointer states it keeps track of becomes too large | Akira Hatanaka | 2019-04-25 | 1 | -2/+42
    ARC optimizer does a top-down and a bottom-up traversal of the whole function to pair up retain and release instructions and remove them. This can be expensive if the number of instructions in the function and pointer states it tracks are large since it has to look at each pointer state and determine whether the instruction being visited can potentially use the pointer.
    This patch adds a command line option that sets a limit to the number of pointers it tracks.
    rdar://problem/49477063
    Differential Revision: https://reviews.llvm.org/D61100
    llvm-svn: 359226
* [Evaluator] Walk initial elements when handling load through bitcast | Robert Lougher | 2019-04-25 | 1 | -38/+65
    When evaluating a store through a bitcast, the evaluator tries to move the bitcast from the pointer onto the stored value. If the cast is invalid, it tries to "introspect" the type to get a valid cast by obtaining a pointer to the initial element (if the type is nested, this may require walking several initial elements).
    In some situations it is possible to get a bitcast on a load (e.g. with unions, where the bitcast may not be the same type as the store). However, equivalent logic to the store to introspect the type is missing. This patch adds this logic.
    Note, when developing the patch I was unhappy with adding similar logic directly to the load case as it could get out of step. Instead, I have abstracted the "introspection" into a helper function, with the specifics being handled by a passed-in lambda function.
    Differential Revision: https://reviews.llvm.org/D60793
    llvm-svn: 359205
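The "shared walk, caller-supplied lambda" structure mentioned above might look roughly like the following. This is a heavily simplified standalone model: `Type`, `walkInitialElements` and the acceptance callback are hypothetical and do not mirror the Evaluator's real types or signatures.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical type model: a scalar has no elements; an aggregate's
// initial element is Elements.front().
struct Type {
  std::string Name;
  std::vector<const Type *> Elements;
};

// Shared helper: starting at Ty, keep stepping into the initial element
// until Accept succeeds. The store and load cases would each pass their
// own Accept, so the walking logic lives in one place.
// Returns the accepted type, or nullptr if no level is accepted.
const Type *walkInitialElements(
    const Type *Ty, const std::function<bool(const Type *)> &Accept) {
  assert(Accept && "an acceptance callback is required");
  while (Ty) {
    if (Accept(Ty))
      return Ty;
    Ty = Ty->Elements.empty() ? nullptr : Ty->Elements.front();
  }
  return nullptr;
}
```

A caller looking for an `i32` inside `{ { i32, ... }, ... }` would pass a lambda that compares names and would get the innermost type back after two steps.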
* [InstCombine][X86] Tweak generic expansion of PACKSS/PACKUS to shuffle then truncate. NFCI. | Simon Pilgrim | 2019-04-25 | 1 | -7/+4
    This has no effect on constant folding but will be useful when we expand non-saturating PACKSS/PACKUS intrinsics.
    llvm-svn: 359191
* Fix include order. NFCI. | Simon Pilgrim | 2019-04-25 | 1 | -1/+1
    llvm-svn: 359177
* Enable LoopVectorization by default. | Alina Sbirlea | 2019-04-25 | 1 | -1/+1
    Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.
    Reviewers: chandlerc, jgorbe
    Subscribers: jlebar, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61091
    llvm-svn: 359167
* Consolidate existing utilities for interpreting vector predicate masks [NFC] | Philip Reames | 2019-04-25 | 1 | -31/+1
    llvm-svn: 359163
* Fix unused variable warning in LoopFusion pass. | Kit Barton | 2019-04-25 | 1 | -7/+5
    Do not wrap the contents of printFusionCandidates in the LLVM_DEBUG macro. This fixes an unused variable warning generated when compiling without asserts but with -DENABLE_LLVM_DUMP.
    Differential Revision: https://reviews.llvm.org/D61035
    llvm-svn: 359161
* [InstCombine] Be consistent w/handling of masked intrinsics style wise [NFC] | Philip Reames | 2019-04-25 | 2 | -5/+6
    llvm-svn: 359160
* [SLP] Fix crash after r358519, by V. Porpodas. | Alexey Bataev | 2019-04-24 | 1 | -1/+2
    Summary: The code did not check if operand was undef before casting it to Instruction.
    Reviewers: RKSimon, ABataev, dtemirbulatov
    Reviewed By: ABataev
    Subscribers: uabelho
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61024
    llvm-svn: 359136
* [InstCombine][X86] Use generic expansion of PACKSS/PACKUS for constant folding. NFCI. | Simon Pilgrim | 2019-04-24 | 1 | -51/+45
    This patch rewrites the existing PACKSS/PACKUS constant folding code to expand as a generic expansion.
    This is a first NFCI step toward expanding PACKSS/PACKUS intrinsics which are acting as non-saturating truncations (although technically the expansion could be used in all cases - but we'll probably want to be conservative).
    llvm-svn: 359111
* Add "const" in GetUnderlyingObjects. NFC | Bjorn Pettersson | 2019-04-24 | 3 | -15/+15
    Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const.
    It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with having those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts.
    Reviewers: hfinkel, materi, jkorous
    Reviewed By: jkorous
    Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61038
    llvm-svn: 359072
* [CommandLine] Provide parser<unsigned long> instantiation to allow cl::opt<uint64_t> on LP64 platforms | Fangrui Song | 2019-04-24 | 3 | -20/+20
    Summary: And migrate opt<unsigned long long> to opt<uint64_t>
    Fixes PR19665
    Differential Revision: https://reviews.llvm.org/D60933
    llvm-svn: 359068
* The error message for mismatched value sites is very cryptic. | Dmitry Mikulin | 2019-04-23 | 1 | -2/+10
    Make it more readable for an average user.
    Differential Revision: https://reviews.llvm.org/D60896
    llvm-svn: 359043
* [MemorySSA] LCSSA preserves MemorySSA. | Alina Sbirlea | 2019-04-23 | 1 | -1/+4
    Summary: Enabling MemorySSA in the old pass manager leads to MemorySSA being run twice due to the fact that LCSSA and LoopSimplify do not preserve MemorySSA. This is the first step to address that: target LCSSA.
    LCSSA does not make any changes that invalidate MemorySSA, so it preserves it by design. It must preserve AA as well, for this to hold.
    After this patch, MemorySSA is still run twice in the old pass manager. Step two follows: target LoopSimplify.
    Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60832
    llvm-svn: 359032
* [ObjC][ARC] Check the basic block size before calling DominatorTree::dominate. | Akira Hatanaka | 2019-04-23 | 1 | -1/+34
    ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example:
        ; Before optimization
        %1 = tail call i8* @foo1()
        %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
        store i8* %1, i8** @g0, align 8
        ; After optimization
        %1 = tail call i8* @foo1()
        %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
        store i8* %2, i8** @g0, align 8 // %1 is replaced with %2
    Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size".
    rdar://problem/49477063
    Differential Revision: https://reviews.llvm.org/D60900
    llvm-svn: 359027
* [InstCombine] Convert a masked.load of a dereferenceable address to an unconditional load | Philip Reames | 2019-04-23 | 1 | -4/+14
    If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable.
    Differential Revision: https://reviews.llvm.org/D59703
    llvm-svn: 359000
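The equivalence behind this transform can be sketched with plain arrays. This is a scalar model under stated assumptions: a fixed lane count of 4, `int` elements, and hypothetical function names; it says nothing about how InstCombine itself is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t Lanes = 4;
using Vec = std::array<int, Lanes>;
using Mask = std::array<bool, Lanes>;

// Reference semantics of a masked load: memory is only touched for lanes
// whose mask bit is set; other lanes take the pass-through value.
Vec maskedLoad(const int *Ptr, const Mask &M, const Vec &PassThru) {
  Vec Result = PassThru;
  for (std::size_t I = 0; I < Lanes; ++I)
    if (M[I])
      Result[I] = Ptr[I];
  return Result;
}

// Rewritten form: load every lane unconditionally, then select per lane.
// Only valid when all Lanes elements at Ptr are known to be readable.
Vec loadThenSelect(const int *Ptr, const Mask &M, const Vec &PassThru) {
  assert(Ptr && "caller must guarantee that every lane is dereferenceable");
  Vec Loaded{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Loaded[I] = Ptr[I]; // unconditional (speculative) load
  Vec Result{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Result[I] = M[I] ? Loaded[I] : PassThru[I]; // select(mask, load, passthru)
  return Result;
}
```

The two functions agree whenever the whole vector is readable; the second form reads lanes the first would have skipped, which is why dereferenceability is the precondition.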
* Use llvm::stable_sort | Fangrui Song | 2019-04-23 | 14 | -45/+35
    While touching the code, simplify if feasible.
    llvm-svn: 358996
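For readers unfamiliar with range-based sort wrappers, the idea can be sketched as below. `sketch::stable_sort` is a hypothetical standalone imitation written for this note, not the LLVM definition; `Item` and `sortByPriority` are likewise illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace sketch {
// Range-based wrapper: callers pass the container instead of a
// begin/end iterator pair.
template <typename Range, typename Compare>
void stable_sort(Range &&R, Compare C) {
  std::stable_sort(std::begin(R), std::end(R), C);
}
} // namespace sketch

using Item = std::pair<int, std::string>;

// Sort by priority only. Because the sort is stable, items with equal
// priority keep their original relative order.
std::vector<Item> sortByPriority(std::vector<Item> Items) {
  auto ByPriority = [](const Item &A, const Item &B) {
    return A.first < B.first;
  };
  sketch::stable_sort(Items, ByPriority);
  assert(std::is_sorted(Items.begin(), Items.end(), ByPriority));
  return Items;
}
```

Compared with `std::stable_sort(Items.begin(), Items.end(), ...)`, the wrapper removes the repeated container name, which is the kind of call-site simplification a diffstat like this one suggests.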
* [CallSite removal] move InlineCost to CallBase usage | Fedor Sergeev | 2019-04-23 | 4 | -13/+16
    Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted.
    Reviewed By: reames
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60636
    llvm-svn: 358982
* [LSR] Limit the recursion for setup cost | David Green | 2019-04-23 | 1 | -11/+14
    In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long.
    Thanks to @reames for the report and test case.
    Differential Revision: https://reviews.llvm.org/D60944
    llvm-svn: 358958
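Why an expression can be cheap to build but costly to evaluate recursively is easiest to see on a DAG with shared operands. The sketch below is a hypothetical model: `Expr`, `makeDoublingDag`, the cost formula, and the choice to return 0 once the budget runs out are all assumptions made for illustration, not the LSR implementation.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical expression node; operands may be shared between parents.
struct Expr {
  std::vector<std::shared_ptr<const Expr>> Operands;
};

// Build a DAG with Levels + 1 nodes in which every non-leaf node uses the
// node below it twice. A naive recursive walk visits 2^Levels leaves.
std::shared_ptr<const Expr> makeDoublingDag(unsigned Levels) {
  std::shared_ptr<const Expr> Node = std::make_shared<Expr>();
  for (unsigned I = 0; I < Levels; ++I) {
    auto Parent = std::make_shared<Expr>();
    Parent->Operands = {Node, Node};
    Node = Parent;
  }
  return Node;
}

// Depth-limited cost: leaves cost 1, inner nodes cost 1 plus their
// operands, and nothing below the depth budget is explored.
unsigned setupCost(const Expr *E, unsigned Depth) {
  assert(E && "expression must not be null");
  if (E->Operands.empty())
    return 1;
  if (Depth == 0)
    return 0; // budget exhausted: stop descending
  unsigned Cost = 1;
  for (const std::shared_ptr<const Expr> &Op : E->Operands)
    Cost += setupCost(Op.get(), Depth - 1);
  return Cost;
}
```

With a budget of `Depth`, the walk touches on the order of 2^Depth nodes regardless of how deep the DAG is, so a 40-level DAG costs about as much to evaluate as a 4-level one.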
* [InstCombine] Eliminate stores to constant memory | Philip Reames | 2019-04-22 | 2 | -0/+24
    If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store.
    The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead.
    Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date.
    Differential Revision: https://reviews.llvm.org/D60659
    llvm-svn: 358919
* [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine | Philip Reames | 2019-04-22 | 1 | -5/+0
    In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask.
    llvm-svn: 358913
* [IPSCCP] Add missing `AssumptionCacheTracker` dependency | Justin Bogner | 2019-04-22 | 1 | -0/+1
    Back in August, r340525 introduced a dependency on the assumption cache tracker in the ipsccp pass, but that commit missed a call to INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache improperly registered if SCCP is the only thing that pulls it in.
    llvm-svn: 358903
* [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, LoopSimplify) | Philip Reames | 2019-04-22 | 2 | -0/+13
    Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers.
    This patch avoids an invalidation between the two requires in the following trivial pass pipeline:
        opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>"
    (when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops)
    Differential Revision: https://reviews.llvm.org/D60790
    llvm-svn: 358901
* Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC" | Nikita Popov | 2019-04-22 | 1 | -3/+3
    This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.
    After thinking about this more, this isn't right: the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function.
    llvm-svn: 358876
* [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC | Nikita Popov | 2019-04-22 | 1 | -3/+3
    Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion().
    llvm-svn: 358875
* [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw. | Luqman Aden | 2019-04-20 | 1 | -25/+26
    Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.
    Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)
    Reviewers: sanjoy, apilipenko, nikic
    Reviewed By: nikic
    Subscribers: hiraditya, jfb, jdoerfert, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60036
    llvm-svn: 358816
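The range reasoning behind marking a `sub` as non-wrapping can be sketched with closed 8-bit intervals. This is a simplified model and an assumption-laden one: `URange`, `SRange` and both functions are hypothetical, and real constant ranges are half-open and may wrap, which this sketch deliberately excludes.

```cpp
#include <cassert>
#include <cstdint>

// Closed, non-wrapping interval [Lo, Hi] of unsigned 8-bit values.
struct URange {
  std::uint8_t Lo;
  std::uint8_t Hi;
};

// Closed, non-wrapping interval [Lo, Hi] of signed 8-bit values.
struct SRange {
  std::int8_t Lo;
  std::int8_t Hi;
};

// L - R cannot wrap below zero if the smallest L is at least the
// largest R. Operand order matters because sub is not commutative.
bool subNeverUnsignedWraps(URange L, URange R) {
  assert(L.Lo <= L.Hi && R.Lo <= R.Hi && "ranges must be non-wrapping");
  return L.Lo >= R.Hi;
}

// L - R stays within [-128, 127] if both extreme differences do. The
// extremes are computed in a wider type so they cannot overflow.
bool subNeverSignedWraps(SRange L, SRange R) {
  assert(L.Lo <= L.Hi && R.Lo <= R.Hi && "ranges must be non-wrapping");
  const int Min = static_cast<int>(L.Lo) - static_cast<int>(R.Hi);
  const int Max = static_cast<int>(L.Hi) - static_cast<int>(R.Lo);
  return Min >= INT8_MIN && Max <= INT8_MAX;
}
```

Swapping the two arguments generally changes the answer, which is the subtlety the summary mentions about which range is passed as "Other".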
* [GVN+LICM] Use line 0 locations for better crash attribution | Vedant Kumar | 2019-04-19 | 2 | -10/+8
    This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables.
    Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope.
    Differential Revision: https://reviews.llvm.org/D60913
    llvm-svn: 358791
* Remove the EnableEarlyCSEMemSSA set of options from the legacy and new pass managers. | Eric Christopher | 2019-04-19 | 1 | -5/+1
    They defaulted to true and were not being used.
    Differential Revision: https://reviews.llvm.org/D60747
    llvm-svn: 358789
* [LICM & MemorySSA] Make limit flags pass tuning options. | Alina Sbirlea | 2019-04-19 | 3 | -40/+55
    Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers.
    Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60490
    llvm-svn: 358772
* [NewPassManager] Adding pass tuning options: loop vectorize. | Alina Sbirlea | 2019-04-19 | 2 | -5/+14
    Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize.
    Reviewers: chandlerc
    Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D59723
    llvm-svn: 358763