summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [LoopIdiomRecognize][NFC] Sort includesRoman Lebedev2019-05-301-2/+2
| | | | | | Split off from D61144 llvm-svn: 362091
* [LV] Inform about exactly reason of loop illegalityFlorian Hahn2019-05-301-2/+10
| | | | | | | | | | | | | | | | | | | | | | | Currently, only the following information is provided by LoopVectorizer in the case when the CF of the loop is not legal for vectorization: LV: Can't vectorize the instructions or CFG LV: Not vectorizing: Cannot prove legality. But this information is not enough for the root cause analysis; what is exactly wrong with the loop should also be printed: LV: Not vectorizing: The exiting block is not the loop latch. Patch by Pavel Samolysov. Reviewers: mkuper, hsaito, rengolin, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D62311 llvm-svn: 362056
* LoopVersioningLICM: Respect convergent and noduplicateMatt Arsenault2019-05-291-1/+9
| | | | llvm-svn: 362031
* [LoopIdiomRecognize][NFC] Use DEBUG_TYPE, add LLVM_DEBUG() to ↵Roman Lebedev2019-05-291-2/+8
| | | | | | | | runOnNoncountableLoop() Split off from D61144 llvm-svn: 362022
* [InstCombine] Optimize always overflowing signed saturating add/subNikita Popov2019-05-291-8/+12
| | | | | | | | | Based on the overflow direction information added in D62463, we can now fold always overflowing signed saturating add/sub to signed min/max. Differential Revision: https://reviews.llvm.org/D62544 llvm-svn: 362006
* CallSiteSplitting: Respect convergent and noduplicateMatt Arsenault2019-05-291-0/+3
| | | | llvm-svn: 361990
* [ThinLTO] Use original alias visibility when importingTeresa Johnson2019-05-291-2/+3
| | | | | | | | | | | | | | | | | | | | | Summary: When we import an alias, we do so by making a clone of the aliasee. Just as this clone uses the original alias name and linkage, it should also use the same visibility (not the aliasee's visibility). Otherwise, linker behavior is affected (e.g. if the aliasee was hidden, but the alias is not, the resulting imported clone should not be hidden, otherwise the linker will make the final symbol hidden which is incorrect). Reviewers: wmi Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62535 llvm-svn: 361989
* SpeculateAroundPHIs: Respect convergentMatt Arsenault2019-05-291-0/+8
| | | | llvm-svn: 361957
* [PGO] Handle cases of failing to split critical edgesRong Xu2019-05-281-44/+56
| | | | | | | | | | | Fix PR41279 where critical edges to EHPad are not split. The fix is to not instrument those critical edges. We used to be able to know the size of counters right after MST is computed. With this, we have to pre-collect the instrument BBs to know the size, and then instrument them. Differential Revision: https://reviews.llvm.org/D62439 llvm-svn: 361882
* Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata ↵Nikita Popov2019-05-281-61/+56
| | | | | | | | | | | | handling for SwitchInst" This reverts commit 53f2f3286572cb879b3861d7c15480e4d830dd3b. As reported on D62126, this causes assertion failures if the switch has incorrect branch_weights metadata, which may happen as a result of other transforms not handling it correctly yet. llvm-svn: 361881
* [InstCombine] Clean up saturing math overflow optimizations; NFCNikita Popov2019-05-281-29/+20
| | | | | | | Reduce duplication and make it easier to handle signed always-overflows conditions in the future. llvm-svn: 361863
* [ValueTracking][ConstantRange] Distinguish low/high always overflowNikita Popov2019-05-282-3/+4
| | | | | | | | | | | | In order to fold an always overflowing signed saturating add/sub, we need to know in which direction the always overflow occurs. This patch splits up AlwaysOverflows into AlwaysOverflowsLow and AlwaysOverflowsHigh to pass through this information (but it is not used yet). Differential Revision: https://reviews.llvm.org/D62463 llvm-svn: 361858
* Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: ↵Hans Wennborg2019-05-281-14/+15
| | | | | | | | | | | | | | | | | | | | Also sink function calls without used results (PR41259)" This was reverted in r360086 as it was supected of causing mysterious test failures internally. However, it was never concluded that this patch was the root cause. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 361811
* [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for ↵Yevgeny Rouban2019-05-281-56/+61
| | | | | | | | | | | | | | SwitchInst This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D62126 llvm-svn: 361808
* [LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.Florian Hahn2019-05-261-22/+64
| | | | | | | | | | | | | | | | | | The code to preserve LCSSA PHIs currently only properly supports reduction PHIs and PHIs for values defined outside the latches. This patch improves the LCSSA PHI handling to cover PHIs for values defined in the latches. Fixes PR41725. Reviewers: efriedma, mcrosier, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61576 llvm-svn: 361743
* [SimplifyCFG] back out all SwitchInst commitsShawn Landden2019-05-261-90/+71
| | | | | | | | They caused the sanitizer builds to fail. My suspicion is the change the countLeadingZeros(). llvm-svn: 361736
* [InstCombine] prevent crashing with invalid extractelement indexSanjay Patel2019-05-261-2/+3
| | | | | | | This was found/reduced from a fuzzer report: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956 llvm-svn: 361729
* [SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold ↵Shawn Landden2019-05-261-14/+24
| | | | | | doesn't trigger llvm-svn: 361728
* [SimplifyCFG] Run ReduceSwitchRange unconditionally, generalizeShawn Landden2019-05-261-56/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than gating on "isSwitchDense" (resulting in necessesarily sparse lookup tables even when they were generated), always run this quite cheap transform. This transform is useful not just for generating tables. LowerSwitch also wants this: read LowerSwitch.cpp:257. Be careful to not generate worse code, by introducing a SubThreshold heuristic. Instead of just sorting by signed, generalize the finding of the best base. And now that it is run unconditionally, do not replicate its functionality in SwitchToLookupTable (which could use a Sub when having a hole is smaller, hence the SubThreshold heuristic located in a single place). This simplifies SwitchToLookupTable, and fixes some ugly corner cases due to the use of signed numbers, such as a table containing i16 32768 and 32769, of which 32769 would be interpreted as -32768, and now the code thinks the table is size 65536. (We still use unconditional subtraction when building a single-register mask, but I think this whole block should go when the more general sparse map is added, which doesn't leave empty holes in the table.) And the reason test4 and test5 did not trigger was documented wrong: it was because they were not considered sufficiently "dense". Also, fix generation of invalid LLVM-IR: shl by bit-width. llvm-svn: 361727
* [SimpligyCFG] NFC, remove GCD that was only used for powers of twoShawn Landden2019-05-261-12/+10
| | | | | | | | | | and replace with an equilivent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 llvm-svn: 361726
* [Support] make countLeadingZeros() and countTrailingZeros() return unsignedShawn Landden2019-05-261-11/+12
| | | | | | | | | This matches countLeadingOnes() and countTrailingOnes(), and APInt's countLeadingZeros() and countTrailingZeros(). (as well as __builtin_clzll()) llvm-svn: 361724
* [InstCombine] Refactor OptimizeOverflowCheck; NFCINikita Popov2019-05-262-79/+65
| | | | | | | | | Extract method to compute overflow based on binop and signedness, and then make the result handling code generic. This extends the always-overflow handling to signed muls, but has currently no effect, as we don't compute always overflow for them (thus NFC). llvm-svn: 361721
* [InstCombine] Remove OverflowCheckFlavor; NFCNikita Popov2019-05-263-59/+20
| | | | | | | Instead pass binary op and signedness. The extra enum only makes things more complicated in this case. llvm-svn: 361720
* [SimplifyCFG] Added condition assumption for unreachable blocksDavid Bolvansky2019-05-251-0/+3
| | | | | | | | | | | | | | | | Summary: PR41688 Reviewers: spatel, efriedma, craig.topper, hfinkel, reames Reviewed By: hfinkel Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61409 llvm-svn: 361707
* [CVP] Remove unnecessary checks for empty GNWR; NFCNikita Popov2019-05-251-27/+11
| | | | | | | | | | The guaranteed no-wrap region is never empty, it always contains at least zero, so these optimizations don't ever apply. To make this more obviously true, replace the conversative return in makeGNWR with an assertion. llvm-svn: 361698
* Use the DataLayout::typeSizeEqualsStoreSize helper. NFCBjorn Pettersson2019-05-243-7/+4
| | | | | | | | | Just a minor refactoring to use the new helper method DataLayout::typeSizeEqualsStoreSize(). This is done when checking if getTypeSizeInBits is equal/non-equal to getTypeStoreSizeInBits. llvm-svn: 361613
* StructurizeCFG: Relax uniformity checks.Neil Henning2019-05-241-3/+30
| | | | | | | | | | | | | | | This change relaxes the checks for hasOnlyUniformBranches such that our region is uniform if: 1. All conditional branches that are direct children are uniform. 2. And either: a. All sub-regions are uniform. b. There is one or less conditional branches among the direct children. Differential Revision: https://reviews.llvm.org/D62198 llvm-svn: 361610
* [DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized storesBjorn Pettersson2019-05-241-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The DeadStoreElimination pass now skips doing PartialStoreMerging when stores overlap according to OW_PartialEarlierWithFullLater and at least one of the stores is having a store size that is different from the size of the type being stored. This solves problems seen in https://bugs.llvm.org/show_bug.cgi?id=41949 for which we in the past could end up with mis-compiles or assertions. The content and location of the padding bits is not formally described (or undefined) in the LangRef at the moment. So the solution is chosen based on that we cannot assume anything about the padding bits when having a store that clobbers more memory than indicated by the type of the value that is stored (such as storing an i6 using an 8-bit store instruction). Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949 Reviewers: spatel, efriedma, fhahn Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62250 llvm-svn: 361605
* Revert r361460Eli Friedman2019-05-241-23/+4
| | | | | | | It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). llvm-svn: 361581
* [InstSimplify] fold insertelement-of-extractelementSanjay Patel2019-05-241-5/+0
| | | | | | | | This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. llvm-svn: 361576
* [InstCombine] remove redundant fold for extractelement; NFCSanjay Patel2019-05-231-4/+0
| | | | | | | | The out-of-bounds index pattern is handled by InstSimplify, so the extractelement should be eliminated next time it is visited. llvm-svn: 361570
* [InstCombine] remove redundant fold for insertelement; NFCSanjay Patel2019-05-231-4/+0
| | | | | | The out-of-bounds index pattern is handled by InstSimplify. llvm-svn: 361569
* [NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].Alina Sbirlea2019-05-232-8/+9
| | | | | | | | | | | | | | Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61612 llvm-svn: 361560
* [InstSimplify] insertelement V, undef, ? --> VSanjay Patel2019-05-231-4/+0
| | | | | | | | This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. llvm-svn: 361559
* Revert [LOOPINFO] Extend Loop object to add utilities to get the loop ↵Kit Barton2019-05-231-1/+28
| | | | | | | | bounds, step, induction variable, and guard branch. This reverts r361517 (git commit 2049e4dd8f61100f88f14db33bd95d197bcbfbbc) llvm-svn: 361553
* [SLPVectorizer] Set flag to previous default.Alina Sbirlea2019-05-231-1/+1
| | | | | | | | | | | | | | | | | Summary: The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before. The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61955 llvm-svn: 361537
* [InstCombine] be more careful when transforming a shuffle maskSanjay Patel2019-05-231-4/+21
| | | | | | | | | | | This is reduced from a fuzzer test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890 Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case. llvm-svn: 361533
* [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, ↵Kit Barton2019-05-231-28/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | step, induction variable, and guard branch. Summary: This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getGuard --> if (guardcmp) goto loop; else goto afterloop /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Committed on behalf of @Whitney (Whitney Tsang). Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 361517
* Transforms: lower fadd and fsub atomicrmw instructionsSaleem Abdulrasool2019-05-231-0/+6
| | | | | | | | | | `fadd` and `fsub` have recently (r351850) been added as `atomicrmw` operations. This diff adds lowering cases for them to the LowerAtomic transform. Patch by Josh Berdine! llvm-svn: 361512
* [MergeICmps] Make the pass compatible with the new pass manager.Clement Courbet2019-05-232-72/+78
| | | | | | | | | | | | Reviewers: gchatelet, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62287 llvm-svn: 361490
* [GlobalOpt] recognize dead struct fields and propagate valuesChristian Bruel2019-05-231-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Allow struct fields SRA and dead stores. This works by considering fields accesses from getElementPtr to be considered as a possible pointer root that can be cleaned up. We check that the variable can be SRA by recursively checking the sub expressions with the new isSafeSubSROAGEP function. basically this allows the array in following C code to be optimized out struct Expr { int a[2]; int b; }; static struct Expr e; int foo (int i) { e.b = 2; e.a[i] = 1; return e.b; } Reviewers: greened, bkramer, nicholas, jmolloy Reviewed By: jmolloy Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61911 llvm-svn: 361460
* [X86][InstCombine] Remove InstCombine code that turns X86 round intrinsics ↵Craig Topper2019-05-221-122/+0
| | | | | | | | | | | | | | | | | | | | | | | into llvm.ceil/floor. Remove some isel patterns that existed because that was happening. We were turning roundss/sd/ps/pd intrinsics with immediates of 1 or 2 into llvm.floor/ceil. The llvm.ceil/floor intrinsics are supposed to correspond to the libm functions. For the libm functions we need to disable the precision exception so the llvm.floor/ceil functions should always map to encodings 0x9 and 0xA. We had a mix of isel patterns where some used 0x9 and 0xA and others used 0x1 and 0x2. We need to be consistent and always use 0x9 and 0xA. Since we have no way in isel of knowing where the llvm.ceil/floor came from, we can't map X86 specific intrinsics with encodings 1 or 2 to it. We could map 0x9 and 0xA to llvm.ceil/floor instead, but I'd really like to see a use case and optimization advantage first. I've left the backend test cases to show the blend we now emit without the extra isel patterns. But I've removed the InstCombine tests completely. llvm-svn: 361425
* [PGO][CHR] Speed up following long use-def chains.Hiroshi Yamauchi2019-05-221-9/+25
| | | | | | | | | | | | | | | | Summary: Avoid visiting an instruction more than once by using a map. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62262 llvm-svn: 361416
* LoopVectorizationCostModel::selectInterleaveCount - assert we have a ↵Simon Pilgrim2019-05-221-0/+2
| | | | | | | | non-zero loop cost. NFCI. The input LoopCost value can be zero, but if so it should be recalculated with the current VF. After that it should always be non-zero. llvm-svn: 361387
* Re-land r361257 "[MergeICmps][NFC] Make BCEAtom move-only.""Clement Courbet2019-05-221-5/+19
| | | | llvm-svn: 361366
* [InstCombine] fold shuffles of insert_subvectorsSanjay Patel2019-05-221-1/+52
| | | | | | | | | | | | | | | | | This should be a valid exception to the general rule of not creating new shuffle masks in IR... because we already do it. :) Also, DAG combining/legalization will undo this by widening the shuffle back out if needed. Explanation for how we already do this: SLP or vector source can create chains of insert/extract as shown in 1 of the examples from PR16739: https://godbolt.org/z/NlK7rA https://bugs.llvm.org/show_bug.cgi?id=16739 And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles. Differential Revision: https://reviews.llvm.org/D62024 llvm-svn: 361338
* [MergeICmps] Make sorting strongly stable on the rhs.Clement Courbet2019-05-211-1/+2
| | | | | | | | | | | | | | | | | | | Summary: Because the sort order was not strongly stable on the RHS, whether the chain could merge would depend on the order of the blocks in the Phi. EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in non-deterministic merging. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, RKSimon Tags: #llvm Differential Revision: https://reviews.llvm.org/D62193 llvm-svn: 361281
* Revert r361257 "[MergeICmps][NFC] Make BCEAtom move-only."Clement Courbet2019-05-211-20/+6
| | | | | | Broke some bots. llvm-svn: 361263
* [MergeICmps][NFC] Make BCEAtom move-only.Clement Courbet2019-05-211-6/+20
| | | | | | | | And handle for self-move. This is required so that llvm::sort can work with EXPENSIVE_CHECKS, as it will do a random shuffle of the input which can result in self-moves. llvm-svn: 361257
* Revert r360902 "Resubmit: [Salvage] Change salvage debug info ..."Bob Haarman2019-05-211-21/+2
| | | | | | | | | | | This reverts commit rr360902. It caused an assertion failure in lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <= FragmentSizeInBits) && "new fragment outside of original fragment"' failed. PR41931. llvm-svn: 361246
OpenPOWER on IntegriCloud