path: root/llvm/lib/Transforms
Commit message (Author, Age, Files, Lines)
* [InstCombine] use m_OneUse to reduce code; NFCISanjay Patel2017-05-151-6/+7
| | | | llvm-svn: 303090
* [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.Craig Topper2017-05-157-30/+20
| | | | | | | | This patch finishes off the conversion of ComputeSignBit to computeKnownBits. Differential Revision: https://reviews.llvm.org/D33166 llvm-svn: 303035
* [InstCombine] Merge duplicate functionality between InstCombine and ValueTracking (Craig Topper, 2017-05-15, 3 files, -104/+26)
    Summary: Merge overflow computation for signed add, which appeared in both InstCombine and ValueTracking. As part of the merge, clean up the interface for overflow checks in InstCombine.
    Patch by Yoav Ben-Shalom.
    Reviewers: craig.topper, majnemer
    Reviewed By: craig.topper
    Subscribers: takuto.ikuta, llvm-commits
    Differential Revision: https://reviews.llvm.org/D32946
    llvm-svn: 303029
* [InstCombine] Remove 'return' of a called function that also returned void. NFCCraig Topper2017-05-151-3/+2
| | | | llvm-svn: 303028
* Fix test failure on windows -- do not return deleted funcXinliang David Li2017-05-141-2/+8
| | | | llvm-svn: 302999
* [LoopOptimizer][Fix] PR32859, PR24738 (Simon Pilgrim, 2017-05-13, 1 file, -7/+9)
    The Loop vectorizer pass introduced an undef value while fixing up the LCSSA form of its output. Here it is:
      before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
      after:  %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]
    and after this change we have:
      %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
      %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]
    Committed on behalf of @dtemirbulatov
    Differential Revision: https://reviews.llvm.org/D33055
    llvm-svn: 302988
* [InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation (Craig Topper, 2017-05-13, 1 file, -5/+4)
    Summary: If the Worklist build causes an IR change, this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside-the-loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now.
    This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shouldn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this.
    This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that.
    Reviewers: davide, majnemer, spatel, chandlerc
    Reviewed By: davide
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D32453
    llvm-svn: 302982
* [PartialInlining] Profile based cost analysis (Xinliang David Li, 2017-05-12, 1 file, -45/+363)
    Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turned on in the pipeline (in a follow-up).
    Differential Revision: http://reviews.llvm.org/D32783
    llvm-svn: 302967
* [KnownBits] Add bit counting methods to KnownBits struct and use them where possible (Craig Topper, 2017-05-12, 6 files, -11/+11)
    This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
    Differential Revision: https://reviews.llvm.org/D32931
    llvm-svn: 302925
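The min/max distinction described in this commit message can be illustrated with a small stand-alone sketch. This is not the LLVM `KnownBits` struct: `KnownBits32` and the free functions below are hypothetical 32-bit stand-ins, assuming the usual convention that a set bit in `Zero` means "known 0" and a set bit in `One` means "known 1".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit stand-in for a known-bits record (not the LLVM type).
// A set bit in Zero means "known to be 0"; a set bit in One means "known to be 1".
struct KnownBits32 {
  uint32_t Zero = 0;
  uint32_t One = 0;
};

// Number of consecutive set bits, starting from the most significant bit.
inline unsigned countLeadingOnes(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit); Bit >>= 1)
    ++N;
  return N;
}

// Number of set bits.
inline unsigned popCount(uint32_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
}

// Min: only bits that are known to be zero are counted.
inline unsigned countMinLeadingZeros(const KnownBits32 &K) {
  assert((K.Zero & K.One) == 0 && "a bit cannot be known 0 and known 1");
  return countLeadingOnes(K.Zero);
}

// Max: every bit that is not known to be one might turn out to be zero.
inline unsigned countMaxLeadingZeros(const KnownBits32 &K) {
  return countLeadingOnes(~K.One);
}

// Min population: only the bits known to be one.
inline unsigned countMinPopulation(const KnownBits32 &K) {
  return popCount(K.One);
}

// Max population: every bit that is not known to be zero.
inline unsigned countMaxPopulation(const KnownBits32 &K) {
  return popCount(~K.Zero);
}
```

For a value whose top 8 bits are known zero and whose lowest bit is known one, the min methods report 8 leading zeros and a population of 1, while the max methods report 31 and 24.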
* [NewGVN] Improve debug output a bit. NFCI.Davide Italiano2017-05-121-1/+1
| | | | | | | While debugging a predicate info problem, I noticed this was missing a newline, making the debug output slightly less readable. llvm-svn: 302908
* [NewGVN] Format an assertion and fix a typo. NFCI.Davide Italiano2017-05-121-3/+2
| | | | llvm-svn: 302906
* [NewGVN] Don't incorrectly reset the memory leader.Davide Italiano2017-05-121-1/+1
| | | | | | | | | | This code was missing a check for stores, so we were thinking the congruency class didn't have any memory members, and reset the memory leader. Differential Revision: https://reviews.llvm.org/D33056 llvm-svn: 302905
* [PM/Unswitch] Teach the new simple loop unswitch to handle loopChandler Carruth2017-05-121-23/+138
| | | | | | | | | | | | | | | | | | | | | | | | invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867
* [SLP] Emit optimization remarks (Adam Nemet, 2017-05-11, 1 file, -6/+36)
    The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases, for example in HorizontalReduction::tryToReduce.
    ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce).
    We use the first instruction in the root bundle as the locator for the remark. In order to get a sense of how far the tree is spanning, I've included the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree.
    llvm-svn: 302811
* [LV] Refactor ILV.vectorize{Loop}() by introducing LVP.executePlan(); NFC (Ayal Zaks, 2017-05-11, 1 file, -80/+101)
    Introduce LoopVectorizationPlanner.executePlan(), replacing ILV.vectorize() and refactoring ILV.vectorizeLoop(). Method collectDeadInstructions() is moved from ILV to LVP. These changes facilitate building VPlans and using them to generate code, following https://reviews.llvm.org/D28975 and its tentative breakdown.
    Method ILV.createEmptyLoop() is renamed ILV.createVectorizedLoopSkeleton() to improve clarity; its contents remain intact.
    Differential Revision: https://reviews.llvm.org/D32200
    llvm-svn: 302790
* [msan] Fix PR32842 (Alexander Potapenko, 2017-05-11, 1 file, -2/+5)
    It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless. This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero). Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0.
    This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact.
    For the test case reported in PR32842, MSan with the patch generates slightly more efficient code:
      orq %rcx, %rax
      jne .LBB0_6
    instead of:
      orl %ecx, %eax
      testb $1, %al
      jne .LBB0_6
    llvm-svn: 302787
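The difference between the two shadow casts can be sketched outside of MSan. The functions below are hypothetical models, not MSan's `CreateShadowCast()`; they assume a set shadow bit marks an uninitialized (poisoned) bit and that both operands fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Assumed shadow convention: a set bit marks an uninitialized bit.

// Model of the old behaviour: OR the shadows, then truncate to one bit.
// Only the least significant shadow bit survives.
constexpr bool shadowByTruncation(uint64_t ShadowA, uint64_t ShadowB) {
  return ((ShadowA | ShadowB) & 1u) != 0;
}

// Model of the new behaviour: OR the shadows, then compare against zero.
// Any poisoned bit in either operand poisons the comparison result.
constexpr bool shadowByCompare(uint64_t ShadowA, uint64_t ShadowB) {
  return (ShadowA | ShadowB) != 0;
}

// Poison in bit 1 only: truncation loses it, the compare keeps it.
static_assert(!shadowByTruncation(0x2, 0x0), "truncation drops bit 1");
static_assert(shadowByCompare(0x2, 0x0), "compare keeps bit 1");
```

The two models agree whenever the lowest shadow bit is set or both shadows are fully clean; they differ exactly when the only poisoned bits sit above bit 0.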
* Remove spurious cast of nullptr. NFC.Serge Guelton2017-05-112-2/+2
| | | | | | Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780
* [InstCombine] remove fold that swaps xor/or with constants; NFCI (Sanjay Patel, 2017-05-10, 1 file, -12/+0)
    // (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2)
    This canonicalization was added at: https://reviews.llvm.org/rL7264
    By moving xors out/down, we can more easily combine constants. I'm adding tests that do not change with this patch, so we can verify that those kinds of transforms are still happening.
    This is no-functional-change-intended because there's a later fold:
    // (X^C)|Y -> (X|Y)^C iff Y&C == 0
    ...and demanded-bits appears to guarantee that any fold that would have hit the fold we're removing here would be caught by that 2nd fold.
    Similar reasoning was used in: https://reviews.llvm.org/rL299384
    The larger motivation for removing this code is that it could interfere with the fix for PR32706: https://bugs.llvm.org/show_bug.cgi?id=32706
    I.e., we're not checking if the 'xor' is actually a 'not', so we could reverse a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'.
    Differential Revision: https://reviews.llvm.org/D33050
    llvm-svn: 302733
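The identity behind the removed canonicalization, (X ^ C1) | C2 == (X | C2) ^ (C1 & ~C2), can be checked by brute force. The sketch below is only an illustration on 8-bit values with hypothetical helper names; it says nothing about how InstCombine implemented the fold.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the removed fold: (X ^ C1) | C2.
inline uint8_t beforeFold(uint8_t X, uint8_t C1, uint8_t C2) {
  return static_cast<uint8_t>((X ^ C1) | C2);
}

// Right-hand side of the removed fold: (X | C2) ^ (C1 & ~C2).
inline uint8_t afterFold(uint8_t X, uint8_t C1, uint8_t C2) {
  return static_cast<uint8_t>((X | C2) ^ (C1 & ~C2));
}

// Returns true if both forms agree for every 8-bit X, C1 and C2.
inline bool foldHoldsForAll8BitValues() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C1 = 0; C1 < 256; ++C1)
      for (unsigned C2 = 0; C2 < 256; ++C2)
        if (beforeFold(static_cast<uint8_t>(X), static_cast<uint8_t>(C1),
                       static_cast<uint8_t>(C2)) !=
            afterFold(static_cast<uint8_t>(X), static_cast<uint8_t>(C1),
                      static_cast<uint8_t>(C2)))
          return false;
  return true;
}
```

The reasoning per bit: where C2 has a 1, both sides produce 1; where C2 has a 0, both sides reduce to X ^ C1.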
* [NewGVN] Introduce a definesNoMemory() helper and use it.Davide Italiano2017-05-101-3/+5
| | | | | | | This is nice as is, but it will be used in my next patch to fix a bug. Suggested by Daniel Berlin. llvm-svn: 302714
* Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builderTeresa Johnson2017-05-101-1/+3
| | | | | | | | | | | | This fixes a ubsan bot failure after r302597, which made getProfileCount non-static, but ended up invoking it on a null ProfileSummaryInfo object in some cases from buildModuleSummaryIndex. Most testing passed because the non-static getProfileCount currently doesn't access any member variables, but I found this when testing a follow on patch (D32877) that adds a member variable access. llvm-svn: 302705
* [InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1Sanjay Patel2017-05-101-0/+10
| | | | | | | | | | | | | | This is another step towards favoring 'not' ops over random 'xor' in IR: https://bugs.llvm.org/show_bug.cgi?id=32706 This transformation may have occurred in longer IR sequences using computeKnownBits, but that could be much more expensive to calculate. As the scalar result shows, we do not currently favor 'not' in all cases. The 'not' created by the transform is transformed again (unnecessarily). Vectors don't have this problem because vectors are (wrongly) excluded from several other combines. llvm-svn: 302659
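The equivalence named in the commit title can be checked with plain integer arithmetic. The sketch below uses hypothetical helper names and models the arithmetic shift explicitly on unsigned values, so it does not rely on how the compiler treats `>>` on negative signed integers.

```cpp
#include <cassert>
#include <cstdint>

// Models: add (ashr (shl i32 X, 31), 31), 1
inline uint32_t shiftForm(uint32_t X) {
  uint32_t Shl = X << 31; // the low bit of X becomes the sign bit
  // An arithmetic shift right by 31 replicates the sign bit into every bit.
  uint32_t Ashr = (Shl & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return Ashr + 1u; // unsigned addition wraps: 0xFFFFFFFF + 1 == 0
}

// Models: and (not X), 1
inline uint32_t notForm(uint32_t X) { return ~X & 1u; }
```

Both forms depend only on the lowest bit of X: an even X yields 1 and an odd X yields 0.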
* Use explicit false instead of casted nullptr. NFC.Serge Guelton2017-05-101-2/+2
| | | | llvm-svn: 302656
* Revert r301950: SpeculativeExecution: Stop using whitelist for costsChandler Carruth2017-05-101-1/+42
| | | | | | | | | | This pass doesn't correctly handle testing for when it is legal to hoist arbitrary instructions. The whitelist happens to make it safe, so before it is removed the pass's legality checks will need to be enhanced. Details have been added to the code review thread for the patch. llvm-svn: 302640
* Add a late IR expansion pass for the experimental reduction intrinsics. (Amara Emerson, 2017-05-10, 1 file, -5/+4)
    This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence.
    Differential Revision: https://reviews.llvm.org/D32245
    llvm-svn: 302631
* [InstCombine] add helper function for add X, C folds; NFCISanjay Patel2017-05-101-34/+45
| | | | llvm-svn: 302605
* [ProfileSummary] Make getProfileCount a non-static member function.Easwaran Raman2017-05-093-11/+13
| | | | | | | | | | This change is required because the notion of count is different for sample profiling and getProfileCount will need to determine the underlying profile type. Differential revision: https://reviews.llvm.org/D33012 llvm-svn: 302597
* FunctionImport: Simplify function llvm::thinLTOInternalizeModule. NFCI.Peter Collingbourne2017-05-091-10/+5
| | | | llvm-svn: 302595
* [GVN] Fix a crash on encountering non-integral pointersKeno Fischer2017-05-091-0/+9
| | | | | | | | | | | | | | | | | | Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587
* [AArch64] Consider widening instructions in cost calculationsMatthew Simpson2017-05-091-4/+6
| | | | | | | | | | | | | | | The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582
* [InstCombine] clean up matchDeMorgansLaws(); NFCISanjay Patel2017-05-091-32/+13
| | | | | | | | | | | | | | The motivation for getting rid of dyn_castNotVal is to allow fixing: https://bugs.llvm.org/show_bug.cgi?id=32706 So this was supposed to be functional-change-intended for the case of inverting constants and applying DeMorgan. However, I can't find any cases where that pattern will actually get to matchDeMorgansLaws() because we have other folds in visitAnd/visitOr that do the same thing. So this ends up just being a clean-up patch with slight efficiency improvement, but no-functional-change-intended. llvm-svn: 302581
* [NewGVN] Simplify a DEBUG() statement. NFCI.Davide Italiano2017-05-091-2/+1
| | | | llvm-svn: 302579
* Make it illegal for two Functions to point to the same DISubprogramAdrian Prantl2017-05-092-44/+31
| | | | | | | | | | | | | | | | | | | As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 This reapplies r302469 with a fix for a bot failure (reparentDebugInfo now checks for the case the orig and new function are identical). llvm-svn: 302576
* NFC: refactor replaceDominatedUsesWithPiotr Padlewski2017-05-091-27/+26
| | | | | | | | | | | | | | | Summary: Since I will post patch with some changes to replaceDominatedUsesWith, it would be good to avoid duplicating code again. Reviewers: davide, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32798 llvm-svn: 302575
* Suppress all uses of LLVM_END_WITH_NULL. NFC.Serge Guelton2017-05-097-58/+48
| | | | | | | | | Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571
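The contrast between a `<cstdarg>` function with a null sentinel and a variadic template can be shown with a generic example. The function names below are hypothetical and are not taken from the patch; the sketch assumes C++11 or later.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>
#include <vector>

// Sentinel style: C varargs terminated by a null pointer. Nothing checks
// the argument types, and forgetting the trailing null is undefined behaviour.
inline std::vector<std::string> collectNamesVarargs(const char *First, ...) {
  std::vector<std::string> Names;
  va_list Args;
  va_start(Args, First);
  for (const char *S = First; S != nullptr; S = va_arg(Args, const char *))
    Names.emplace_back(S);
  va_end(Args);
  return Names;
}

// Variadic-template style: no sentinel, and every argument must be
// convertible to std::string or the call fails to compile.
template <typename... Ts>
std::vector<std::string> collectNames(const Ts &...Args) {
  return {std::string(Args)...};
}
```

A call such as `collectNames("x", "y")` needs no terminator, whereas `collectNamesVarargs("x", "y", static_cast<const char *>(nullptr))` silently depends on the caller supplying one.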
* [NewGVN] Explain why sorting by pointer values doesn't introduce non-determinism. (Davide Italiano, 2017-05-09, 1 file, -0/+4)
    Thanks to Eli for pointing this out in a post-commit review comment.
    llvm-svn: 302566
* [NewGVN] Fix a consistent order for phi nodes operands. (Davide Italiano, 2017-05-09, 1 file, -7/+19)
    The way we currently define congruency for two PHIExpression(s) is:
      1) The operands to the phi functions are congruent
      2) The PHIs are defined in the same BasicBlock.
    NewGVN works under the assumption that phi operands are in predecessor order, or at least in some consistent order. OTOH, the following is valid IR:
      patatino:
        %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ]
        %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ]
        br label %end
    and the in-memory representations of the two SSA registers have an inconsistent order. This violation of NewGVN assumptions results in two PHIs being found congruent when they're not. While we think it's useful to always have a consistent order enforced, let's fix this in NewGVN by sorting uses in predecessor order before creating a PHI expression.
    Differential Revision: https://reviews.llvm.org/D32990
    llvm-svn: 302552
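The effect of ordering phi operands before comparing them can be sketched with plain strings standing in for IR values and blocks. This is a toy model under stated assumptions, not NewGVN code: `Incoming` and `operandsInPredOrder` are hypothetical names, and the ordering key here is the index in an explicit predecessor list.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One phi input: (incoming value, predecessor block).
using Incoming = std::pair<std::string, std::string>;

// Returns the incoming values of a phi, reordered so that they follow the
// order of the blocks in PredOrder rather than the order they were written in.
inline std::vector<std::string>
operandsInPredOrder(std::vector<Incoming> Phi,
                    const std::vector<std::string> &PredOrder) {
  auto Rank = [&](const std::string &Block) {
    return std::find(PredOrder.begin(), PredOrder.end(), Block) -
           PredOrder.begin();
  };
  std::stable_sort(Phi.begin(), Phi.end(),
                   [&](const Incoming &A, const Incoming &B) {
                     return Rank(A.second) < Rank(B.second);
                   });
  std::vector<std::string> Values;
  for (const Incoming &In : Phi)
    Values.push_back(In.first);
  return Values;
}
```

With the `%meh` and `%banana` phis from the message, the value lists as written are both `%0, %conv1`, which is what makes them look congruent. After ordering by predecessor they become `%0, %conv1` and `%conv1, %0`, so they no longer compare equal.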
* NewGVN: Make all of symbolic evaluation logically const.Daniel Berlin2017-05-091-64/+74
| | | | llvm-svn: 302550
* [InstCombineCasts] Fix checks in sext->lshr->trunc pattern.Sanjay Patel2017-05-091-6/+14
| | | | | | | | | | | | | | | The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D32285 llvm-svn: 302548
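The corrected bound can be checked exhaustively on small widths. The sketch below assumes an i8 value sign-extended to i32, shifted right logically, and truncated back to i8; the widths and function names are illustrative choices, not taken from the patch. Under that assumption the sign extension adds 24 bits, so the rewrite to an arithmetic shift capped at 7 should hold for shift amounts up to 24.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Models: trunc (lshr (sext i8 A to i32), C) to i8
inline uint8_t originalForm(int8_t A, unsigned C) {
  assert(C < 32 && "shift amount must be smaller than the wide type");
  uint32_t Sext = static_cast<uint32_t>(static_cast<int32_t>(A));
  return static_cast<uint8_t>(Sext >> C);
}

// Models: ashr i8 A, min(C, 7)
inline uint8_t foldedForm(int8_t A, unsigned C) {
  unsigned Amt = std::min(C, 7u); // cap at the value size minus one
  uint8_t Bits = static_cast<uint8_t>(A);
  uint8_t Shifted = static_cast<uint8_t>(Bits >> Amt);
  if (A < 0) // fill the vacated high bits with copies of the sign bit
    Shifted |= static_cast<uint8_t>(0xFFu << (8 - Amt));
  return Shifted;
}
```

For a shift amount of 25 the two forms diverge: a zero bit shifted in by the logical shift reaches the truncated value, so `originalForm(-1, 25)` is 0x7F while `foldedForm(-1, 25)` is 0xFF.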
* Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram" (Hans Wennborg, 2017-05-09, 2 files, -31/+44)
    This caused PR32977.
    Original commit message:
    > Make it illegal for two Functions to point to the same DISubprogram
    >
    > As recently discussed on llvm-dev [1], this patch makes it illegal for
    > two Functions to point to the same DISubprogram and updates
    > FunctionCloner to also clone the debug info of a function to conform
    > to the new requirement. To simplify the implementation it also factors
    > out the creation of inlineAt locations from the Inliner into a
    > general-purpose utility in DILocation.
    >
    > [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
    > <rdar://problem/31926379>
    >
    > Differential Revision: https://reviews.llvm.org/D32975
    llvm-svn: 302533
* [LV] Fix insertion point for shuffle vectors in first order recurrenceAnna Thomas2017-05-091-2/+5
| | | | | | | | | | | | | | | | | | Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 llvm-svn: 302532
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-093-65/+222
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
* [InstNamer] Use range-forSanjoy Das2017-05-081-4/+3
| | | | llvm-svn: 302481
* [InstNamer] Don't check type of arguments (they're never void)Sanjoy Das2017-05-081-1/+1
| | | | llvm-svn: 302480
* Delete trailing whitespaceSanjoy Das2017-05-081-3/+3
| | | | llvm-svn: 302479
* Make it illegal for two Functions to point to the same DISubprogramAdrian Prantl2017-05-082-44/+31
| | | | | | | | | | | | | | | | As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302469
* [InstCombine] add folds for not-of-shift-rightSanjay Patel2017-05-081-15/+32
| | | | | | | | | | | | | | | This is another step towards getting rid of dyn_castNotVal, so we can recommit: https://reviews.llvm.org/rL300977 As the tests show, we were missing the lshr case for constants and both ashr/lshr vector splat folds. The ashr case with constant was being performed inefficiently in 2 steps. It's also possible there was a latent bug in that case because we can't do that fold if the constant is positive: http://rise4fun.com/Alive/Bge llvm-svn: 302465
* [PartialInlining] Capture by reference rather than by value.Davide Italiano2017-05-081-3/+3
| | | | llvm-svn: 302464
* [InstCombine] use local variable to reduce code duplication; NFCISanjay Patel2017-05-081-14/+11
| | | | llvm-svn: 302438
* [InstCombine/InstSimplify] add comments about code duplication; NFCSanjay Patel2017-05-081-0/+3
| | | | llvm-svn: 302436
* [ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide. (Craig Topper, 2017-05-07, 1 file, -1/+1)
    Previously SimplifyCFG used getSetSize, which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInts can only store 64 bits without a memory allocation, so this is inefficient.
    The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math.
    llvm-svn: 302385
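The idea of answering "more than N elements?" without widening can be sketched on a plain 64-bit half-open range. `Range64` and `isSizeLargerThan` below are hypothetical stand-ins, not the LLVM `ConstantRange` API. The sketch assumes a wrapping range [Lower, Upper) in which `Lower == Upper` denotes the full set, and it does not model the empty set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical half-open 64-bit range [Lower, Upper) that may wrap around.
// Assumption: Lower == Upper encodes the full set; the empty set is not modelled.
struct Range64 {
  uint64_t Lower;
  uint64_t Upper;
  bool isFullSet() const { return Lower == Upper; }
};

// True if the range holds more than MaxSize elements. The exact size of the
// full set (2^64) does not fit in 64 bits, so that case is answered directly
// instead of being computed.
inline bool isSizeLargerThan(const Range64 &R, uint64_t MaxSize) {
  if (R.isFullSet())
    return true; // 2^64 elements exceeds any uint64_t limit
  // Modular subtraction gives the element count even when the range wraps.
  return R.Upper - R.Lower > MaxSize;
}
```

For example, the range [0, 9) has nine elements and is larger than 8, [0, 8) is not, and the wrapping range [2^64 - 4, 5) also has nine elements.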