path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
* [PGO] Fix branch probability remarks assert | Rong Xu | 2018-03-27 | 2 | -0/+30
  Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option.
  Differential Revision: https://reviews.llvm.org/D44809
  llvm-svn: 328653
* [IRCE] Enable decreasing loops of non-const bound | Sam Parker | 2018-03-27 | 1 | -0/+180
  As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner:
  - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value.
  - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds.
  Differential Revision: https://reviews.llvm.org/D44776
  llvm-svn: 328613
* [SCEV] Make exact taken count calculation more optimistic | Max Kazantsev | 2018-03-27 | 2 | -7/+2
  Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count.
  This patch makes the calculation more optimistic and enforces our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like
    for (int i = 0; i < 100; i++) {
      if (i > 50) break;
      . . .
    }
  Differential Revision: https://reviews.llvm.org/D44676
  Reviewed By: fhahn
  llvm-svn: 328611
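The claim above can be checked on the example loop with a small Python sketch. This is not LLVM code: the function names are hypothetical, and the per-exit counts (100 for the `i < 100` exit, 51 for the `break`) are worked out by hand for this one loop.

```python
def backedge_taken_count_by_simulation(limit=100, break_at=50):
    """Run the example loop and count how often the backedge is taken."""
    taken = 0
    i = 0
    while i < limit:        # exit 1: the loop header test i < limit
        if i > break_at:    # exit 2: the early break
            break
        i += 1              # latch: increment and jump back to the header
        taken += 1
    return taken


def exact_count_from_exits(exit_counts):
    """Exact count as the minimum over exits that dominate the latch."""
    return min(exit_counts)
```

Taken alone, the header exit would allow 100 backedges and the `break` exit 51; the minimum agrees with what the simulated loop does.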
* [SCEV] Add one more case in computeConstantDifference | Max Kazantsev | 2018-03-27 | 1 | -0/+89
  This patch teaches `computeConstantDifference` to handle calculation of constant difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.
  Differential Revision: https://reviews.llvm.org/D43759
  Reviewed By: anna
  llvm-svn: 328609
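As a rough illustration only, the new case can be modelled in Python by treating each expression as a pair of a symbolic part and a constant offset. The `Expr` representation and the function name are assumptions of this sketch, not the SCEV data structures.

```python
from typing import Optional, Tuple

# Hypothetical model: (symbolic part X, constant offset C) stands for X + C.
Expr = Tuple[str, int]


def constant_difference(more: Expr, less: Expr) -> Optional[int]:
    """Return more - less when both share the same symbolic part, else None."""
    x_less, c1 = less
    x_more, c2 = more
    if x_less != x_more:
        return None  # the difference is not a compile-time constant
    return c2 - c1
```

For `(X + 3)` and `(X + 7)` this yields 4; for expressions over different symbols it gives up.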
* [MemorySSA] Fix exponential compile-time updating MemorySSA. | Eli Friedman | 2018-03-26 | 1 | -0/+54
  MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm: for each block, it computes the previous definition for each predecessor, then takes those definitions and combines them. But currently it doesn't remember results which it already computed; this means it can visit the same block multiple times, which adds up to exponential time overall.
  To fix this, this patch adds a cache. If we computed the result for a block already, we don't need to visit it again because we'll come up with the same result. Well, unless we RAUW a MemoryPHI; in that case, the TrackingVH will be updated automatically.
  This matches the original source paper for this algorithm.
  The testcase isn't really a test for the bug, but it adds coverage for the case where tryRemoveTrivialPhi erases an existing PHI node. (It's hard to write a good regression test for a performance issue.)
  Differential Revision: https://reviews.llvm.org/D44715
  llvm-svn: 328577
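The blow-up described above can be reproduced with a toy predecessor walk in Python. This sketch only counts block visits on a hypothetical chain of diamonds; it does not model MemorySSA definitions, MemoryPHIs or TrackingVH.

```python
def build_diamond_chain(n):
    """Predecessor map for n stacked diamonds (fork -> left/right -> join)."""
    preds = {"entry": []}
    prev = "entry"
    for k in range(n):
        left, right, join = f"l{k}", f"r{k}", f"j{k}"
        preds[left] = [prev]
        preds[right] = [prev]
        preds[join] = [left, right]
        prev = join
    return preds, prev


def count_visits(preds, block, cache=None):
    """Recursively walk predecessors and return the number of block visits."""
    if cache is not None:
        if block in cache:
            return 0          # result already known, no new visit
        cache.add(block)
    return 1 + sum(count_visits(preds, p, cache) for p in preds[block])
```

With ten diamonds the uncached walk makes 4093 visits, while passing a `set()` as the cache visits each of the 31 blocks once.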
* [SLP] Add more checks to a test case. NFC. | Haicheng Wu | 2018-03-26 | 1 | -0/+14
  llvm-svn: 328572
* [SLP] Add a test case. NFC. | Haicheng Wu | 2018-03-26 | 1 | -0/+38
  llvm-svn: 328546
* [InstCombine] reassociate loop invariant GEP chains to enable LICM | Sebastian Pop | 2018-03-26 | 2 | -2/+92
  This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib.
    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
  In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain.
  This patch will produce the following output:
    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
  llvm-svn: 328539
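The reordering relies on the offsets in the chain being plain additions. A minimal Python sketch of the two orderings, using integers in place of pointers and ignoring `inbounds` and overflow semantics (both simplifications of this sketch), is:

```python
def addresses_original(win, idx_values, idx_ext1):
    """Original chain: the varying offset is applied first, so nothing hoists."""
    out = []
    for idx_ext in idx_values:
        add_ptr = win + idx_ext          # depends on the induction variable
        add_ptr2 = add_ptr + idx_ext1    # invariant offset, stuck in the loop
        out.append(add_ptr2 - 1)
    return out


def addresses_reassociated(win, idx_values, idx_ext1):
    """Reassociated chain: the invariant part is computed once, before the loop."""
    hoisted = win + idx_ext1 - 1         # what LICM can move out of the loop
    return [hoisted + idx_ext for idx_ext in idx_values]
```

Both functions return the same addresses; the second performs one addition per iteration instead of three.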
* [InstCombine] distribute fmul over fadd/fsub | Sanjay Patel | 2018-03-26 | 2 | -13/+11
  This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected.
  We still need to loosen the FMF check.
  llvm-svn: 328502
* [InstCombine] check uses before creating instructions for fmul distribution | Sanjay Patel | 2018-03-26 | 1 | -12/+3
  As the tests show, we could create extra instructions without any obvious benefit.
  llvm-svn: 328498
* [LoopUnroll] Fix dangling pointers in SCEV | Max Kazantsev | 2018-03-26 | 1 | -0/+51
  Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on the fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in current implementation of backedge taken count calculation but is wrong in general. As a result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references to blocks of the inner loop that could have been changed or even deleted.
  The attached test demonstrates a situation when, after unrolling of the innermost loop, the outermost loop contains a dangling pointer to a non-existent block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch; it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up.
  This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of the outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers to removed blocks, or just outdated information that has changed after unrolling.
  Differential Revision: https://reviews.llvm.org/D44818
  Reviewed By: samparker
  llvm-svn: 328483
* [DeadArgElim] Strip allocsize attributes when deleting an argument. | Benjamin Kramer | 2018-03-26 | 1 | -0/+13
  Since allocsize refers to the argument number, it gets invalidated when an argument is removed and the numbers shift.
  llvm-svn: 328481
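Why a positional attribute goes stale can be shown with a toy Python model. The parameter names and the helper below are hypothetical; the sketch only illustrates the index shift, not the DeadArgElim implementation.

```python
def allocsize_target_before_and_after(params, allocsize_index, dead_index):
    """Return the parameter the index names before and after removing one argument."""
    new_params = [p for i, p in enumerate(params) if i != dead_index]
    before = params[allocsize_index]
    # The same numeric index now names a different parameter, or none at all.
    after = new_params[allocsize_index] if allocsize_index < len(new_params) else None
    return before, after
```

For `["unused", "size", "flags"]` with the attribute on index 1, removing argument 0 makes index 1 point at `flags`, which is why dropping the attribute is the safe choice.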
* [IRCE] Enable increasing loops of variable bounds | Sam Parker | 2018-03-26 | 2 | -6/+176
  CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation:
    for (int i = 0; i < N; ++i)
      ...

    if (N > 0)
      for (int i = 0; i < N; ++i)
        ...
  So I've added 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value, which then allows loops with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax with another use of isLoopEntryGuardedByCond.
  llvm-svn: 328480
* [PatternMatch] allow undef elements when matching vector FP +0.0 | Sanjay Patel | 2018-03-25 | 6 | -14/+8
  This continues the FP constant pattern matching improvements from:
  https://reviews.llvm.org/rL327627
  https://reviews.llvm.org/rL327339
  https://reviews.llvm.org/rL327307
  Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer.
  llvm-svn: 328461
* [InstSimplify, InstCombine] add/update tests with FP +0.0 vector with undef; NFC | Sanjay Patel | 2018-03-25 | 8 | -364/+423
  llvm-svn: 328455
* [InstCombine] adjust test comments; NFC | Sanjay Patel | 2018-03-25 | 1 | -9/+6
  llvm-svn: 328450
* [InstCombine] consolidate casted icmp vector tests | Sanjay Patel | 2018-03-25 | 1 | -660/+43
  We have thorough coverage of predicates and scalar types, so we just need a sampling of vector tests to show that things are working or not with vector types.
  llvm-svn: 328449
* [InstCombine] peek through more icmp of FP cast + bitcast | Sanjay Patel | 2018-03-25 | 1 | -135/+45
  This is an extension of rL328426 as noted in D44367.
  llvm-svn: 328448
* [InstCombine] peek through FP casts for sign-bit compares (PR36682) | Sanjay Patel | 2018-03-24 | 2 | -107/+27
  This pattern came up in PR36682:
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://godbolt.org/g/LhuD9A
  Equality checks are planned as a follow-up enhancement.
  Differential Revision: https://reviews.llvm.org/D44367
  llvm-svn: 328426
* [InstCombine] add multi-use/vector tests for intrinsic shrinking; NFC | Sanjay Patel | 2018-03-24 | 1 | -29/+155
  llvm-svn: 328422
* [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass | Fedor Sergeev | 2018-03-23 | 30 | -54/+167
  Summary:
  This was motivated by absence of PruneEH functionality in new PM. It was decided that a proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.
  This change generalizes attribute handling implemented for (a removal of) Convergent attribute, by introducing a generic builder-like class AttributeInferer.
  It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions for not breaking attribute assumptions.
  The main idea is that as soon as all the instructions from all the functions of SCC Node conform to attribute assumptions then we are free to infer the attribute as set for all the functions of SCC Node.
  It handles two distinct cases of attributes:
  - those that might break due to derefinement of the function code: for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example - NoUnwind.
  - those that do not care about derefinement: for these attributes we are allowed to apply inference as soon as we see any function definition. Example - removal of Convergent attribute.
  Also in this commit:
  * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them
  * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME.
  * PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended
  * some of "other" tests were updated since function-attrs now infers 'nounwind' even for old PM pipeline
  * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem
  Reviewers: chandlerc, jlebar
  Reviewed By: jlebar
  Subscribers: eraman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44415
  llvm-svn: 328377
* [InstCombine] increase test coverage for intrinsic shrinking; NFC | Sanjay Patel | 2018-03-23 | 1 | -48/+48
  There were no tests with vector types before this.
  llvm-svn: 328371
* [InstCombine] auto-generate checks; NFC | Sanjay Patel | 2018-03-23 | 2 | -16/+48
  llvm-svn: 328329
* [InstSimplify] regenerate checks, move tests; NFC | Sanjay Patel | 2018-03-23 | 2 | -34/+43
  llvm-svn: 328327
* [InstCombine] regenerate test checks; NFC | Sanjay Patel | 2018-03-23 | 1 | -10/+17
  llvm-svn: 328325
* [SLP] Stop counting cost of gather sequences with multiple uses | Matthew Simpson | 2018-03-23 | 2 | -34/+21
  When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model.
  Differential Revision: https://reviews.llvm.org/D44742
  llvm-svn: 328316
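The double counting can be illustrated with a toy cost sum in Python. The entry layout `(kind, operands, cost)` and the cost values are invented for this sketch and are not the SLP vectorizer's data structures.

```python
def tree_cost(entries, dedupe_gathers):
    """Sum entry costs, optionally counting identical gather sequences once."""
    seen_gathers = set()
    total = 0
    for kind, operands, cost in entries:
        if kind == "gather" and dedupe_gathers:
            if operands in seen_gathers:
                continue              # same sequence already paid for
            seen_gathers.add(operands)
        total += cost
    return total


# Hypothetical tree: one profitable vectorized entry, one gather used twice.
EXAMPLE_TREE = [
    ("vectorize", ("a", "b"), -2),
    ("gather", ("x", "y"), 2),
    ("gather", ("x", "y"), 2),
]
```

Counting the gather per use gives a total of 2, while counting it per definition gives 0, which can change whether the tree looks worth vectorizing.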
* Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -9/+13
  Reverted for now, due to it causing verifier failures.
  llvm-svn: 328312
* [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -13/+9
  For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values.
  Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore.
  Reviewers: dberlin, mssimpso, davide, efriedma
  Reviewed By: mssimpso
  Differential Revision: https://reviews.llvm.org/D43762
  llvm-svn: 328307
* [LoopUnroll] Simplify induction variables after peeling too. | Florian Hahn | 2018-03-23 | 1 | -15/+10
  Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too.
  Reviewers: sanjoy, bogner, mzolotukhin, efriedma
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D43878
  llvm-svn: 328301
* Revert r325687 (workaround for PR36032). | Evgeny Stupachenko | 2018-03-22 | 4 | -5/+5
  Summary:
  Revert r325687 workaround for PR36032 since a fix was committed in r326154.
  Reviewers: sbaranga
  Differential Revision: http://reviews.llvm.org/D44768
  From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com>
  llvm-svn: 328257
* [SimplifyCFG] Create attribute for fuzzing-specific optimizations. | Matt Morehouse | 2018-03-22 | 1 | -0/+49
  Summary:
  When building with libFuzzer, converting control flow to selects or obscuring the original operands of CMPs reduces the effectiveness of libFuzzer's heuristics. This patch provides an attribute to disable or modify certain optimizations for optimal fuzzing signal.
  Provides a less aggressive alternative to https://reviews.llvm.org/D44057.
  Reviewers: vitalybuka, davide, arsenm, hfinkel
  Reviewed By: vitalybuka
  Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc
  Differential Revision: https://reviews.llvm.org/D44232
  llvm-svn: 328214
* [LoopPredication] Add profitability check based on BPI | Anna Thomas | 2018-03-22 | 1 | -0/+120
  Summary:
  LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize.
  However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit.
  We teach LP about this using the BranchProbabilityInfo pass.
  Reviewers: apilipenko, skatkov, mkazantsev, reames
  Reviewed by: skatkov
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44667
  llvm-svn: 328210
* [InstCombine] add folds for xor-of-icmp signbit tests (PR36682) | Sanjay Patel | 2018-03-22 | 2 | -25/+56
  This is a retry of r328119 which was reverted at r328145 because it could crash by trying to combine icmps with different operand types. This version has a check for that and additional tests.
  Original commit message:
  This is part of solving:
  https://bugs.llvm.org/show_bug.cgi?id=36682
  There's also a leftover improvement from the long-ago-closed:
  https://bugs.llvm.org/show_bug.cgi?id=5438
  https://rise4fun.com/Alive/dC1
  llvm-svn: 328197
* [InstCombine] move/add tests for fmul distribution; NFC | Sanjay Patel | 2018-03-21 | 2 | -79/+185
  There are at least 3 problems:
  1. We're distributing across large patterns, but fail to do that for the minimal patterns.
  2. We're not checking uses, so we may create more instructions than we eliminate.
  3. We should be able to do these transforms with less than full 'fast' fmuls.
  llvm-svn: 328152
* Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)" | Reid Kleckner | 2018-03-21 | 2 | -24/+29
  This asserts when compiling safe_numerics_unittest.cpp in Chromium with MSan.
  llvm-svn: 328145
* [InstSimplify] fp_binop X, NaN --> NaN | Sanjay Patel | 2018-03-21 | 1 | -40/+20
  We propagate the existing NaN value when possible.
  Differential Revision: https://reviews.llvm.org/D44521
  llvm-svn: 328140
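A rough Python analogue of the fold follows. The function name is hypothetical, and checking both operands is an assumption of this sketch; it only shows that a NaN operand determines the result of the usual binary FP operations, so the existing NaN can be returned without computing anything.

```python
import math


def simplify_fp_binop(x, y):
    """Return the existing NaN operand if there is one, else None (no fold)."""
    if math.isnan(y):
        return y
    if math.isnan(x):
        return x
    return None
```

For example, `1.0 + nan`, `1.0 - nan`, `1.0 * nan` and `1.0 / nan` all evaluate to NaN in Python floats as well, matching what the fold returns.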
* [SLP] Add test case for a gather sequence with multiple uses | Matthew Simpson | 2018-03-21 | 1 | -0/+66
  llvm-svn: 328133
* [InstCombine] add folds for xor-of-icmp signbit tests (PR36682) | Sanjay Patel | 2018-03-21 | 2 | -29/+24
  This is part of solving:
  https://bugs.llvm.org/show_bug.cgi?id=36682
  There's also a leftover improvement from the long-ago-closed:
  https://bugs.llvm.org/show_bug.cgi?id=5438
  https://rise4fun.com/Alive/dC1
  llvm-svn: 328119
* [InstCombine] move/add tests for xor-of-icmps (PR36682); NFC | Sanjay Patel | 2018-03-21 | 2 | -21/+128
  llvm-svn: 328109
* [MemCpyOpt] Update to new API for memory intrinsic alignment | Daniel Neilson | 2018-03-21 | 2 | -0/+42
  Summary:
  This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using:
  1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API.
  2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently.
  We also add a few tests to fill gaps in the testing of this pass.
  Steps:
  Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 )
  Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
  Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 )
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reference
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  llvm-svn: 328097
* Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. | Justin Lebar | 2018-03-21 | 2 | -0/+196
  Summary:
  If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem.
  Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise.
  Original Differential Revision: https://reviews.llvm.org/D44102
  llvm-svn: 328096
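The narrowing rule can be sketched in Python as follows. The function names are hypothetical, and the 8-bit lower bound is an assumption of this sketch rather than something stated in the commit message; the operand maxima stand in for whatever range information the pass has proved.

```python
def narrowed_width(max_lhs, max_rhs, original_width, floor=8):
    """Smallest power-of-two width (>= floor) that holds both operand ranges."""
    needed = max(max_lhs.bit_length(), max_rhs.bit_length())
    width = floor
    while width < needed:
        width *= 2
    # Only narrow when the result is really smaller than the original type.
    return width if width < original_width else original_width


def udiv_in_width(a, b, width):
    """Unsigned division carried out on values truncated to `width` bits."""
    mask = (1 << width) - 1
    return ((a & mask) // (b & mask)) & mask
```

If both operands of a 64-bit udiv are known to be at most 200, the division can be done in 8 bits and still produce the same quotient.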
* [ObjCARC] Add funclet token to ARC marker | Shoaib Meenai | 2018-03-20 | 1 | -0/+61
  The inline assembly generated for the ARC autorelease elision marker must have a funclet token if it's emitted inside a funclet, otherwise the inline assembly (and all subsequent code in the funclet) will be marked unreachable by WinEHPrepare.
  Note that this only applies for the non-O0 case, since at O0, clang emits the autorelease elision marker itself rather than deferring to the backend. The fix for clang is handled in a separate change.
  Differential Revision: https://reviews.llvm.org/D44641
  llvm-svn: 328042
* [MergeICmp] Fix a bug in entry block shuffled to middle of the chain | Xin Tong | 2018-03-20 | 2 | -1/+58
  Summary:
  Fix a bug in entry block shuffled to middle of the chain.
  Reviewers: davide, courbet
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44642
  llvm-svn: 327971
* [LV] Let recordVectorLoopValueForInductionCast check if IV was created from the cast. | Andrei Elovikov | 2018-03-20 | 1 | -0/+39
  Summary:
  It turned out to be error-prone to expect the callers to handle that; it is better to leave the decision to this routine and have the required data passed explicitly to the function.
  This handles the case that was missed in r322473 and fixes the assert mentioned in PR36524.
  Reviewers: dorit, mssimpso, Ayal, dcaballe
  Reviewed By: dcaballe
  Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43812
  llvm-svn: 327960
* [InstCombine] canonicalize fcmp+select to fabs | Sanjay Patel | 2018-03-19 | 1 | -68/+60
  This is complicated by -0.0 and nan. This is based on the DAG patterns as shown in D44091. I'm hoping that we can just remove those DAG folds and always rely on IR canonicalization to handle the matching to fabs.
  We would still need to delete the broken code from DAGCombiner to fix PR36600:
  https://bugs.llvm.org/show_bug.cgi?id=36600
  Differential Revision: https://reviews.llvm.org/D44550
  llvm-svn: 327858
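The -0.0 complication can be seen directly with Python floats. The `select_abs` helper below is a hypothetical stand-in for one compare-and-select form of absolute value; it is not taken from the patch.

```python
import math


def select_abs(x):
    """Compare-and-select absolute value: (x < 0.0) ? -x : x."""
    return -x if x < 0.0 else x


def sign_bit_set(x):
    """True if the IEEE sign bit of x is set (distinguishes -0.0 from 0.0)."""
    return math.copysign(1.0, x) < 0.0
```

Because `-0.0 < 0.0` is false, `select_abs(-0.0)` returns -0.0 with its sign bit still set, while `math.fabs(-0.0)` returns +0.0. The two only agree when the sign of zero is allowed to be ignored.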
* [SCEV] Re-land: Fix isKnownPredicate | Serguei Katkov | 2018-03-19 | 1 | -0/+35
  This is a re-land of https://reviews.llvm.org/rL327362 with a fix and regression test.
  The crash occurred because, for the found MDL loop, LHS or RHS may contain an invariant unknown SCEV which does not dominate the MDL. Please see the regression test for an example.
  Reviewers: sanjoy, mkazantsev, reames
  Reviewed By: mkazantsev
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44553
  llvm-svn: 327822
* [LICM] Salvage DI from dying Instructions | Anastasis Grammenos | 2018-03-18 | 1 | -0/+4
  LICM deletes trivially dead instructions which it won't attempt to sink. Attempt to salvage debug values which reference these instructions.
  llvm-svn: 327800
* [InstCombine] peek through unsigned FP casts for zero-equality compares (PR36682) | Roman Lebedev | 2018-03-18 | 1 | -38/+14
  Summary:
  This pattern came up in PR36682 / D44390
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://reviews.llvm.org/D44390
  https://godbolt.org/g/oKvT5H
  See also D44416
  Reviewers: spatel, majnemer, efriedma, arsenm
  Reviewed By: spatel
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44424
  llvm-svn: 327799
* [InstCombine] add nnan requirement for sqrt(x) * sqrt(y) -> sqrt(x*y) | Sanjay Patel | 2018-03-18 | 1 | -9/+25
  This is similar to D43765.
  llvm-svn: 327797
* [InstSimplify] loosen FMF for sqrt(X) * sqrt(X) --> X | Sanjay Patel | 2018-03-18 | 1 | -2/+37
  As shown in the code comment, we don't need all of 'fast', but we do need reassoc + nsz + nnan.
  Differential Revision: https://reviews.llvm.org/D43765
  llvm-svn: 327796
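Why flags are needed at all can be checked with ordinary Python floats. The helper names are hypothetical, `ieee_sqrt` returns NaN for negative inputs (as an IEEE-754 sqrt does) instead of raising like `math.sqrt`, and mapping each case to a specific flag is this sketch's reading of the message, not a quote from the code comment.

```python
import math


def ieee_sqrt(x):
    """sqrt with IEEE-754 style results: NaN for negative inputs."""
    if x < 0.0:
        return math.nan
    return math.sqrt(x)


def unsimplified(x):
    """The original expression sqrt(X) * sqrt(X), before folding to X."""
    return ieee_sqrt(x) * ieee_sqrt(x)
```

Three inputs show where `sqrt(X) * sqrt(X)` and `X` disagree without relaxed semantics: `-4.0` gives NaN rather than -4.0 (the nnan case), `-0.0` gives +0.0 rather than -0.0 (the nsz case), and `2.0` gives a value one rounding step away from 2.0, which is presumably what the reassoc requirement covers.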