path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
...
* [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass. | Whitney Tsang | 2020-01-09 | 4 | -0/+4
| Summary: This patch changes LoopUnrollAndJamPass to a function pass and keeps the loop traversal order the same as defined in FunctionToLoopPassAdaptor (LoopPassManager.h).
|
| The next patch will change the loop traversal to outer-to-inner order, so that more loops can be transformed.
|
| Discussion on the llvm-dev mailing list: https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
|
| Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
| Reviewed By: dmgreen
| Subscribers: hiraditya, zzheng, llvm-commits
| Tag: LLVM
| Differential Revision: https://reviews.llvm.org/D72230
* [InstCombine] Z / (1.0 / Y) => (Y * Z) | @raghesh (Raghesh Aloor) | 2020-01-09 | 1 | -2/+2
| This is a special case of Z / (X / Y) => (Y * Z) / X, with X = 1.0.
|
| The m_OneUse check is avoided because, even when 1.0/Y has multiple uses, the number of instructions remains the same and a division is replaced by a multiplication.
|
| Differential Revision: https://reviews.llvm.org/D72319
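Note: the rewrite above is exact over the reals, but the two forms round differently in IEEE arithmetic. The following is a minimal Python sketch (Python floats are IEEE doubles) that compares both forms on a few sample values. The helper names are illustrative and are not part of the patch.

```python
import math

def div_by_reciprocal(z, y):
    # Original form: two divisions, each rounded separately.
    return z / (1.0 / y)

def mul_form(z, y):
    # Folded form: a single multiplication.
    return y * z

samples = [(1.0, 3.0), (2.5, 7.0), (-4.0, 0.1), (1e10, 1e-3)]
for z, y in samples:
    a, b = div_by_reciprocal(z, y), mul_form(z, y)
    # The two results agree to within rounding error; they are not
    # guaranteed to be bit-identical.
    assert math.isclose(a, b, rel_tol=1e-12)
```

The comparison uses a relative tolerance rather than equality because the reciprocal is rounded before the second division.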
* [InstSimplify] select Cond, true, false --> Cond | Sanjay Patel | 2020-01-09 | 1 | -6/+3
| This is step 1 of damage control assuming that we need to remove several over-reaching folds for select-of-booleans because they can cause miscompiles as shown in D72396.
|
| The scalar case seems obviously safe: https://rise4fun.com/Alive/jSj
|
| And I don't think there's any danger for vectors either - if the condition is poisoned, then the select must be poisoned too, so undef elements don't make any difference.
|
| Differential Revision: https://reviews.llvm.org/D72412
* [ARM][MVE] MVE-I should not be disabled by -mfpu=none | Momchil Velikov | 2020-01-09 | 2 | -2/+2
| Architecturally, it's allowed to have MVE-I without an FPU, thus -mfpu=none should not disable MVE-I or moves to/from FP registers.
|
| This patch removes `+/-fpregs` from the features unconditionally added to the target feature list depending on the FPU, and moves the logic to the Clang driver, where the negative form (`-fpregs`) is conditionally added to the target features list for the cases of `-mfloat-abi=soft`, or `-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative form is added by the driver; the positive one is derived from other features in the backend.
|
| Differential Revision: https://reviews.llvm.org/D71843
* [InstCombine] Use minimal FMF in testcase for Z / (1.0 / Y) => (Y * Z); NFC | Sanjay Patel | 2020-01-09 | 1 | -2/+2
| Patch by: @raghesh (Raghesh Aloor)
|
| Differential Revision: https://reviews.llvm.org/D72431
* [ARM][MVE] Don't unroll intrinsic loops. | Sam Parker | 2020-01-09 | 1 | -0/+49
| We don't unroll vector loops for MVE targets, but we miss the case when loops only contain intrinsic calls. So just move the logic a bit to catch this case.
|
| Differential Revision: https://reviews.llvm.org/D72440
* [Matrix] Update shape propagation to iterate until done. | Florian Hahn | 2020-01-09 | 1 | -0/+84
| This patch updates the shape propagation to iterate until no new shape information is discovered.
|
| As initial seed for the forward propagation, we use the matrix intrinsic instructions. Both propagateShapeForward and propagateShapeBackward return new work lists, with the instructions to be used for the next iteration. When propagating forward, we record all instructions we added new shape information for. When propagating backward, we record all users of instructions we added new shape information for.
|
| Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
| Reviewed By: anemet
| Differential Revision: https://reviews.llvm.org/D70901
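Note: the iterate-until-done scheme described above can be sketched with a toy model. This is not the LowerMatrixIntrinsics code. It assumes every operation is elementwise, so a result and all of its operands share one shape, and names such as `propagate_shapes` are hypothetical.

```python
def propagate_shapes(operands, seed_shapes):
    """operands: instruction name -> list of operand names.
    seed_shapes: shapes of the seed instructions (standing in for
    the matrix intrinsics)."""
    shapes = dict(seed_shapes)
    users = {}
    for inst, ops in operands.items():
        for op in ops:
            users.setdefault(op, []).append(inst)

    def forward(worklist):
        # Returns the instructions that received new shape information.
        newly_shaped, work = [], list(worklist)
        while work:
            inst = work.pop()
            if inst not in shapes:
                known = [shapes[o] for o in operands.get(inst, []) if o in shapes]
                if not known:
                    continue
                shapes[inst] = known[0]
                newly_shaped.append(inst)
            work.extend(u for u in users.get(inst, []) if u not in shapes)
        return newly_shaped

    def backward(worklist):
        # Returns the users of operands that received new shape information.
        next_forward, work = [], list(worklist)
        while work:
            inst = work.pop()
            for op in operands.get(inst, []):
                if op not in shapes:
                    shapes[op] = shapes[inst]
                    work.append(op)
                    next_forward.extend(users.get(op, []))
        return next_forward

    worklist = list(seed_shapes)
    while worklist:  # stop once no new shape information is discovered
        worklist = backward(forward(worklist))
    return shapes

toy = {"m": [], "x": [], "y": [], "add": ["m", "x"], "sub": ["x", "y"]}
result = propagate_shapes(toy, {"m": (2, 3)})
```

In the toy graph, the shape of `m` reaches `add` going forward, `x` going backward, then `sub` and finally `y` on the next round, which is why a single forward pass would not be enough.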
* [Matrix] Propagate and use shape information for loads. | Florian Hahn | 2020-01-09 | 2 | -123/+140
| This patch extends the shape propagation to also include load instructions and implements shape-aware lowering for vector loads.
|
| Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
| Reviewed By: anemet
| Differential Revision: https://reviews.llvm.org/D70900
* [Matrix] Implement back-propagation of shape information. | Florian Hahn | 2020-01-09 | 2 | -0/+222
| This patch extends the shape propagation for matrix operations to also propagate the shape of instructions to their operands.
|
| Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
| Reviewed By: anemet
| Differential Revision: https://reviews.llvm.org/D70899
* [LV] Still vectorise when tail-folding can't find a primary inducation variable | Sjoerd Meijer | 2020-01-09 | 2 | -0/+89
| This addresses a vectorisation regression for tail-folded loops that are counting down, e.g. loops as simple as this:
|
|   void foo(char *A, char *B, char *C, uint32_t N) {
|     while (N > 0) {
|       *C++ = *A++ + *B++;
|       N--;
|     }
|   }
|
| These are loops that can be vectorised, but when tail-folding is requested, it can't find a primary induction variable, which we do need for predicating the loop. As a result, the loop isn't vectorised at all, even though it can be vectorised when tail-folding is not attempted.
|
| So, this adds a check for the primary induction variable where we decide how to lower the scalar epilogue. I.e., when there isn't a primary induction variable, a scalar epilogue loop is allowed (i.e. don't request tail-folding) so that vectorisation can still be triggered.
|
| Having this check for the primary induction variable makes sense anyway, and in addition, in a follow-up of this I will look into discovering the primary induction variable earlier for counting-down loops, so that these can also be tail-folded.
|
| Differential revision: https://reviews.llvm.org/D72324
* [Attributor][FIX] Carefully change invokes to calls (after manifest) | Johannes Doerfert | 2020-01-08 | 8 | -26/+99
| Before, we manually inserted unreachable early, but that could lead to broken PHI nodes. Now we use the existing late modification functionality.
* [Attributor][FIX] Avoid dangling value pointers during code modification | Johannes Doerfert | 2020-01-08 | 1 | -0/+10
| When we replace instructions with unreachable we delete instructions. We now avoid dangling pointers to those deleted instructions in the `ToBeChangedToUnreachableInsts` set. Other modification collections might need to be updated in the future as well.
* Revert "[JumpThreading] Thread jumps through two basic blocks" | Kazu Hirata | 2020-01-08 | 2 | -115/+0
| It looks like my patch breaks the sanitizer-windows build:
|
| http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324
|
| This reverts commit ead815924e6ebeaf02c31c37ebf7a560b5fdf67b.
* [InstSimplify] add tests for select of true/false; NFC | Sanjay Patel | 2020-01-08 | 1 | -51/+80
|
* [InstCombine] Adding testcase for Z / (1.0 / Y) => (Y * Z); NFC | Sanjay Patel | 2020-01-08 | 1 | -0/+15
| The added testcase shows the current transformation for the operation Z / (1.0 / Y), which remains unchanged. This will be updated to align with the transformed code (Y * Z) with D72319. The existing transformation Z / (X / Y) => (Y * Z) / X is not handling this case as there are multiple uses for (1.0 / Y) in this testcase.
|
| Patch by: @raghesh (Raghesh Aloor)
|
| Differential Revision: https://reviews.llvm.org/D72388
* [JumpThreading] Thread jumps through two basic blocks | Kazu Hirata | 2020-01-08 | 2 | -0/+115
| Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like:
|
|   bb3:
|     %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
|     %tobool = icmp eq i32 %cond, 0
|     br i1 %tobool, label %bb4, label ...
|
|   bb4:
|     %cmp = icmp eq i32* %var, null
|     br i1 %cmp, label bb5, label bb6
|
| by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:
|
|   bb3:
|     %var = phi i32* [ @a, %bb2 ]
|     %tobool = icmp eq i32 %cond, 0
|     br i1 %tobool, label %bb4, label ...
|
|   bb3.dup:
|     %var = phi i32* [ null, %bb1 ]
|     %tobool = icmp eq i32 %cond, 0
|     br i1 %tobool, label %bb4, label ...
|
|   bb4:
|     %cmp = icmp eq i32* %var, null
|     br i1 %cmp, label bb5, label bb6
|
| Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.
|
| Reviewers: wmi
| Subscribers: hiraditya, jfb, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D70247
* Revert "[InstCombine] fold zext of masked bit set/clear" | Kadir Cetinkaya | 2020-01-08 | 1 | -37/+28
| This reverts commit a041c4ec6f7aa659b235cb67e9231a05e0a33b7d.
|
| This looks like a non-trivial change and there have been no code reviews (at least there were no Phabricator revisions attached to the commit description). It is also causing a regression in one of our downstream integration tests; we haven't been able to come up with a minimal reproducer yet.
* [SCEV] get more accurate range for AddExpr with wrap flag. | czhengsz | 2020-01-07 | 1 | -6/+2
| Reviewed By: nikic
|
| Differential Revision: https://reviews.llvm.org/D64869
* [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats | Philip Reames | 2020-01-07 | 1 | -0/+69
| Factor out common logic into some reasonably commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't.
|
| Differential Revision: https://reviews.llvm.org/D67126
* [InstCombine] try to pull 'not' of select into compare operands | Sanjay Patel | 2020-01-07 | 1 | -4/+15
| not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?)) --> select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?)
|
| If both sides of the select are cmps, we can remove an instruction. The case where only one side is a cmp is deferred to a possible follow-on patch.
|
| We have a more general 'isFreeToInvert' analysis, but I'm not seeing a way to use that more widely without inducing infinite looping (opposing transforms). Here, we flip the compare predicates directly, so we should not have any danger by creating extra intermediate 'not' ops.
|
| Alive proofs: https://rise4fun.com/Alive/jKa
|
| Name: both select values are compares - invert predicates
|   %tcmp = icmp sle i32 %x, %y
|   %fcmp = icmp ugt i32 %z, %w
|   %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
|   %not = xor i1 %sel, true
| =>
|   %tcmp_not = icmp sgt i32 %x, %y
|   %fcmp_not = icmp ule i32 %z, %w
|   %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not
|
| Name: false val is compare - invert/not
|   %fcmp = icmp ugt i32 %z, %w
|   %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
|   %not = xor i1 %sel, true
| =>
|   %tcmp_not = xor i1 %tcmp, -1
|   %fcmp_not = icmp ule i32 %z, %w
|   %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not
|
| Differential Revision: https://reviews.llvm.org/D72007
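Note: the first Alive proof above can be approximated by brute force. The sketch below exhaustively checks the "both select values are compares" case in Python at a reduced width of 4 bits instead of i32. It does not model poison or undef, so it is only a sanity check and not a replacement for the Alive proof. All function names are illustrative.

```python
from itertools import product

BITS = 4
MASK = (1 << BITS) - 1

def to_signed(v):
    # Reinterpret a BITS-wide unsigned value as two's complement.
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def before(cond, x, y, z, w):
    tcmp = to_signed(x) <= to_signed(y)  # icmp sle
    fcmp = z > w                         # icmp ugt
    sel = tcmp if cond else fcmp         # select
    return not sel                       # xor i1 %sel, true

def after(cond, x, y, z, w):
    tcmp_not = to_signed(x) > to_signed(y)  # icmp sgt
    fcmp_not = z <= w                       # icmp ule
    return tcmp_not if cond else fcmp_not   # select

def fold_is_sound():
    values = range(MASK + 1)
    return all(before(c, x, y, z, w) == after(c, x, y, z, w)
               for c in (False, True)
               for x, y, z, w in product(values, repeat=4))
```

Calling `fold_is_sound()` walks all 2 * 16^4 input combinations and compares the two forms on each.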
* llc: Change behavior of -mcpu with existing attribute | Matt Arsenault | 2020-01-07 | 1 | -3/+3
| Don't overwrite existing target-cpu attributes.
|
| I've often found the replacement behavior annoying, and this is inconsistent with how the fast math command line flags interact with the function attributes.
|
| Does not yet change target-features, since I think that should behave as a concatenation.
* [PowerPC][LoopVectorize] Extend getRegisterClassForType to consider double and other floating point type | Jinsong Ji | 2020-01-06 | 1 | -6/+9
| In https://reviews.llvm.org/D67148, we use isFloatTy to test floating point type, otherwise we return GPRRC. So 'double' will be classified as GPRRC, which is not accurate.
|
| This patch covers other floating point types.
|
| Reviewed By: #powerpc, nemanjai
|
| Differential Revision: https://reviews.llvm.org/D71946
* [NFC] Fix trivial typos in comments | James Henderson | 2020-01-06 | 3 | -3/+3
| Reviewed By: jhenderson
|
| Differential Revision: https://reviews.llvm.org/D72143
|
| Patch by Kazuaki Ishizaki.
* [Metadata] Add TBAA struct metadata to `AAMDNode` | Anton Afanasyev | 2020-01-06 | 1 | -1/+1
| Summary: Make `AAMDNodes`' `getAAMetadata()` and `setAAMetadata()` take `!tbaa.struct` into account as well as `!tbaa`. This impacts llvm.org/pr42022.
|
| This is a temporary fix needed for the SROA pass to keep the `!tbaa.struct` tag. The new field `TBAAStruct` should be deleted when the `!tbaa` tag replaces `!tbaa.struct`. Merging two `!tbaa.struct`'s to one is conservatively considered to be `nullptr` (giving `MayAlias`) -- this could be enhanced, but relying on the said future replacement.
|
| Reviewers: RKSimon, spatel, vporpo
| Subscribers: hiraditya, kosarev, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D70924
* [Coroutines] Remove corresponding phi values when apply simplifyTerminatorLeadingToRet | Brian Gesiak | 2020-01-05 | 2 | -11/+120
| Summary: In addMustTailToCoroResumes, we set musttail on those resume instructions that are followed by a ret instruction. This is done by simplifyTerminatorLeadingToRet, which replaces a sequence of branches leading to a ret with a clone of the ret.
|
| However, it forgets to remove the corresponding PHI values that come from the basic block of the replaced branch, which may cause the jumpthreading pass to hang (https://bugs.llvm.org/show_bug.cgi?id=43720).
|
| This patch fixes this issue.
|
| Test Plan:
|   cppcoro library with O3+flto
|   check-llvm
|
| Reviewers: modocache, GorNishanov, lewissbaker
| Reviewed By: modocache
| Subscribers: mehdi_amini, EricWF, hiraditya, dexonsmith, jfb, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D71826
|
| Patch by junparser (JunMa)!
* [InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426) | Roman Lebedev | 2020-01-04 | 2 | -22/+16
| This decreases use count of %Op0, makes one hand of select to be 0, and possibly exposes further folding potential.
|
| Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal)
|   %Op0 = %TrueVal
|   %o = select i1 %Cond, i8 %Op0, i8 %FalseVal
|   %r = sub i8 %Op0, %o
| =>
|   %n = sub i8 %Op0, %FalseVal
|   %r = select i1 %Cond, i8 0, i8 %n
|
| Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0
|   %Op0 = %FalseVal
|   %o = select i1 %Cond, i8 %TrueVal, i8 %Op0
|   %r = sub i8 %Op0, %o
| =>
|   %n = sub i8 %Op0, %TrueVal
|   %r = select i1 %Cond, i8 %n, i8 0
|
| https://rise4fun.com/Alive/aHRt
| https://bugs.llvm.org/show_bug.cgi?id=44426
* [NFC][InstCombine] 'subtract from one hands of select' pattern tests (PR44426) | Roman Lebedev | 2020-01-04 | 2 | -11/+78
| https://bugs.llvm.org/show_bug.cgi?id=44426
* [InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426) | Roman Lebedev | 2020-01-04 | 2 | -22/+16
| This decreases use count of %Op1, makes one hand of select to be 0, and possibly exposes further folding potential.
|
| Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1)
|   %Op1 = %TrueVal
|   %o = select i1 %Cond, i8 %Op1, i8 %FalseVal
|   %r = sub i8 %o, %Op1
| =>
|   %n = sub i8 %FalseVal, %Op1
|   %r = select i1 %Cond, i8 0, i8 %n
|
| Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0
|   %Op1 = %FalseVal
|   %o = select i1 %Cond, i8 %TrueVal, i8 %Op1
|   %r = sub i8 %o, %Op1
| =>
|   %n = sub i8 %TrueVal, %Op1
|   %r = select i1 %Cond, i8 %n, i8 0
|
| https://rise4fun.com/Alive/avL
| https://bugs.llvm.org/show_bug.cgi?id=44426
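Note: both i8 patterns above can be checked exhaustively. The following Python sketch models i8 arithmetic with wrap-around and a plain ternary for `select`. It ignores poison and undef, and the helper names are assumptions made for illustration only.

```python
def wrap8(v):
    # Model i8 wrap-around arithmetic.
    return v & 0xFF

def select(cond, true_val, false_val):
    return true_val if cond else false_val

def check_sink_sub_into_select():
    for cond in (False, True):
        for op1 in range(256):
            for other in range(256):
                # sub (select %Cond, %Op1, %FalseVal), %Op1
                lhs = wrap8(select(cond, op1, other) - op1)
                rhs = select(cond, 0, wrap8(other - op1))
                if lhs != rhs:
                    return False
                # sub (select %Cond, %TrueVal, %Op1), %Op1
                lhs = wrap8(select(cond, other, op1) - op1)
                rhs = select(cond, wrap8(other - op1), 0)
                if lhs != rhs:
                    return False
    return True
```

The hand that equals %Op1 always subtracts to zero, which is the constant the fold places directly into the select.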
* [NFC][InstCombine] 'subtract of one hands of select' pattern tests (PR44426) | Roman Lebedev | 2020-01-04 | 1 | -0/+89
| https://bugs.llvm.org/show_bug.cgi?id=44426
* [Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage. | Alexey Lapshin | 2020-01-04 | 1 | -0/+61
| Summary: For artificial cases (huge array, few usages), the Global SRA optimization creates a lot of redundant data. It creates an instance of GlobalVariable for each array element. For a huge array, that means huge compilation time and huge memory usage. The following example compiles for 10 minutes and requires 40GB of memory.
|
|   namespace {
|     char LargeBuffer[64 * 1024 * 1024];
|   }
|
|   int main ( void ) {
|     LargeBuffer[0] = 0;
|     printf("\n ");
|     return LargeBuffer[0] == 0;
|   }
|
| The fix is to avoid Global SRA for large arrays.
|
| Reviewers: craig.topper, rnk, efriedma, fhahn
| Reviewed By: rnk
| Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D71993
* [PowerPC][LoopVectorize] Add tests for fp128 and fp16 | Jinsong Ji | 2020-01-03 | 1 | -0/+58
| Add two tests to reg-usage.ll
* [NFC][InstCombine] '(Op1 & С) - Op1' -> '-(Op1 & ~C)' fold (PR44427) | Roman Lebedev | 2020-01-03 | 2 | -10/+10
| This decreases use count of Op1, potentially allows us to further hoist said 'neg' later on, and results in marginally better X86 codegen.
|
| Name: (Op1 & С) - Op1 -> -(Op1 & ~C)
|   %o = and i64 %Op1, C1
|   %r = sub i64 %o, %Op1
| =>
|   %n = and i64 %Op1, ~C1
|   %r = sub i64 0, %n
|
| https://rise4fun.com/Alive/rwgA
| https://godbolt.org/z/R_RMfM
| https://bugs.llvm.org/show_bug.cgi?id=44427
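Note: the identity above holds because Op1 = (Op1 & C) + (Op1 & ~C). The sketch below is an exhaustive Python check at 8 bits rather than the i64 used in the proof. The function name and the reduced width are assumptions made for illustration.

```python
def check_and_sub_fold(bits=8):
    mask = (1 << bits) - 1
    for op1 in range(mask + 1):
        for c in range(mask + 1):
            # %o = and %Op1, C ; %r = sub %o, %Op1
            before = ((op1 & c) - op1) & mask
            # %n = and %Op1, ~C ; %r = sub 0, %n
            after = (0 - (op1 & (~c & mask))) & mask
            if before != after:
                return False
    return True
```

Masking after every subtraction models the wrap-around of fixed-width integer arithmetic.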
* [NFC][InstCombine] '(Op1 & С) - Op1' pattern tests (PR44427) | Roman Lebedev | 2020-01-03 | 1 | -0/+98
|
* [NFC][InstCombine] Autogenerate and2.ll checklines | Roman Lebedev | 2020-01-03 | 1 | -19/+19
|
* [NFC][InstCombine] '(X & (- Y)) - X' -> '- (X & (Y - 1))' fold (PR44448) | Roman Lebedev | 2020-01-03 | 1 | -12/+12
| Name: (X & (- Y)) - X -> - (X & (Y - 1)) (PR44448)
|   %negy = sub i8 0, %y
|   %unbiasedx = and i8 %negy, %x
|   %r = sub i8 %unbiasedx, %x
| =>
|   %ymask = add i8 %y, -1
|   %xmasked = and i8 %ymask, %x
|   %r = sub i8 0, %xmasked
|
| https://rise4fun.com/Alive/OIpla
|
| This decreases use count of %x, may allow us to later hoist said negation even further, and results in marginally nicer X86 codegen.
|
| See
|   https://bugs.llvm.org/show_bug.cgi?id=44448
|   https://reviews.llvm.org/D71499
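Note: in two's complement, -Y equals ~(Y - 1), so X & (-Y) is X with the bits of X & (Y - 1) removed. The sketch below is an exhaustive i8 check in Python that mirrors the instruction sequence above. The function name is illustrative, and poison and undef are not modeled.

```python
def check_neg_mask_fold(bits=8):
    mask = (1 << bits) - 1
    for x in range(mask + 1):
        for y in range(mask + 1):
            negy = (0 - y) & mask              # %negy = sub 0, %y
            before = ((negy & x) - x) & mask   # and, then sub ..., %x
            ymask = (y - 1) & mask             # %ymask = add %y, -1
            after = (0 - (ymask & x)) & mask   # and, then sub 0, ...
            if before != after:
                return False
    return True
```

The y == 0 case is covered as well: %negy is 0 and %ymask is all ones, so both sides reduce to the negation of x.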
* [NFC][InstCombine] '(X & (- Y)) - X' pattern tests (PR44448) | Roman Lebedev | 2020-01-03 | 1 | -0/+158
| As discussed in https://bugs.llvm.org/show_bug.cgi?id=44448, we can hoist negation out of the pattern.
* [Attributor][FIX] Allow dead users of rewritten function | Johannes Doerfert | 2020-01-03 | 1 | -0/+55
| If we replace a function with a new one because we rewrite the signature, dead users may still refer to the old version. With this patch we reuse the code that deals with dead functions, which the old versions are, to avoid problems.
* [Attributor][FIX] Don't crash on ptr2int/int2ptr instructions | Johannes Doerfert | 2020-01-03 | 1 | -0/+10
| An integer isn't allowed in getAlignmentForValue so we need to stop at a ptr2int instruction during exploration.
* [Attributor][FIX] Do not derive nonnull and dereferenceable w/o access | Johannes Doerfert | 2020-01-03 | 4 | -4/+23
| An inbounds GEP results in poison if the value is not "inbounds", not in UB. We accidentally derived nonnull and dereferenceable from these inbounds GEPs even in the absence of accesses that would turn the poison into UB.
* [Attributor][FIX] Return CHANGED once a pessimistic fixpoint is reached. | Johannes Doerfert | 2020-01-03 | 1 | -0/+49
|
* Fix for a dangling point bug in DeadStoreElimination pass | Ankit | 2020-01-03 | 1 | -0/+41
| The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by a call to DeleteDeadInstruction. While iterating through the instructions, the pass maintains a pointer to the last throwing instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed to by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction, and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.
|
| In the patch, we maintain a list of throwing instructions encountered previously and use the last non-deleted throwing instruction from the container.
|
| Reviewers: fhahn, bcahoon, efriedma
| Reviewed By: fhahn
| Differential Revision: https://reviews.llvm.org/D65326
* [InstCombine] replace undef elements in vector constant when doing icmp folds (PR44383) | Sanjay Patel | 2020-01-03 | 5 | -10/+10
| As shown in PR44383: https://bugs.llvm.org/show_bug.cgi?id=44383 ...we can't safely propagate a vector constant through this icmp fold if that vector constant contains undefined elements.
|
| We know that each defined element of the constant is safe though, so find the first of those and replicate it into the formerly undef lanes.
|
| Differential Revision: https://reviews.llvm.org/D72101
* Revert "[Attributor] AAValueConstantRange: Value range analysis using constant range" | Hideto Ueno | 2020-01-03 | 7 | -796/+32
| This reverts commit e9963034314edf49a12ea5e29f694d8f9f52734a.
* [InstCombine] add tests for vector icmp with undef constant elements; NFC | Sanjay Patel | 2020-01-02 | 5 | -0/+54
|
* [InstCombine] remove uses before deleting instructions (PR43723) | Sanjay Patel | 2020-01-02 | 1 | -0/+42
| This is a less ambitious alternative to previous attempts to fix this bug with:
|   rG56b2aee1875a
|   rGef02831f0a4e
|   rG56b2aee1875a
| ...because those all failed bot testing with use-after-free or other problems.
|
| The original crashing/assert problem is still showing up on various fuzzers, so I've added a new minimal test based on another one of those failures.
|
| Instead of trying to manage and coordinate the logic in isAllocSiteRemovable() with the deletion loops, just loosen the existing code that handles casts and GEP by replacing with undef to allow other opcodes. That means that no instructions with uses should assert on deletion, and there are hopefully no non-obvious sanitizer bugs induced.
* [InstCombine] Preserve inbounds when merging with zero-index GEP (PR44423) | Nikita Popov | 2020-01-01 | 1 | -3/+3
| This addresses https://bugs.llvm.org/show_bug.cgi?id=44423. If one of the GEPs is inbounds and the other is zero-index, we can also preserve inbounds.
|
| Differential Revision: https://reviews.llvm.org/D72060
* [InstCombine] Fix incorrect inbounds on GEP of GEP (PR44425) | Nikita Popov | 2020-01-01 | 3 | -3/+3
| This fixes https://bugs.llvm.org/show_bug.cgi?id=44425. We need to drop inbounds if one of the GEPs is not inbounds. This was already done when creating a new GEP, but not when modifying in place.
|
| Differential Revision: https://reviews.llvm.org/D72059
* [InstCombine] Add tests for PR44423 and PR44425; NFC | Nikita Popov | 2020-01-01 | 1 | -0/+40
|
* [InstCombine] Regenerate test checks; NFC | Nikita Popov | 2020-01-01 | 2 | -307/+520
|
* [InstCombine] Add tests for sub nuw of geps; NFC | Nikita Popov | 2020-01-01 | 1 | -0/+101
| Tests for PR44419.