summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [SCEV] Make isLoopEntryGuardedByCond a bit smarterMax Kazantsev2018-02-074-7/+121
| | | | | | | | | | | Sometimes `isLoopEntryGuardedByCond` cannot prove predicate `a > b` directly. But it is a common situation when `a >= b` is known from ranges and `a != b` is known from a dominating condition. Thia patch teaches SCEV to sum these facts together and prove strict comparison via non-strict one. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324453
* The xfailed test from r324448 passed on one of the bots: remove it entirely ↵Michael Zolotukhin2018-02-071-69/+0
| | | | | | for now. llvm-svn: 324451
* Xfail the test added in r324445 until the underlying issue in LoopSink is fixed.Michael Zolotukhin2018-02-072-61/+69
| | | | llvm-svn: 324448
* Follow-up for r324429: "[LCSSAVerification] Run verification only when ↵Michael Zolotukhin2018-02-071-0/+61
| | | | | | | | | | | | | asserts are enabled." Before r324429 we essentially didn't have a verification of LCSSA, so no wonder that it has been broken: currently loop-sink breaks it (the attached test illustrates the failure). It was detected during a stage2 RA build, so to unbreak it I'm disabling the check for now. llvm-svn: 324445
* [SLP] Update test checks, NFC.Alexey Bataev2018-02-063-64/+1127
| | | | llvm-svn: 324387
* [InstCombine][ValueTracking] Match non-uniform constant power-of-two vectorsSimon Pilgrim2018-02-061-1/+1
| | | | | | | | Generalize existing constant matching to work with non-uniform constant vectors as well. Differential Revision: https://reviews.llvm.org/D42818 llvm-svn: 324369
* Regenerate vector-urem test. NFCI.Simon Pilgrim2018-02-061-2/+2
| | | | llvm-svn: 324357
* [MergeICmps] Handle chains with several complex BCE basic blocks.Clement Courbet2018-02-061-0/+58
| | | | | | | | | | - Fix condition for detecting that a complex basic block was the first in the chain. - Add tests. This was caught by buildbots when submitting rL324319. llvm-svn: 324341
* [ThinLTO] fix test failure without x86 backendHiroshi Inoue2018-02-062-0/+3
| | | | | | This patch moves ThinLTOBitcodeWriter/module-asm.ll test case into x86 directory to avoid a test failure when x86 backend is not enabled. llvm-svn: 324316
* ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged ↵Peter Collingbourne2018-02-061-0/+12
| | | | | | | | | | | module. If the inline asm provides the definition of a symbol, this can result in duplicate symbol errors. Differential Revision: https://reviews.llvm.org/D42944 llvm-svn: 324313
* [ThinLTO] Remove dead and dropped symbol declarations when possibleTeresa Johnson2018-02-061-0/+71
| | | | | | | | | | | | | | | | | | | | | Summary: Removing the dropped symbols will prevent indirect call promotion in the ThinLTO Backend from adding a new reference to a symbol, which can result in linker unsats. This can happen when we compile with a sample profile collected from one binary by used for another, which may have profiled targets that aren't used in the new binary. Note that until dropDeadSymbols handles variables and aliases (in progress), we may not be able to remove the declaration and can still have an issue. Reviewers: grimar, davidxl Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D42816 llvm-svn: 324299
* [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused ↵Sanjay Patel2018-02-052-35/+32
| | | | | | | | | | | | | | | (PR35681) In the motivating case from PR35681 and represented by the macro-fuse-cmp test: https://bugs.llvm.org/show_bug.cgi?id=35681 ...there's a 37 -> 31 byte size win for the loop because we eliminate the big base address offsets. SPEC2017 on Ryzen shows no significant perf difference. Differential Revision: https://reviews.llvm.org/D42607 llvm-svn: 324289
* [InstCombine] don't try to evaluate instructions with >1 use (revert r324014)Sanjay Patel2018-02-052-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This example causes a compile-time explosion: define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x, %x %a2 = mul i32 %a1, %a1 %a3 = mul i32 %a2, %a2 %a4 = mul i32 %a3, %a3 %a5 = mul i32 %a4, %a4 %a6 = mul i32 %a5, %a5 %a7 = mul i32 %a6, %a6 %a8 = mul i32 %a7, %a7 %a9 = mul i32 %a8, %a8 %a10 = mul i32 %a9, %a9 %a11 = mul i32 %a10, %a10 %a12 = mul i32 %a11, %a11 %a13 = mul i32 %a12, %a12 %a14 = mul i32 %a13, %a13 %a15 = mul i32 %a14, %a14 %a16 = mul i32 %a15, %a15 %a17 = mul i32 %a16, %a16 %a18 = mul i32 %a17, %a17 %a19 = mul i32 %a18, %a18 %a20 = mul i32 %a19, %a19 %a21 = mul i32 %a20, %a20 %a22 = mul i32 %a21, %a21 %a23 = mul i32 %a22, %a22 %a24 = mul i32 %a23, %a23 %T = trunc i32 %a24 to i16 ret i16 %T } llvm-svn: 324276
* [InstCombine] add test corresponding to r324252 (PR36225); NFCSanjay Patel2018-02-051-0/+64
| | | | | | | | | | | As PR36225 shows, we definitely don't want to enable the canEvaluate* logic with phis. There's still a question of whether we should just revert r324014 completely because it exposes a compile-time sinkhole (although that problem might exist independently). llvm-svn: 324266
* [InstCombine] add unsigned saturation subtraction canonicalizationsSanjay Patel2018-02-051-36/+51
| | | | | | | | | | | | | | | | | | | | | | | | This is the instcombine part of unsigned saturation canonicalization. Backend patches already commited: https://reviews.llvm.org/D37510 https://reviews.llvm.org/D37534 It converts unsigned saturated subtraction patterns to forms recognized by the backend: (a > b) ? a - b : 0 -> ((a > b) ? a : b) - b) (b < a) ? a - b : 0 -> ((a > b) ? a : b) - b) (b > a) ? 0 : a - b -> ((a > b) ? a : b) - b) (a < b) ? 0 : a - b -> ((a > b) ? a : b) - b) ((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b) ((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b) Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D41480 llvm-svn: 324255
* Revert r323472 "[Debug] Add dbg.value intrinsics for PHIs created during LCSSA."Hans Wennborg2018-02-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build; see PR36238. > This patch is an enhancement to propagate dbg.value information when > Phis are created on behalf of LCSSA. I noticed a case where a value > carried across a loop was reported as <optimized out>. > > Specifically this case: > > int bar(int x, int y) { > return x + y; > } > > int foo(int size) { > int val = 0; > for (int i = 0; i < size; ++i) { > val = bar(val, i); // Both val and i are correct > } > return val; // <optimized out> > } > > In the above case, after all of the interesting computation completes > our value is reported as "optimized out." This change will add a > dbg.value to correct this. > > This patch also moves the dbg.value insertion routine from > LoopRotation.cpp into Local.cpp, so that we can share it in both places > (LoopRotation and LCSSA). > > Patch by Matt Davis! > > Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 324247
* Revert [SimplifyCFG] Relax restriction for folding unconditional branchesSerguei Katkov2018-02-054-25/+12
| | | | | | | | | | | The patch causes the failure of the test compiler-rt/test/profile/Linux/counter_promo_nest.c To unblock buildbot, revert the patch while investigation is in progress. Differential Revision: https://reviews.llvm.org/D42691 llvm-svn: 324214
* [NFC] Add tests for PR35743Max Kazantsev2018-02-051-0/+102
| | | | llvm-svn: 324209
* [SimplifyCFG] Relax restriction for folding unconditional branchesSerguei Katkov2018-02-054-12/+25
| | | | | | | | | | | | | | | | | | The commit rL308422 introduces a restriction for folding unconditional branches. Specifically if empty block with unconditional branch leads to header of the loop then elimination of this basic block is prohibited. However it seems this condition is redundantly strict. If elimination of this basic block does not introduce more back edges then we can eliminate this block. The patch implements this relax of restriction. Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl Reviewed By: pacxx Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42691 llvm-svn: 324208
* Re-apply [SCEV] Fix isLoopEntryGuardedByCond usageSerguei Katkov2018-02-051-0/+54
| | | | | | | | | | | | | | | | | ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without check that SCEV is available at entry point of the loop. It is incorrect and fixed by patch. To bugs additionally fixed: assert is moved after the check whether loop is not a nullptr. Usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow is guarded by isAvailableAtLoopEntry. Reviewers: sanjoy, mkazantsev, anna, dorit, reames Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42417 llvm-svn: 324204
* [PartialInliner] Update test (NFC).Florian Hahn2018-02-041-4/+5
| | | | llvm-svn: 324199
* [InlineFunction] Set arg attrs even if there only are VarArg attrs.Florian Hahn2018-02-041-0/+23
| | | | | | | | | | | | When using the partial inliner, we might have attributes for forwarded varargs, but the CodeExtractor does not create an empty argument attribute set for regular arguments in that case, because it does not know of the additional arguments. So in case we have attributes for VarArgs, we also have to make sure we create (empty) attributes for all regular arguments. This fixes PR36210. llvm-svn: 324197
* [LV] Use Demanded Bits and ValueTracking for reduction type-shrinkingChad Rosier2018-02-041-2/+35
| | | | | | | | | | | | | | | The type-shrinking logic in reduction detection, although narrow in scope, is also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies the approach to rely on the demanded bits and value tracking analyses, if available. We currently perform type-shrinking separately for reductions and other instructions in the loop. Long-term, we should probably think about computing minimal bit widths in a more complete way for the loops we want to vectorize. PR35734 Differential Revision: https://reviews.llvm.org/D42309 llvm-svn: 324195
* Remove unneeded -debug argument from new testDavid Green2018-02-031-1/+1
| | | | llvm-svn: 324176
* [InstCombine] Allow common type conversions to i8/i16/i32David Green2018-02-033-108/+176
| | | | | | | | | | | This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 324174
* [InstCombine] make sure tests are providing coverage for the stated pattern; NFCSanjay Patel2018-02-021-48/+68
| | | | | | | | Without extra instructions and uses, swapMayExposeCSEOpportunities() would change the icmp (as seen in the check lines), so we were not actually testing patterns that should be handled by D41480. llvm-svn: 324143
* [InstCombine] add baseline tests for unsigned saturated sub (D41480); NFCSanjay Patel2018-02-021-0/+143
| | | | llvm-svn: 324109
* [AMDGPU] Switch to the new addr space mapping by defaultYaxun Liu2018-02-0215-691/+688
| | | | | | | | This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101
* [InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 instSanjay Patel2018-02-012-20/+13
| | | | | | | | | | | | | | | | This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value. AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled. Differential Revision: https://reviews.llvm.org/D42739 llvm-svn: 324014
* Revert commit rL323951David Green2018-02-014-136/+115
| | | | | | | Looks like it's causing timeouts out on at least ppc64le buildbots. llvm-svn: 323959
* [InstCombine] Allow common type conversions to i8/i16/i32David Green2018-02-014-115/+136
| | | | | | | | | | | This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 323951
* [LSR] Don't force bases of foldable formulae to the final type.Mikael Holmen2018-02-012-42/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code. For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index. Patch by: Bevin Hansson Reviewers: atrick, qcolombet, sanjoy Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42103 llvm-svn: 323946
* [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf ↵Amjad Aboud2018-01-312-0/+87
| | | | | | | | | | | node correctly. This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details. Differential Revision: https://reviews.llvm.org/D42622 llvm-svn: 323926
* Followup on Proposal to move MIR physical register namespace to '$' sigil.Puyan Lotfi2018-01-311-2/+2
| | | | | | | | | | | | Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922
* [SeparateConstOffsetFromGEP] Fix up addrspace in the AMDGPU testMarek Olsak2018-01-311-6/+6
| | | | llvm-svn: 323913
* AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}Marek Olsak2018-01-311-0/+108
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908
* [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPsMarek Olsak2018-01-311-0/+45
| | | | | | | | | | | | | | Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907
* [CodeGenPrepare] Improve source and dest alignments of memory intrinsics ↵Daniel Neilson2018-01-311-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | independently Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the CodeGenPrepare pass to be more aggressive in improving the source and destination alignments of memcpy/memmove/memset by exploiting our new ability to record independent alignments for each argument. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323891
* [InstCombine] move related tests into the same file; NFCSanjay Patel2018-01-312-31/+31
| | | | llvm-svn: 323882
* [InstCombine] add tests to show limit of canEvaluate* ; NFCSanjay Patel2018-01-311-10/+65
| | | | llvm-svn: 323881
* [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.Amjad Aboud2018-01-311-0/+48
| | | | | | | | | Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details. Differential Revision: https://reviews.llvm.org/D42683 llvm-svn: 323862
* [SLP] Add extra test for extractelement shuffle, NFC.Alexey Bataev2018-01-301-0/+25
| | | | llvm-svn: 323815
* [LoopStrengthReduce] add test to show potential macro-fusion-based diff ↵Sanjay Patel2018-01-301-0/+126
| | | | | | | | (PR35681); NFC This is the baseline output for the test proposed with D42607. llvm-svn: 323806
* [X86][XOP] Update isVectorShiftByScalarCheap with cases covered by XOPSimon Pilgrim2018-01-301-6/+3
| | | | | | | | Similar to D42437, XOP supports variable shift for v16i8/v8i16/v4i32/v2i64 types. Differential Revision: https://reviews.llvm.org/D42526 llvm-svn: 323797
* [DeadArgumentElimination] Preserve llvm.dbg.values's first argumentPetar Jovanovic2018-01-301-0/+135
| | | | | | | | | | | | | When removing return value Dead Argument Elimination pass clobbers first llvm.dbg.value’s argument for live arguments of that function by replacing it with nullptr. In the next pass it will be deleted, so debug location about those arguments are lost. This change fixes it. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D42541 llvm-svn: 323784
* Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵Zaara Syeda2018-01-303-0/+132
| | | | | | | | | | | | | | | | | | | | | to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
* [RS4GC] Handle call/invoke instructions as base defining values of vectorsDaniel Neilson2018-01-302-0/+42
| | | | | | | | | | | | | | | | | | Summary: There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The later handles call and invoke instructions, and the former does not. This appears to be simple oversight. This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector(). Reviewers: DaniilSuchkov, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42653 llvm-svn: 323764
* [DSE] make sure memory is not modified before partial store merging (PR36129)Sanjay Patel2018-01-301-2/+4
| | | | | | | | | | | | We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 llvm-svn: 323759
* [InstSimplify] (X * Y) / Y --> X for relaxed floating-point opsSanjay Patel2018-01-301-0/+34
| | | | | | | | | This is the FP counterpart that was mentioned in PR35709: https://bugs.llvm.org/show_bug.cgi?id=35709 Differential Revision: https://reviews.llvm.org/D42385 llvm-svn: 323716
* [DSE] add test for PR36129; NFCSanjay Patel2018-01-291-0/+15
| | | | | | | We can miscompile because we're not checking is the memory might me modified between the seemingly redundant store ops. llvm-svn: 323704
OpenPOWER on IntegriCloud