summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [DeadArgElim] Remove allocsize attributes on callsitesGeorge Burgess IV2018-04-121-0/+5
| | | | | | | | | | | | | | | We're already removing allocsize attributes from Functions that we remove args from, since removing arguments from a function may make the allocsize attribute incorrect. It appears we forgot to also remove them from callsites. Without this, I get verifier errors on `@Test2`. It probably wouldn't be too hard to make DAE properly update allocsize attributes instead of dropping them, but I can't think of a scenario where that'd be useful in practice. llvm-svn: 329868
* [DSE] Add tests for atomic memory intrinsics (NFC)Daniel Neilson2018-04-114-0/+705
| | | | | | | | Summary: These tests show that DSE currently does nothing with the atomic memory intrinsics. Future work will teach DSE how to simplify these. llvm-svn: 329845
* [DSE] Regenerate tests with update_test_checks.py (NFC)Daniel Neilson2018-04-112-24/+135
| | | | | | | | | Summary: In preparation for a future commit, this regenerates the test checks for test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll llvm-svn: 329839
* [DSE] Regenerate tests with update_test_checks.py (NFC)Daniel Neilson2018-04-112-175/+308
| | | | | | | | | Summary: In preparation for a future commit, this regenerates the test checks for test/Transforms/DeadStoreElimination/simple.ll test/Transforms/DeadStoreElimination/memintrinsics.ll llvm-svn: 329824
* [InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse()Sanjay Patel2018-04-111-8/+5
| | | | llvm-svn: 329821
* [SLP] update a test case. NFC.Haicheng Wu2018-04-111-15/+17
| | | | llvm-svn: 329818
* Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max.Artur Gainullin2018-04-112-4/+147
| | | | | | | | | | | | | | | | | | | | | Bitwise 'not' of the min/max could be eliminated in the pattern: %notx = xor i32 %x, -1 %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y %smax = select i1 %cmp1, i32 %notx, i32 %y %res = xor i32 %smax, -1 https://rise4fun.com/Alive/lCN Reviewers: spatel Reviewed by: spatel Subscribers: a.elovikov, llvm-commits Differential Revision: https://reviews.llvm.org/D45317 llvm-svn: 329791
* AMDGPU: enable 128-bit for local addr space under an optionMarek Olsak2018-04-102-2/+4
| | | | | | | | | | | | | | | | | | | Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). v2: - fix regressions in merge-stores.ll and multiple_tails.ll Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764
* [CVP] simplify phi with constant incoming values that match common variable ↵Sanjay Patel2018-04-101-0/+94
| | | | | | | | | | | | | | | | | | | | | | | edge values This is based on an example that was recently posted on llvm-dev: void *propagate_null(void* b, int* g) { if (!b) { return 0; } (*g)++; return b; } https://godbolt.org/g/xYk3qG The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely. Differential Revision: https://reviews.llvm.org/D45448 llvm-svn: 329755
* [SSAUpdaterBulk] Handle CFG with unreachable from entry blocks.Michael Zolotukhin2018-04-101-0/+26
| | | | llvm-svn: 329660
* [MemorySSAUpdater] Mark Phi users of a node being moved as non-optimizeZhaoshi Zheng2018-04-091-0/+34
| | | | | | | | | | Fix PR36484, as suggested: <quote> during moves, mark the direct users of the erased things that were phis as "not to be optimized" <quote> llvm-svn: 329621
* [SLP] Additional tests for reorder reuse vectorization, NFC.Alexey Bataev2018-04-091-0/+230
| | | | llvm-svn: 329603
* [MergeICmp] Split blocks that do other work.Xin Tong2018-04-092-14/+123
| | | | | | | | | | | | | | | Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564
* [IRCE] Relax restriction on collected range checksMax Kazantsev2018-04-092-6/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547
* NFC: Update NewGVN invariant.group testPiotr Padlewski2018-04-081-15/+16
| | | | llvm-svn: 329533
* [InstCombine] add/move tests for fsub folds; NFCSanjay Patel2018-04-072-24/+100
| | | | | | | | There are a pair of folds that try to merge fneg into fsub with an intervening cast, but as shown in the FIXME tests, they can create extra instructions. llvm-svn: 329501
* [InstCombine] Get rid of select of bittest (PR36950 / PR17564)Roman Lebedev2018-04-071-86/+69
| | | | | | | | | | | | | | | | | | | | Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 | PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 | PR17564 ]], D45065, D45107 https://godbolt.org/g/iAYRup Alive proof: https://rise4fun.com/Alive/uiH Testing: `ninja check-llvm` Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45108 llvm-svn: 329492
* Runtime flag to control branch funnel thresholdVitaly Buka2018-04-061-0/+100
| | | | | | | | | | Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45193 llvm-svn: 329459
* [InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse()Sanjay Patel2018-04-061-3/+7
| | | | | | | As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg. llvm-svn: 329429
* [InstCombine] add test for fsub+fneg with extra use; NFCSanjay Patel2018-04-061-0/+15
| | | | llvm-svn: 329418
* [InstCombine] add potential calloc tests and regenerate checks; NFCSanjay Patel2018-04-061-23/+57
| | | | | | | | D45344 is proposing to remove the use restriction that made the calloc transform safe, but it doesn't currently address the problematic example given inD16337. Add a test to make sure that doesn't break. llvm-svn: 329412
* [GlobalOpt] Fix support for casts in ctors.Mircea Trofin2018-04-061-0/+62
| | | | | | | | | | | | | | | | Summary: Fixing an issue where initializations of globals where constructors use casts were silently translated to 0-initialization. Reviewers: davidxl, evgeny777 Reviewed By: evgeny777 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45198 llvm-svn: 329409
* [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference.Chad Rosier2018-04-061-0/+3
| | | | | | | | The SimpleLoopUnrollPass isn't suppose to perform loop peeling. Differential Revision: https://reviews.llvm.org/D45334 llvm-svn: 329395
* EntryExitInstrumenter: Handle musttail callsHans Wennborg2018-04-061-1/+24
| | | | | | | | Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits. llvm-svn: 329385
* [InstCombine] FP: Z - (X - Y) --> Z + (Y - X)Sanjay Patel2018-04-051-6/+6
| | | | | | | | | | | | This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms. llvm-svn: 329362
* [InstCombine] add FP tests for Z - (X - Y); NFCSanjay Patel2018-04-051-3/+29
| | | | | | | | A fold for this pattern was removed at rL73243 to fix PR4374: https://bugs.llvm.org/show_bug.cgi?id=4374 ...and apparently there were no tests that went with that fold. llvm-svn: 329360
* [InstCombine] nsz: -(X - Y) --> Y - XSanjay Patel2018-04-051-2/+1
| | | | | | This restores part of the fold that was removed with rL73243 (PR4374). llvm-svn: 329350
* [InstCombine][NFC] Regenerate select-of-bittest.ll with instnamer passRoman Lebedev2018-04-051-503/+504
| | | | | | As requested by spatel in https://reviews.llvm.org/D45329 llvm-svn: 329349
* [InstCombine] [NFC] Add more tests for getting rid of select of bittest ↵Roman Lebedev2018-04-051-19/+178
| | | | | | | | | | | | | | | | | | | (D45108, PR36950 / PR17564) Summary: More tests for D45108: * One use tests * allow shift to be a variable, too Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45329 llvm-svn: 329348
* [InstCombine] Properly change GEP type when reassociating loop invariant GEP ↵Daniel Neilson2018-04-051-0/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | chains Summary: This is a fix to PR37005. Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert: %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val %res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2 into: %src = getelementptr inbounds i8, i8* %base, i64 %val2 %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val By swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant. This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa. Reviewers: sebpop, spatel Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45287 llvm-svn: 329331
* [InstCombine] add test for fneg+fsub with nsz; NFCSanjay Patel2018-04-051-3/+17
| | | | | | | | There used to be a fold that would handle this case more generally, but it was removed at rL73243 to fix PR4374: https://bugs.llvm.org/show_bug.cgi?id=4374 llvm-svn: 329322
* [InstCombine] use pattern matchers for fsub --> fadd foldsSanjay Patel2018-04-051-3/+2
| | | | | | This allows folding for vectors with undef elements. llvm-svn: 329316
* [InstCombine] add tests for fsub --> fadd; NFCSanjay Patel2018-04-051-3/+65
| | | | llvm-svn: 329313
* [PatternMatch] define m_FNeg using m_FSubSanjay Patel2018-04-054-20/+9
| | | | | | | | | Using cstfp_pred_ty in the definition allows us to match vectors with undef elements. This replicates the change for m_Not from D44076 / rL326823 and continues towards making all pattern matchers allow undef elements in vectors. llvm-svn: 329303
* [InstCombine] add vector and vector undef tests for FP folds; NFCSanjay Patel2018-04-053-89/+226
| | | | llvm-svn: 329294
* [LoopInterchange] Require asserts for test using -stats (NFC)Florian Hahn2018-04-051-1/+2
| | | | | | This fixes a buildbot failure. llvm-svn: 329279
* [LoopInterchange] Add stats counter for number of interchanged loops.Florian Hahn2018-04-051-1/+5
| | | | | | | | | | Reviewers: samparker, karthikthecool, blitz.opensource Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D45209 llvm-svn: 329269
* [LoopInterchange] Preserve LoopInfo after interchanging.Florian Hahn2018-04-0513-14/+22
| | | | | | | | | | | | | | LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place. Reviewers: davide, efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45278 llvm-svn: 329264
* [CallSiteSplitting] Do not perform callsite splitting inside landing padTaewook Oh2018-04-051-0/+40
| | | | | | | | | | | | | | | | | | | Summary: If the callsite is inside landing pad, do not perform callsite splitting. Callsite splitting uses utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumtion is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results assertion failure. Fundamental solution might be fixing llvm::SplitEdge to not to rely on the invalid assumption. However, it'll involve a lot of work because current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting to not to attempt splitting if the callsite is in a landing pad. Attached test case will crash with assertion failure without the fix. Reviewers: fhahn, junbuml, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45130 llvm-svn: 329250
* Don't inline @llvm.icall.branch.funnelVitaly Buka2018-04-042-6/+49
| | | | | | | | | | | | | | Summary: @llvm.icall.branch.funnel is musttail with variable number of arguments. After inlining current backend can't separate call targets from call arguments. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45116 llvm-svn: 329235
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-041-0/+40
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329218
* Revert "[Analysis] Support aligned new/delete functions."Eric Fiselier2018-04-041-40/+0
| | | | | | This reverts commit bee3bbd9bdd3ab3364b8fb0cdb6326bc1ae740e0. llvm-svn: 329217
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-041-0/+40
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329215
* [InstCombine] [NFC] Add tests for getting rid of select of bittest (PR36950 ↵Roman Lebedev2018-04-041-0/+464
| | | | | | | | | | | | | | | | / PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 | PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 | PR17564 ]], D45065, D45108 Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45107 llvm-svn: 329198
* [SLPVectorizer][X86] Regenerate some tests. NFCISimon Pilgrim2018-04-045-74/+187
| | | | llvm-svn: 329196
* StructurizeCFG: Test for branch divergence correctlyNicolai Haehnle2018-04-041-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165
* [SCEV] Prove implications for SCEVUnknown PhisMax Kazantsev2018-04-042-0/+181
| | | | | | | | | | | | | | | | | This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150
* [SimplifyCFG] Teach merge conditional stores to handle cases where the ↵Craig Topper2018-04-041-0/+39
| | | | | | | | | | | | | | | | | | | PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142
* [InstCombine] allow more fmul folds with 'reassoc'Sanjay Patel2018-04-031-25/+28
| | | | | | | | The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121
* [coroutines] Respect alloca alignment requirements when building coroutine frameGor Nishanov2018-04-031-0/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment. For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112
OpenPOWER on IntegriCloud