path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [NFC] fix trivial typos in documents and comments | Hiroshi Inoue | 2018-04-12 | 1 | -1/+1
  "is is" -> "is", "if if" -> "if", "or or" -> "or"
  llvm-svn: 329878
* [DeadArgElim] Remove allocsize attributes on callsites | George Burgess IV | 2018-04-12 | 1 | -1/+7
  We're already removing allocsize attributes from Functions that we remove args from, since removing arguments from a function may make the allocsize attribute incorrect. It appears we forgot to also remove them from callsites. Without this, I get verifier errors on `@Test2`.
  It probably wouldn't be too hard to make DAE properly update allocsize attributes instead of dropping them, but I can't think of a scenario where that'd be useful in practice.
  llvm-svn: 329868
* Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. | Michael Zolotukhin | 2018-04-11 | 1 | -13/+18
  This reapplies commit r329644.
  llvm-svn: 329865
* [SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify order of evaluation. | Michael Zolotukhin | 2018-04-11 | 1 | -1/+2
  The standard says that the order of evaluation of an expression s[x] = foo() is unspecified. In our case, we first create an empty entry in the map, then call foo(), then store its return value to the created entry. The problem is that foo uses the map as a cache, so if it finds that there is an entry in the map, it stops computation. This change explicitly sets the order, thus fixing this heisenbug.
  llvm-svn: 329864
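The ordering problem described in this message can be sketched in a few lines. This is only an illustration, not the SSAUpdaterBulk code: `compute` and `store_ordered` are hypothetical names, and the "real work" is a placeholder multiplication. It also assumes a language mode older than C++17, where the evaluation order of the two sides of an assignment is not fixed.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the computation in the message: it uses the map
// as a cache and stops early when an entry for `key` already exists.
int compute(const std::map<int, int> &cache, int key) {
  auto it = cache.find(key);
  if (it != cache.end())
    return it->second; // cache hit: reuse whatever is stored
  return key * 10;     // cache miss: do the placeholder "real" work
}

// Fragile form (shown only as a comment):
//   cache[key] = compute(cache, key);
// Before C++17 the implementation may evaluate cache[key] first, which
// creates an entry holding 0; compute() then sees a "hit" and returns 0
// instead of key * 10.

// Explicitly ordered form: finish the computation, then touch the map.
int store_ordered(std::map<int, int> &cache, int key) {
  int value = compute(cache, key);
  cache[key] = value;
  assert(cache.at(key) == value); // the entry now holds the computed value
  return value;
}
```

Splitting the statement in two costs nothing at run time and removes the dependence on how a particular compiler happens to order the operands.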
* [InstCombine] limit X - (cast(-Y)) --> X + cast(Y) with hasOneUse() | Sanjay Patel | 2018-04-11 | 1 | -10/+10
  llvm-svn: 329821
* Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max. | Artur Gainullin | 2018-04-11 | 1 | -0/+30
  Bitwise 'not' of the min/max could be eliminated in the pattern:
      %notx = xor i32 %x, -1
      %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y
      %smax = select i1 %cmp1, i32 %notx, i32 %y
      %res = xor i32 %smax, -1
  https://rise4fun.com/Alive/lCN
  Reviewers: spatel
  Reviewed by: spatel
  Subscribers: a.elovikov, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45317
  llvm-svn: 329791
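One way to read this fold for the signed-max case is not(smax(not(x), y)) == smin(x, not(y)), which holds because bitwise 'not' reverses the ordering of two's-complement integers. The sketch below checks that reading on a few sample values. The function names are hypothetical, and the check is a spot test, not a replacement for the Alive proof linked in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Original pattern: not(smax(not(x), y)).
std::int32_t not_of_smax(std::int32_t x, std::int32_t y) {
  std::int32_t notx = ~x;
  return ~std::max(notx, y);
}

// Inverted form: smin(x, not(y)), with no outer 'not'.
std::int32_t inverted_smin(std::int32_t x, std::int32_t y) {
  std::int32_t noty = ~y;
  return std::min(x, noty);
}

// Compare both forms on a handful of values, including the extremes.
void check_minmax_samples() {
  const std::int32_t samples[] = {std::numeric_limits<std::int32_t>::min(),
                                  -7, -1, 0, 1, 42,
                                  std::numeric_limits<std::int32_t>::max()};
  for (std::int32_t x : samples)
    for (std::int32_t y : samples)
      assert(not_of_smax(x, y) == inverted_smin(x, y));
}
```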
* Simplification of libcall like printf->puts must check for RtLibUseGOT metadata. | Sriraman Tallam | 2018-04-10 | 1 | -0/+11
  With -fno-plt, for example, calls to printf that get converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes.
  Differential Revision: https://reviews.llvm.org/D45180
  llvm-svn: 329768
* [CVP] simplify phi with constant incoming values that match common variable edge values | Sanjay Patel | 2018-04-10 | 1 | -0/+60
  This is based on an example that was recently posted on llvm-dev:
      void *propagate_null(void* b, int* g) {
        if (!b) {
          return 0;
        }
        (*g)++;
        return b;
      }
  https://godbolt.org/g/xYk3qG
  The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely.
  Differential Revision: https://reviews.llvm.org/D45448
  llvm-svn: 329755
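At the source level, the effect can be pictured like this: on the path where `b` is null, the constant 0 being returned is the same value as `b`, so both paths can simply return `b`. The second function below is a hand-written approximation of that result, with a hypothetical name. It is not output produced by the pass.

```cpp
#include <cassert>

// The function from the message: the return value merges the constant null
// (taken when b is null) with b itself (taken otherwise).
void *propagate_null(void *b, int *g) {
  assert(g != nullptr);
  if (!b)
    return nullptr;
  (*g)++;
  return b;
}

// Hand-simplified equivalent: on the null path the constant equals b, so the
// merged value is just b and only the increment stays conditional.
void *propagate_null_simplified(void *b, int *g) {
  assert(g != nullptr);
  if (b)
    (*g)++;
  return b;
}
```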
* Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. | Michael Zolotukhin | 2018-04-10 | 1 | -18/+13
  This reverts r329661. Bots are still unhappy.
  llvm-svn: 329666
* Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."" | Michael Zolotukhin | 2018-04-10 | 1 | -13/+18
  This reapplies commit r329644.
  llvm-svn: 329661
* [SSAUpdaterBulk] Handle CFG with unreachable from entry blocks. | Michael Zolotukhin | 2018-04-10 | 1 | -1/+1
  llvm-svn: 329660
* Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." | Michael Zolotukhin | 2018-04-10 | 1 | -18/+13
  This reverts commit r329644.
  llvm-svn: 329650
* Fix for the buildbot failure. Now-unused private field TTI deleted. | Hideki Saito | 2018-04-10 | 1 | -6/+2
  llvm-svn: 329649
* [NFC][LV] Move InterleaveInfo from Legal to CostModel | Hideki Saito | 2018-04-09 | 1 | -57/+56
  Summary: Another cleanup, following D43208.
  Interleaved memory access analysis/optimization has nothing to do with vectorization legality, so it doesn't really belong there. On the other hand, the cost model certainly has to know about it.
  In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change does just that by moving the interleaved access analysis/decision out of Legal and running it just before the CostModel object is created.
  After this, I can move the LoopVectorizationLegality and Hints/Requirements classes into their own header file, making them shareable within the Transform tree. I have the patch already, but I don't want to mix it with this change. The eventual goal is to move them to the Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis.
  Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson
  Reviewed By: rengolin
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45072
  llvm-svn: 329645
* [PR16756] Use SSAUpdaterBulk in JumpThreading. | Michael Zolotukhin | 2018-04-09 | 1 | -13/+18
  Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead.
  Compile time impact: no noticeable changes on CTMark, a big improvement on the test from PR16756.
  Reviewers: dberlin, davide, MatzeB
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D44282
  llvm-svn: 329644
* [PR16756] Add SSAUpdaterBulk. | Michael Zolotukhin | 2018-04-09 | 2 | -0/+174
  Summary: SSAUpdater is a bottleneck in a number of passes, and one of the reasons is that it performs a lot of unnecessary computations (DT/IDF) over and over again. This patch adds a new SSAUpdaterBulk that uses existing DT and avoids recomputing IDF when possible.
  Reviewers: dberlin, davide, MatzeB
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D44282
  llvm-svn: 329643
* Support generic expansion of ordered vector reduction (PR36732) | Simon Pilgrim | 2018-04-09 | 1 | -0/+32
  Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsics must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence:
      ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ...) + Scl[NumElts-1]
  Differential Revision: https://reviews.llvm.org/D45366
  llvm-svn: 329585
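The scalarized order can be written out as a plain loop. This is a sketch of the arithmetic only: `ordered_fadd_reduce` is a hypothetical helper operating on a `std::vector<float>`, not the intrinsic expansion code.

```cpp
#include <cassert>
#include <vector>

// Ordered reduction: ((((acc + e0) + e1) + e2) + ...) + e[n-1].
float ordered_fadd_reduce(float acc, const std::vector<float> &elts) {
  for (float e : elts)
    acc = acc + e; // strictly left to right, starting from the accumulator
  return acc;
}

// Small sample checks with exactly representable values.
void check_reduce_samples() {
  assert(ordered_fadd_reduce(1.0f, {2.0f, 3.0f, 4.0f}) == 10.0f);
  // The two large terms cancel first, so the trailing 1.0f survives.
  assert(ordered_fadd_reduce(0.0f, {1e8f, -1e8f, 1.0f}) == 1.0f);
}
```

Floating-point addition is not associative, so feeding the same elements in a different order, for instance {1.0f, 1e8f, -1e8f}, can lose the small term to rounding. That is why the order has to be preserved when fast-math flags are absent.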
* [MergeICmp] Update debug msg. NFC | Xin Tong | 2018-04-09 | 1 | -1/+1
  llvm-svn: 329572
* [MergeICmp] Split blocks that do other work. | Xin Tong | 2018-04-09 | 1 | -19/+118
  Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts.
  Reviewers: davide, courbet
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44443
  llvm-svn: 329564
* [IRCE] Relax restriction on collected range checks | Max Kazantsev | 2018-04-09 | 1 | -9/+4
  In IRCE, we have a very old legacy check that applies when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition has remained there since the times when IRCE was only able to handle signed latch comparisons. As the optimization evolved, it learned how to intersect signed or unsigned ranges, and this logic does not rely on the right border of each range being positive.
  The old implementation of this non-negativity check was also naive and just looked at ranges (while most other IRCE logic tries to use the power of SCEV implications), so this check could not handle even the simplest case, which looks as follows:
      int size;   // not known non-negative
      int length; // known non-negative
      i = 0;
      if (size != 0) {
        do {
          range_check(i < size);
          range_check(i < length);
          ++i;
        } while (i < size);
      }
  In this case, even if IRCE could parse the loop structure from some dominating conditions, it could only remove the range check against `length` and simply ignored the check against `size`.
  In this patch we remove this obsolete check. This allows IRCE to pick the comparison against `size` as a potential range check and then lets the Range Intersection logic decide whether it is OK to eliminate it or not.
  Differential Revision: https://reviews.llvm.org/D45362
  Reviewed By: samparker
  llvm-svn: 329547
* [NFC] fix trivial typos in comments and error message | Hiroshi Inoue | 2018-04-09 | 1 | -1/+1
  "is is" -> "is", "are are" -> "are"
  llvm-svn: 329546
* [LIR] Reorder header. NFC | Xin Tong | 2018-04-08 | 1 | -1/+1
  llvm-svn: 329530
* [InstCombine] simplify code that propagates FMF; NFC | Sanjay Patel | 2018-04-07 | 1 | -12/+4
  llvm-svn: 329503
* [InstCombine] Get rid of select of bittest (PR36950 / PR17564) | Roman Lebedev | 2018-04-07 | 1 | -0/+49
  Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 | PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 | PR17564 ]], D45065, D45107
  https://godbolt.org/g/iAYRup
  Alive proof: https://rise4fun.com/Alive/uiH
  Testing: `ninja check-llvm`
  Reviewers: spatel, craig.topper
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45108
  llvm-svn: 329492
* Remove trailing space in build file. | Nico Weber | 2018-04-07 | 1 | -1/+1
  llvm-svn: 329479
* Fix warning by cl::opt<int> -> cl::opt<unsigned> | Vitaly Buka | 2018-04-06 | 1 | -4/+5
  llvm-svn: 329461
* Runtime flag to control branch funnel threshold | Vitaly Buka | 2018-04-06 | 1 | -2/+6
  Reviewers: pcc
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45193
  llvm-svn: 329459
* [EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. | Geoff Berry | 2018-04-06 | 1 | -24/+60
  Reviewers: reames, spatel, davide, dberlin
  Subscribers: mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45162
  llvm-svn: 329443
* [InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse() | Sanjay Patel | 2018-04-06 | 1 | -12/+9
  As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg.
  llvm-svn: 329429
* Strip trailing whitespace. NFCI. | Simon Pilgrim | 2018-04-06 | 1 | -8/+8
  llvm-svn: 329421
* [GlobalOpt] Fix support for casts in ctors. | Mircea Trofin | 2018-04-06 | 1 | -1/+5
  Summary: Fixing an issue where initializations of globals whose constructors use casts were silently translated to 0-initialization.
  Reviewers: davidxl, evgeny777
  Reviewed By: evgeny777
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45198
  llvm-svn: 329409
* [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. | Chad Rosier | 2018-04-06 | 1 | -10/+14
  The SimpleLoopUnrollPass isn't supposed to perform loop peeling.
  Differential Revision: https://reviews.llvm.org/D45334
  llvm-svn: 329395
* EntryExitInstrumenter: Handle musttail calls | Hans Wennborg | 2018-04-06 | 1 | -5/+15
  Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits.
  llvm-svn: 329385
* [NFC] Add missing end of line symbols | Max Kazantsev | 2018-04-06 | 1 | -3/+3
  llvm-svn: 329383
* [InstCombine] FP: Z - (X - Y) --> Z + (Y - X) | Sanjay Patel | 2018-04-05 | 1 | -2/+11
  This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms.
  llvm-svn: 329362
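The rewrite in the subject line can be compared numerically on ordinary doubles. The sketch below is plain IEEE-754 arithmetic with hypothetical function names, and it makes no claim about which conditions InstCombine actually checks. It shows that the two forms agree on a typical input, and that in the corner case of a negative-zero Z with X equal to Y they compare equal but differ in the sign of the zero.

```cpp
#include <cassert>
#include <cmath>

// Left-hand side of the fold: Z - (X - Y).
double original_form(double z, double x, double y) { return z - (x - y); }

// Right-hand side of the fold: Z + (Y - X).
double folded_form(double z, double x, double y) { return z + (y - x); }

// True when both values compare equal and carry the same sign bit.
bool same_value_and_sign(double a, double b) {
  return a == b && std::signbit(a) == std::signbit(b);
}

void check_fsub_samples() {
  // Ordinary values: both forms give 7.75.
  assert(same_value_and_sign(original_form(10.0, 3.5, 1.25),
                             folded_form(10.0, 3.5, 1.25)));
  // Corner case: Z is -0.0 and X == Y.
  double a = original_form(-0.0, 1.0, 1.0); // -0.0 - (+0.0) yields -0.0
  double b = folded_form(-0.0, 1.0, 1.0);   // -0.0 + (+0.0) yields +0.0
  assert(a == b);                            // equal as numbers
  assert(std::signbit(a) && !std::signbit(b)); // but the zero signs differ
}
```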
* [InstCombine] nsz: -(X - Y) --> Y - X | Sanjay Patel | 2018-04-05 | 1 | -4/+11
  This restores part of the fold that was removed with rL73243 (PR4374).
  llvm-svn: 329350
* [InstCombine] Properly change GEP type when reassociating loop invariant GEP chains | Daniel Neilson | 2018-04-05 | 1 | -3/+36
  Summary: This is a fix to PR37005.
  Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert:
      %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val
      %res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2
  into:
      %src = getelementptr inbounds i8, i8* %base, i64 %val2
      %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val
  by swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant.
  This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa.
  Reviewers: sebpop, spatel
  Reviewed By: sebpop
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45287
  llvm-svn: 329331
* [InstCombine] use pattern matchers for fsub --> fadd folds | Sanjay Patel | 2018-04-05 | 1 | -4/+9
  This allows folding for vectors with undef elements.
  llvm-svn: 329316
* [InstCombine] cleanup; NFC | Sanjay Patel | 2018-04-05 | 1 | -15/+12
  llvm-svn: 329282
* [LoopInterchange] Add stats counter for number of interchanged loops. | Florian Hahn | 2018-04-05 | 1 | -0/+4
  Reviewers: samparker, karthikthecool, blitz.opensource
  Reviewed By: samparker
  Differential Revision: https://reviews.llvm.org/D45209
  llvm-svn: 329269
* [LoopInterchange] Preserve LoopInfo after interchanging. | Florian Hahn | 2018-04-05 | 1 | -13/+72
  LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place.
  Reviewers: davide, efriedma, karthikthecool, mcrosier
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D45278
  llvm-svn: 329264
* [CallSiteSplitting] Do not perform callsite splitting inside landing pad | Taewook Oh | 2018-04-05 | 1 | -0/+6
  Summary: If the callsite is inside a landing pad, do not perform callsite splitting.
  Callsite splitting uses the utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with the assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumption is not true, because SplitCriticalEdge also returns nullptr if the destination block is a landing pad. This invalid assumption results in an assertion failure.
  A fundamental solution might be fixing llvm::SplitEdge to not rely on the invalid assumption. However, that would involve a lot of work, because the current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting skip the split if the callsite is in a landing pad.
  The attached test case will crash with an assertion failure without the fix.
  Reviewers: fhahn, junbuml, dberlin
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45130
  llvm-svn: 329250
* hwasan: add -hwasan-match-all-tag flag | Evgeniy Stepanov | 2018-04-04 | 1 | -0/+11
  Sometimes instead of storing addresses as is, the kernel stores the address of a page and an offset within that page, and then computes the actual address when it needs to make an access. Because of this the pointer tag gets lost (gets set to 0xff). The solution is to ignore all accesses tagged with 0xff.
  This patch adds a -hwasan-match-all-tag flag to hwasan, which makes it possible to skip validity checks for accesses through pointers with a particular pointer tag value.
  Patch by Andrey Konovalov.
  Differential Revision: https://reviews.llvm.org/D44827
  llvm-svn: 329228
* Make helpers static. NFC. | Benjamin Kramer | 2018-04-04 | 2 | -1/+4
  llvm-svn: 329170
* StructurizeCFG: Test for branch divergence correctly | Nicolai Haehnle | 2018-04-04 | 1 | -13/+54
  Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform.
  This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend.
  As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass.
  Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions.
  Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
  Differential Revision: https://reviews.llvm.org/D43743
  llvm-svn: 329165
* [SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store. | Craig Topper | 2018-04-04 | 1 | -1/+16
  Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors.
  This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store.
  Reviewers: efriedma, jmolloy, ABataev
  Reviewed By: efriedma
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D39760
  llvm-svn: 329142
* [Hexagon] peel loops with runtime small trip counts | Ikhlas Ajbar | 2018-04-03 | 1 | -3/+0
  Move the check canPeel() to Hexagon Target before setting PeelCount.
  Differential Revision: https://reviews.llvm.org/D44880
  llvm-svn: 329129
* [InstCombine] allow more fmul folds with 'reassoc' | Sanjay Patel | 2018-04-03 | 1 | -64/+64
  The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative().
  llvm-svn: 329121
* Fix bad copy-and-paste in r329108 | Vlad Tsyrklevich | 2018-04-03 | 1 | -1/+1
  llvm-svn: 329118
* [coroutines] Respect alloca alignment requirements when building coroutine frame | Gor Nishanov | 2018-04-03 | 2 | -9/+87
  Summary: If an alloca needs to be stored in the coroutine frame, has an alignment specified, and that alignment does not match the natural alignment of the alloca type, insert appropriate padding into the coroutine frame to make sure that it gets the requested alignment.
  For example, for a packed type (whose natural alignment is 1) with an alloca alignment of 8, we may need to insert a padding field with the required number of bytes to make sure it is properly aligned.
  ```
  %PackedStruct = type <{ i64 }>
  ...
  %data = alloca %PackedStruct, align 8
  ```
  If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame:
  ```
  %f.Frame = type { ..., i16, [6 x i8], %PackedStruct }
  ```
  Reviewers: rnk, lewissbaker, modocache
  Reviewed By: modocache
  Subscribers: EricWF, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45221
  llvm-svn: 329112
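The padding size in the example follows from ordinary align-up arithmetic. The helper below is a hypothetical illustration, not the coroutine frame builder. It assumes the i16 field ends at a byte offset that is 2 modulo 8, which is what makes the gap six bytes.

```cpp
#include <cassert>
#include <cstdint>

// Number of padding bytes needed so that a field placed at `offset`
// starts on a multiple of `align`.
std::uint64_t padding_before(std::uint64_t offset, std::uint64_t align) {
  assert(align != 0 && "alignment must be non-zero");
  return (align - offset % align) % align;
}
```

With the assumed layout, `padding_before(2, 8)` evaluates to 6, matching the `[6 x i8]` field in the frame type above. An offset that is already a multiple of 8 needs no padding.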