summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* SROA: Allow eliminating addrspacecasted allocasMatt Arsenault2019-06-141-8/+43
| | | | | | | | | | | | | | | | | | | There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462
* Revert Fix a bug w/inbounds invalidation in LFTRFlorian Hahn2019-06-141-83/+11
| | | | | | | | | Reverting because it breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf) llvm-svn: 363427
* Revert [LFTR] Stylistic cleanup as suggested in last review comment of ↵Florian Hahn2019-06-141-9/+9
| | | | | | | | | | | D62939 [NFC] Reverting because it depends on r363289, which breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363292 (git commit 42a3fc133d3544b5c0c032fe99c6e8a469a836c2) llvm-svn: 363426
* Revert [LFTR] Rename variable to minimize confusion [NFC]Florian Hahn2019-06-141-15/+18
| | | | | | | | | | Reverting because it depends on r363289, which breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363293 (git commit c37be29634214fb1cb4c823840bffc31e5ebfe40) llvm-svn: 363425
* [SimpligyCFG] NFC intended, remove GCD that was only used for powers of twoShawn Landden2019-06-141-13/+11
| | | | | | | | | | | | and replace with an equilivent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 Differential Revision: https://reviews.llvm.org/D61151 llvm-svn: 363422
* [Attributor] Disable the Attributor by default and fix a commentJohannes Doerfert2019-06-141-1/+1
| | | | llvm-svn: 363408
* AMDGPU: Fold readlane intrinsics of constantsMatt Arsenault2019-06-141-0/+7
| | | | | | | | I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406
* [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32Stanislav Mekhanoshin2019-06-131-1/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339
* [LFTR] Rename variable to minimize confusion [NFC]Philip Reames2019-06-131-18/+15
| | | | | | As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here. llvm-svn: 363293
* [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]Philip Reames2019-06-131-9/+9
| | | | llvm-svn: 363292
* Fix a bug w/inbounds invalidation in LFTRPhilip Reames2019-06-131-11/+83
| | | | | | | | | | | | | | This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363289
* [clang][NewPM] Fix broken -O0 test from missing assumptionsLeonard Chan2019-06-131-2/+11
| | | | | | | | | | | Add an AssumptionCache callback to the InlineFuntionInfo used for the AlwaysInlinerPass to match codegen of the AlwaysInlinerLegacyPass to generate llvm.assume. This fixes CodeGen/builtin-movdir.c when new PM is enabled by default. Differential Revision: https://reviews.llvm.org/D63170 llvm-svn: 363287
* [EarlyCSE] Ensure equal keys have the same hash valueJoseph Tremoulet2019-06-131-64/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274
* [IntrinsicEmitter] Extend argument overloading with forward references.Sander de Smalen2019-06-131-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the mechanism to overload intrinsic arguments by using either backward or forward references to the overloadable arguments. In for example: def int_something : Intrinsic<[LLVMPointerToElt<0>], [llvm_anyvector_ty], []>; LLVMPointerToElt<0> is a forward reference to the overloadable operand of type 'llvm_anyvector_ty' and would allow intrinsics such as: declare i32* @llvm.something.v4i32(<4 x i32>); declare i64* @llvm.something.v2i64(<2 x i64>); where the result pointer type is deduced from the element type of the first argument. If the returned pointer is not a pointer to the element type, LLVM will give an error: Intrinsic has incorrect return type! i64* (<4 x i32>)* @llvm.something.v4i32 Reviewers: RKSimon, arsenm, rnk, greened Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62995 llvm-svn: 363233
* [SimplifyCFG] reverting preliminary Switch patches againShawn Landden2019-06-131-11/+13
| | | | | | | | | This reverts 363226 and 363227, both NFC intended I swear I fixed the test case that is failing, and ran the tests, but I will look into it again. llvm-svn: 363229
* [SimpligyCFG] NFC intended, remove GCD that was only used for powers of twoShawn Landden2019-06-131-13/+11
| | | | | | | | | | | | and replace with an equilivent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 Differential Revision: https://reviews.llvm.org/D61151 llvm-svn: 363227
* Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG ↵David L. Jones2019-06-131-15/+14
| | | | | | | | | | | | | | | | SinkCommonCodeFromPredecessors ...' We have observed some failures with internal builds with this revision. - Performance regressions: - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings). - Benchmarks for Abseil's SwissTable. - Correctness: - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP). hwennborg has already been notified, and is aware of reproducer failures. llvm-svn: 363220
* [IndVars] Extend diagnostic -replexitval flag w/ability to bypass hard use ↵Philip Reames2019-06-121-2/+5
| | | | | | | hueristic Note: This does mean that "always" is now more powerful than it was. llvm-svn: 363196
* LoopVersioning: Respect convergentMatt Arsenault2019-06-121-2/+3
| | | | | | | | | This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165
* LoopLoadElim: Respect convergentMatt Arsenault2019-06-121-0/+6
| | | | llvm-svn: 363162
* LoopDistribute/LAA: Respect convergentMatt Arsenault2019-06-121-1/+14
| | | | | | | | | | | | | | | | | | This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160
* Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step ↵Orlando Cazalet-Hyams2019-06-122-21/+7
| | | | | | | | | through loop even after completion" This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7. See phabricator thread for D60831. llvm-svn: 363132
* Generalize icmp matching in IndVars' eliminateTruncPhilip Reames2019-06-111-14/+15
| | | | | | | | We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108
* Only passes that preserve MemorySSA must mark it as preserved.Alina Sbirlea2019-06-115-3/+17
| | | | | | | | | | | | | | | | | | | | Summary: The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as preserved, because it's being called in a lot of passes that do not preserve MemorySSA. Instead, mark the MemorySSA analysis as preserved by each pass that does preserve it. These changes only affect the new pass mananger. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62536 llvm-svn: 363091
* [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZCameron McInally2019-06-111-1/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082
* [InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNegCameron McInally2019-06-111-4/+12
| | | | | | Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵Orlando Cazalet-Hyams2019-06-112-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046
* Change semantics of fadd/fmul vector reductions.Sander de Smalen2019-06-111-8/+4
| | | | | | | | | | | | | | | | | | | | This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones: llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul and adds functionality to auto-upgrade existing LLVM IR and bitcode. Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035
* [PGO] Fix the buildbot failure in r362995Rong Xu2019-06-101-5/+2
| | | | | | Fixed one unused variable warning. llvm-svn: 363004
* [PGO] Handle cases of non-instrument BBsRong Xu2019-06-101-38/+85
| | | | | | | | | | | As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995
* [LFTR] Use recomputed BE countPhilip Reames2019-06-101-9/+1
| | | | | | | | | | | | This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable. For the nestedIV example from elim-extend.ll, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32. llvm-svn: 362975
* Prepare for multi-exit LFTR [NFC]Philip Reames2019-06-101-65/+77
| | | | | | | | | | | | | | | | This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context. Specifically, it turns out the existing code uses the backedge taken count from before a IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivelent) BE count for the loop when requerying after widening. For the nestedIV example from elim-extend, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :) In review, we decided to accept that test change. This patch is structured to preserve the old behavior, but a separate change will immediate follow with the behavior change. (I wanted it separate for problem attribution purposes.) Differential Revision: https://reviews.llvm.org/D62880 llvm-svn: 362971
* [InstCombine] allow unordered preds when canonicalizing to fabs()Sanjay Patel2019-06-101-2/+2
| | | | | | | | | | We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954
* [InstCombine] fix bug in canonicalization to fabs()Sanjay Patel2019-06-101-2/+4
| | | | | | Forgot to translate the predicate clauses in rL362943. llvm-svn: 362945
* [InstCombine] change canonicalization to fabs() to use FMF on fsubSanjay Patel2019-06-101-19/+21
| | | | | | | | | | | | | | Similar to rL362909: This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand. llvm-svn: 362943
* [InstCombine] allow unordered preds when canonicalizing to fabs()Sanjay Patel2019-06-101-2/+4
| | | | | | | PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179 llvm-svn: 362937
* Do not derive no-recurse attribute if function does not have exact definition.Vivek Pandya2019-06-101-1/+1
| | | | | | | | | | | This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336 Reviewers: jdoerfert Reviewed by: jdoerfert Differential Revision: https://reviews.llvm.org/D63045 llvm-svn: 362918
* [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompilesRoman Lebedev2019-06-091-0/+8
| | | | | | | | | | | | | A precondition 'x != 0' was forgotten by me: https://rise4fun.com/Alive/JFNP https://rise4fun.com/Alive/jHvL These 4 folds with non-constants could be re-enabled, but for now let's go for the simplest solution. https://bugs.llvm.org/show_bug.cgi?id=42198 llvm-svn: 362911
* [InstCombine] change canonicalization to fabs() to use FMF on fnegSanjay Patel2019-06-091-13/+25
| | | | | | | | | | | | | | | | | This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fneg (fsub) because they have the same operand. This works around the most glaring FMF logical inconsistency cited in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 llvm-svn: 362909
* [GVN] non-functional code movementKeno Fischer2019-06-072-16/+16
| | | | | | | | | | | | Summary: Move some code around, in preparation for later fixes to the non-integral addrspace handling (D59661) Patch By Jameson Nash <jameson@juliacomputing.com> Reviewed By: reames, loladiro Differential Revision: https://reviews.llvm.org/D59729 llvm-svn: 362853
* [DomTreeUpdater] Add all insert before all delete updates to reduce compile ↵Alina Sbirlea2019-06-071-4/+10
| | | | | | | | | | | | | | | | | | | | | time. Summary: The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed. Add all insert edges then all delete edges in DTU to match the previous compile time. Compile time on the test provided by @mstorsjo before and after this patch on my machine: 113.046s vs 35.649s Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c. Reviewers: kuhar, NutshellySima, mstorsjo Subscribers: jlebar, mstorsjo, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62981 llvm-svn: 362839
* test-commitStefan Stipanovic2019-06-071-1/+0
| | | | llvm-svn: 362802
* [LV] Fix -Wunused-function after r362736Fangrui Song2019-06-071-0/+2
| | | | llvm-svn: 362762
* [LV] Wrap LV illegality reporting in a function. NFC.Renato Golin2019-06-061-100/+120
| | | | | | | | | | | | | | | | | | | | | | | A function for loop vectorization illegality reporting has been introduced: void LoopVectorizationLegality::reportVectorizationFailure( const StringRef DebugMsg, const StringRef OREMsg, const StringRef ORETag, Instruction * const I) const; The function prints a debug message when the debug for the compilation unit is enabled as well as invokes the optimization report emitter to generate a message with a specified tag. The function doesn't cover any complicated logic when a custom lambda should be passed to the emitter, only generating a message with a tag is supported. The function always prints the instruction `I` after the debug message whenever the instruction is specified, otherwise the debug message ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.' Patch by Pavel Samolysov <samolisov@gmail.com> llvm-svn: 362736
* [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284Philip Reames2019-06-061-2/+2
| | | | | | This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops. llvm-svn: 362727
* [MSAN] Add unary FNeg visitor to the MemorySanitizerCameron McInally2019-06-051-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D62909 llvm-svn: 362664
* [CallSite removal] Refactoring llvm::InlineFunction APIsMircea Trofin2019-06-051-8/+2
| | | | | | | | | | | | | | | | | | | | Summary: This change only unifies the API previous API pair accepting CallInst and InvokeInst, thus making it easier to refactor inliner pass ode to CallBase. The implementation of the unified API still relies on the CallSite implementation. Reviewers: eraman, chandlerc, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62283 llvm-svn: 362656
* [InstCombine] simplify code for bitcast of insertelement; NFCSanjay Patel2019-06-051-5/+4
| | | | llvm-svn: 362655
* NewGVN: Handle addrspacecastMatt Arsenault2019-06-051-2/+3
| | | | | | | | | | The AllConstant check needs to be moved out of the if/else if chain to avoid a test regression. The "there is no SimplifyZExt" comment puzzles me, since there is SimplifyCastInst. Additionally, the Simplify* calls seem to not see the operand as constant, so this needs to be tried if the simplify failed. llvm-svn: 362653
* InstCombine: correctly change byval type attribute alongside call args.Tim Northover2019-06-051-4/+20
| | | | | | | | When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away. llvm-svn: 362643
OpenPOWER on IntegriCloud