path: root/llvm/test/Transforms
Commit message  Author  Age  Files  Lines
...
* SROA: Allow eliminating addrspacecasted allocasMatt Arsenault2019-06-143-55/+165
| There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory.
|
| This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces.
|
| For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may still be OK to insert an addrspacecast from the new alloca, but be more conservative for now.
|
| llvm-svn: 363462
* SROA: Add baseline test for addrspacecast changesMatt Arsenault2019-06-141-0/+348
| | | | llvm-svn: 363460
* Revert Fix a bug w/inbounds invalidation in LFTRFlorian Hahn2019-06-144-13/+11
| Reverting because it breaks a green dragon build:
| http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
|
| This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf)
|
| llvm-svn: 363427
* [CodeGenPrepare] propagate debuginfo when copying a shuffleSanjay Patel2019-06-141-14/+18
| | | | llvm-svn: 363409
* AMDGPU: Fold readlane intrinsics of constantsMatt Arsenault2019-06-141-0/+56
| | | | | | | | I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406
* [SCEV] Pass NoWrapFlags when expanding an AddExprSam Parker2019-06-1414-24/+24
| InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression.
|
| This is the first re-commit of the functional changes from rL362687, which was previously reverted.
|
| Differential Revision: https://reviews.llvm.org/D61934
|
| llvm-svn: 363364
* [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32Stanislav Mekhanoshin2019-06-131-138/+136
| | | | | | Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339
* [SimplifyCFG] NFC, update Switch tests as a baseline.Shawn Landden2019-06-1319-1422/+2511
| Also add baseline tests to show the effect of later patches.
|
| There were a couple of regressions here that were never caught, but the patch set that this prepares for will fix them.
|
| This is the third attempt to land this patch.
|
| Differential Revision: https://reviews.llvm.org/D61150
|
| llvm-svn: 363319
* [InstCombine] add test for failed libfunction prototype matching; NFCSanjay Patel2019-06-131-7/+25
| | | | llvm-svn: 363291
* Fix a bug w/inbounds invalidation in LFTRPhilip Reames2019-06-134-11/+13
| This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.
|
| The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.
|
| As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison; that is about undef. Unfortunately, the two are different. See Nikita's comment for a fuller explanation; he explains it well.
|
| (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)
|
| Differential Revision: https://reviews.llvm.org/D62939
|
| llvm-svn: 363289
* [InstCombine] auto-generate complete test checks; NFCSanjay Patel2019-06-131-23/+20
| | | | llvm-svn: 363286
* [NFC] Updated testcase for D54411/rL363284David Bolvansky2019-06-131-14/+8
| | | | llvm-svn: 363285
* [EarlyCSE] Ensure equal keys have the same hash valueJoseph Tremoulet2019-06-131-13/+40
| Summary:
| The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation.
|
| This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`.
|
| It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion.
|
| Reviewers: spatel, nikic
|
| Reviewed By: spatel, nikic
|
| Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D62644
|
| llvm-svn: 363274
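The invariant described in this message (keys that compare equal must hash equal) can be illustrated with a toy model. This is only a sketch in Python, not the EarlyCSE code: the `Select` tuple, `canonical`, `key_hash`, and `is_equal` are hypothetical names, and only the `sgt`/`slt` min/max case from the message is modeled. The optional `not` is stripped first, then the min/max pattern is matched, so both spellings of `smin X, Y` reach the same canonical key.

```python
from typing import NamedTuple


class Select(NamedTuple):
    """Toy stand-in for `select (cmp pred a, b), t, f` (hypothetical model)."""
    negated: bool  # True when the condition is wrapped in `not`
    pred: str      # "sgt" or "slt" in this sketch
    a: str         # first compare operand
    b: str         # second compare operand
    t: str         # value chosen when the condition is true
    f: str         # value chosen when the condition is false


def canonical(s: Select):
    """Strip the optional `not` first, then look for a min/max pattern."""
    t, f = s.t, s.f
    if s.negated:
        # select (not c), t, f  ==  select c, f, t
        t, f = f, t
    if s.a != s.b and {t, f} == {s.a, s.b} and s.pred in ("sgt", "slt"):
        picks_a = t == s.a
        if s.pred == "sgt":
            kind = "smax" if picks_a else "smin"
        else:
            kind = "smin" if picks_a else "smax"
        return (kind, frozenset((s.a, s.b)))
    # Anything else keeps its full structure as the key.
    return ("select", s.negated, s.pred, s.a, s.b, s.t, s.f)


def key_hash(s: Select) -> int:
    return hash(canonical(s))


def is_equal(x: Select, y: Select) -> bool:
    equal = canonical(x) == canonical(y)
    # Mirrors the idea of the added assertion: equality implies equal hashes.
    assert not equal or key_hash(x) == key_hash(y)
    return equal


with_not = Select(True, "sgt", "X", "Y", "X", "Y")   # select (not (sgt X, Y)), X, Y
plain = Select(False, "sgt", "X", "Y", "Y", "X")     # select (sgt X, Y), Y, X
```

Because hashing and equality both go through the same `canonical` function, they cannot drift apart, which is the property the commit's assertion is meant to guard.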
* Improve reduction intrinsics by overloading result value.Sander de Smalen2019-06-135-13/+13
| This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type.
|
| For example:
|
|   ; The LLVM LangRef specifies that the scalar result must equal the
|   ; vector element type, but this is not checked/enforced by LLVM.
|   declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
|
| This patch changes that into:
|
|   declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)
|
| This makes the type constraint more explicit and causes LLVM to check the result type against the vector element type.
|
| Reviewers: RKSimon, arsenm, rnk, greened, aemerson
|
| Reviewed By: arsenm
|
| Differential Revision: https://reviews.llvm.org/D62996
|
| llvm-svn: 363240
* [ARM][TTI] Scan for existing loop intrinsicsSam Parker2019-06-131-0/+68
| | | | | | | | | TTI should report that it's not profitable to generate a hardware loop if it, or one of its child loops, has already been converted. Differential Revision: https://reviews.llvm.org/D63212 llvm-svn: 363234
* [SimplifyCFG] reverting preliminary Switch patches againShawn Landden2019-06-1319-2582/+1422
| This reverts 363226 and 363227, both NFC intended.
|
| I swear I fixed the test case that is failing, and ran the tests, but I will look into it again.
|
| llvm-svn: 363229
* [SimplifyCFG] NFC, update Switch tests to better examine successive patchesShawn Landden2019-06-1319-1422/+2582
| Also add baseline tests to show the effect of later patches.
|
| There were a couple of regressions here that were never caught, but the patch set that this prepares for will fix them.
|
| Differential Revision: https://reviews.llvm.org/D61150
|
| llvm-svn: 363226
* [SimplifyCFG] revert the last commit.Shawn Landden2019-06-1316-2469/+1308
| | | | | | I ran ALL the test suite locally, so I will look into this... llvm-svn: 363223
* [SimplifyCFG] NFC, update Switch tests to HEAD so I canShawn Landden2019-06-1316-1308/+2469
| see if my changes change anything
|
| Also add baseline tests to show effect of later patches.
|
| Differential Revision: https://reviews.llvm.org/D61150
|
| llvm-svn: 363222
* Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'David L. Jones2019-06-131-44/+0
| We have observed some failures with internal builds with this revision.
|
| - Performance regressions:
|   - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
|   - Benchmarks for Abseil's SwissTable.
| - Correctness:
|   - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).
|
| hwennborg has already been notified, and is aware of reproducer failures.
|
| llvm-svn: 363220
* [SLP] Update propagate_ir_flags.ll test to check that we do retain the common subset, NFC.Dinar Temirbulatov2019-06-131-0/+36
| llvm-svn: 363218
* [Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewerPhilip Reames2019-06-121-0/+158
| | | | llvm-svn: 363217
* [x86] add tests for vector shifts; NFCSanjay Patel2019-06-121-0/+117
| | | | llvm-svn: 363203
* [Tests] Autogen RLEV test and add tests for a future enhancementPhilip Reames2019-06-121-57/+171
| | | | llvm-svn: 363193
* [Tests] Add tests to highlight sibling loop optimization order issue for exit rewritingPhilip Reames2019-06-121-2/+151
| The issue addressed in r363180 is more broadly relevant. For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEVExpander needing to preserve LCSSA, and b) don't iterate between loops.
|
| llvm-svn: 363192
* [SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673Philip Reames2019-06-121-2/+1
| SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is necessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However, it is possible that when this input leaves the loop through an LCSSA Phi, it is a provable constant. Missing that results in an order-of-optimization issue in loop exit value rewriting, where we miss some opportunities based on the order in which we visit sibling loops.
|
| This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similar logic directly in the SCEV construction path itself.
|
| Patch by: mkazantsev (with revised commit message by me)
|
| Differential Revision: https://reviews.llvm.org/D58113
|
| llvm-svn: 363180
* [InstCombine] add tests for fmin/fmax libcalls; NFCSanjay Patel2019-06-121-0/+18
| | | | llvm-svn: 363175
* Revert rL363156.Sam Parker2019-06-126-11/+0
| | | | | | | The patch was to fix buildbots, but rL363157 should now be fixing it in a cleaner way. llvm-svn: 363174
* StackProtector: Use PointerMayBeCapturedMatt Arsenault2019-06-122-0/+142
| This was using its own, outdated list of possible captures. This was at minimum not catching cmpxchg and addrspacecast captures.
|
| One change is now any volatile access is treated as capturing. The test coverage for this pass is quite inadequate, but this required removing volatile in the lifetime capture test.
|
| Also fixes some infrastructure issues to allow running just the IR pass.
|
| Fixes bug 42238.
|
| llvm-svn: 363169
* LoopVersioning: Respect convergentMatt Arsenault2019-06-121-0/+40
| | | | | | | | | This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165
* [InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFCSanjay Patel2019-06-121-0/+132
| | | | llvm-svn: 363163
* LoopLoadElim: Respect convergentMatt Arsenault2019-06-121-0/+51
| | | | llvm-svn: 363162
* LoopDistribute/LAA: Respect convergentMatt Arsenault2019-06-125-1/+416
| This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks.
|
| Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed.
|
| https://reviews.llvm.org/D62607
|
| llvm-svn: 363160
* LoopDistribute/LAA: Add tests to catch regressionsMatt Arsenault2019-06-123-0/+118
| I broke 2 of these with a patch, but they were not covered by existing tests.
|
| https://reviews.llvm.org/D63035
|
| llvm-svn: 363158
* [NFC] Add HardwareLoops lit.local.cfg fileSam Parker2019-06-121-0/+3
| | | | | | | Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't an arm target. llvm-svn: 363157
* Attempt to fix non-Arm buildbotsSam Parker2019-06-126-0/+11
| | | | | | Adding REQUIRES: arm to failing tests llvm-svn: 363156
* [ARM] Implement TTI::isHardwareLoopProfitableSam Parker2019-06-126-0/+1132
| Implement the backend target hook to drive the HardwareLoops pass. The low-overhead branch extension for Arm M-class cores is flexible enough that we don't have to ensure correctness at this point, except checking that the loop counter variable can be stored in LR - a 32-bit register. For it to be profitable, we want to avoid loops that contain function calls, or any other instruction that alters the PC.
|
| This implementation uses TargetLoweringInfo to query type and operation actions, looks at intrinsic calls and also performs some manual checks for remainder/division and FP operations.
|
| I think this should be a good base to start and extra details can be filled out later.
|
| Differential Revision: https://reviews.llvm.org/D62907
|
| llvm-svn: 363149
* Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"Orlando Cazalet-Hyams2019-06-1211-368/+56
| This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7.
|
| See phabricator thread for D60831.
|
| llvm-svn: 363132
* Generalize icmp matching in IndVars' eliminateTruncPhilip Reames2019-06-111-0/+104
| | | | | | | | We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108
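The asymmetry described here (only a loop-invariant RHS was matched) can be sketched with a toy model. One way to accept either operand order is to swap the operands together with the predicate. The commit message does not say whether the patch swaps or matches both forms directly, so this Python sketch is an assumption for illustration only; `canonicalize`, `evaluate`, and the tuple representation are hypothetical names, not IndVars code.

```python
# Swapping the operands of an integer compare requires swapping the predicate.
SWAPPED = {
    "eq": "eq", "ne": "ne",
    "slt": "sgt", "sgt": "slt",
    "sle": "sge", "sge": "sle",
}

# Reference semantics for the signed predicates used in this sketch.
EVAL = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "slt": lambda a, b: a < b,
    "sgt": lambda a, b: a > b,
    "sle": lambda a, b: a <= b,
    "sge": lambda a, b: a >= b,
}


def canonicalize(pred, lhs, rhs, invariant):
    """Put the loop-invariant operand on the RHS, adjusting the predicate."""
    if lhs in invariant and rhs not in invariant:
        return SWAPPED[pred], rhs, lhs
    return pred, lhs, rhs


def evaluate(pred, lhs, rhs, env):
    """Evaluate a compare given concrete values for the named operands."""
    return EVAL[pred](env[lhs], env[rhs])
```

For example, `canonicalize("slt", "n", "iv", {"n"})` yields `("sgt", "iv", "n")`, which evaluates to the same truth value as the original compare for any values of `n` and `iv`.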
* [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZCameron McInally2019-06-111-3/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082
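The NSZ (no signed zeros) requirement in the title comes from the one case where -(X-Y) and (Y-X) differ: when X equals Y, the first form produces -0.0 and the second +0.0. A small Python sketch using ordinary IEEE-754 doubles shows this; the function names are hypothetical and this is not InstCombine code.

```python
import math


def neg_of_sub(x: float, y: float) -> float:
    """The original form: -(X - Y)."""
    return -(x - y)


def swapped_sub(x: float, y: float) -> float:
    """The folded form: Y - X."""
    return y - x


def sign_bit(v: float) -> bool:
    """True when the sign bit is set, which distinguishes -0.0 from +0.0."""
    return math.copysign(1.0, v) < 0
```

With X = Y = 1.0, `neg_of_sub` returns -0.0 while `swapped_sub` returns +0.0. The two compare equal numerically and differ only in the sign bit, which is exactly what NSZ allows the optimizer to ignore.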
* [InstCombine] Update fptrunc (fneg x) -> fneg (fptrunc x) for unary FNegCameron McInally2019-06-113-17/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completionOrlando Cazalet-Hyams2019-06-1111-56/+368
| Summary:
| Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
|
| The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
|
| A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
| B) Instructions in the middle block have different line numbers which give the impression of another iteration.
|
| In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
|
| I have set up a separate review D61933 for a fix which is required for this patch.
|
| Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse
|
| Reviewed By: hfinkel, jmorse
|
| Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
|
| Tags: #llvm, #debug-info
|
| Differential Revision: https://reviews.llvm.org/D60831
|
| llvm-svn: 363046
* AtomicExpand: Don't crash on non-0 allocaMatt Arsenault2019-06-111-0/+37
| This now produces garbage on AMDGPU with a call to a nonexistent, anonymous libcall but won't assert.
|
| llvm-svn: 363022
* AMDGPU: Expand < 32-bit atomicsMatt Arsenault2019-06-113-45/+422
| | | | | | Also fix AtomicExpand asserting on atomicrmw fadd/fsub. llvm-svn: 363021
* [Tests] Adjust LFTR dead-iv tests to bypass undef casesPhilip Reames2019-06-101-23/+17
| | | | | | As pointed out by Nikita in review, undef and poison need to be handled separately. Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check. llvm-svn: 363002
* [PGO] Handle cases of non-instrument BBsRong Xu2019-06-104-5/+106
| | | | | | | | | | | As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995
* [Tests] Split an LFTR dead-iv casePhilip Reames2019-06-101-2/+33
| There are two interesting sub-cases here: 1) switching IVs is legal, but only in pre-increment form, and 2) switching IVs is legal, and so is post-increment form.
|
| llvm-svn: 362993
* [Tests] Add tests for D62939 (miscompiles around dead pointer IVs)Philip Reames2019-06-101-0/+228
| | | | | | Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning. llvm-svn: 362976
* [LFTR] Use recomputed BE countPhilip Reames2019-06-101-10/+10
| This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivalent codegen, that seems to be a reasonable assertion.
|
| This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.
|
| For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
| BEFORE: (-2 + (-1 * %innercount) + %limit)
| AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)
|
| Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32.
|
| llvm-svn: 362975
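The closing remark (truncating the i64 count produces the i32 count) can be checked with plain modular arithmetic. The Python sketch below models i32 and i64 values as unsigned integers masked to the respective widths. The helper names are hypothetical, and the `<nsw>` flag is ignored because it does not change the computed value when no overflow occurs.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext_i32_to_i64(v: int) -> int:
    """Sign-extend a 32-bit value to 64 bits (both held as unsigned ints)."""
    v &= MASK32
    if v & (1 << 31):
        v -= 1 << 32
    return v & MASK64


def be_count_before(limit: int, innercount: int) -> int:
    """BEFORE: (-2 + (-1 * %innercount) + %limit), computed in i32."""
    return (-2 + (-1 * innercount) + limit) & MASK32


def be_count_after(limit: int, innercount: int) -> int:
    """AFTER: (-1 + sext(-1 + %limit) + (-1 * sext(%innercount))), in i64."""
    widened_limit = sext_i32_to_i64((-1 + limit) & MASK32)
    widened_inner = sext_i32_to_i64(innercount)
    return (-1 + widened_limit + (-1 * widened_inner)) & MASK64


def trunc_i64_to_i32(v: int) -> int:
    return v & MASK32
```

Sign extension leaves the low 32 bits unchanged, so both expressions reduce to `limit - 2 - innercount` modulo 2^32, and the truncated AFTER value matches BEFORE for every input.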
* [InstCombine] allow unordered preds when canonicalizing to fabs()Sanjay Patel2019-06-101-8/+4
| | | | | | | | | | We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954
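The claim that an unordered predicate matches its ordered sibling once NaN is ruled out can be illustrated with Python floats, which follow the same IEEE-754 comparison rules. This is only a sketch: `olt` and `ult` are hypothetical helpers modeling the `fcmp olt` and `fcmp ult` predicates, not InstCombine code.

```python
import math


def olt(a: float, b: float) -> bool:
    """Ordered less-than: false if either operand is NaN."""
    return a < b


def ult(a: float, b: float) -> bool:
    """Unordered less-than: true if either operand is NaN, else a < b."""
    return not (a >= b)


# Sample values with no NaN, standing in for a value known never to be NaN.
NON_NAN = [-math.inf, -2.5, -0.0, 0.0, 1.0, math.inf]
```

For every pair drawn from `NON_NAN` the two predicates agree. They diverge only when a NaN is present, for example `olt(math.nan, 1.0)` is false while `ult(math.nan, 1.0)` is true.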