path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
...
* [LoopInterchange] Add test case for D43236. | Florian Hahn | 2018-02-26 | 1 | -0/+44
    llvm-svn: 326078
* [InstSimplify] Add test cases for removal of vector fabs on known positive. | Craig Topper | 2018-02-25 | 1 | -0/+118
    llvm-svn: 326050
* [InstSimplify] Remove unused parameter from test cases. | Craig Topper | 2018-02-25 | 1 | -7/+7
    llvm-svn: 326049
* Revert "StructurizeCFG: Test for branch divergence correctly" | Adam Nemet | 2018-02-24 | 1 | -82/+0
    This reverts commit r325881. Breaks many bots
    llvm-svn: 326037
* [InstSimplify] sqrt(X) * sqrt(X) --> X | Sanjay Patel | 2018-02-23 | 2 | -11/+15
    This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.
    llvm-svn: 325965
* [InstCombine] allow fmul-sqrt folds with less than full -ffast-math | Sanjay Patel | 2018-02-23 | 1 | -21/+27
    Also, add a Builder method for intrinsics to reduce code duplication for clients.
    llvm-svn: 325960
* [Test] Fix the test to output to /dev/null instead of redirecting. | Matt Davis | 2018-02-23 | 1 | -1/+1
    The redirection was confusing the windows build machine.
    llvm-svn: 325937
* [Debug] Add dbg.value intrinsics for PHIs created during LCSSA. | Matt Davis | 2018-02-23 | 2 | -4/+135
    Summary:
    This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>.

    Specifically this case:
    ```
    int bar(int x, int y) {
      return x + y;
    }

    int foo(int size) {
      int val = 0;
      for (int i = 0; i < size; ++i) {
        val = bar(val, i);  // Both val and i are correct
      }
      return val; // <optimized out>
    }
    ```

    In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this.

    This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

    Reviewers: mzolotukhin, aprantl, vsk, davide
    Reviewed By: aprantl, vsk
    Subscribers: dberlin, llvm-commits
    Differential Revision: https://reviews.llvm.org/D42551
    llvm-svn: 325926
* StructurizeCFG: Test for branch divergence correctly | Nicolai Haehnle | 2018-02-23 | 1 | -0/+82
    Summary:
    This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform.

    This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend.

    Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
    Reviewers: arsenm, rampitec, jlebar
    Subscribers: wdng, tpr, llvm-commits
    Differential Revision: https://reviews.llvm.org/D40546
    llvm-svn: 325881
* Mark MergedLoadStoreMotion as not preserving MemDep results | Bjorn Steinbrink | 2018-02-23 | 1 | -0/+22
    Summary:
    MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries.

    Unfortunately, when MLSM sinks a store, that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers.

    The easiest way out here is to just say that MLSM does indeed not preserve MemDep results.

    Reviewers: davide, Gerolf
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D43177
    llvm-svn: 325880
* [AlignmentFromAssumptions] Set source and dest alignments of memory intrinsics separately | Daniel Neilson | 2018-02-22 | 2 | -2/+2
    Summary:
    This change is part of step five in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment() API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately.

    Steps:
    Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
    Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 )
    Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
    Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
    Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 )
    Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.

    Reference
    http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
    http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

    Reviewers: hfinkel, bollu, reames
    Reviewed By: reames
    Subscribers: reames, llvm-commits
    Differential Revision: https://reviews.llvm.org/D43081
    llvm-svn: 325816
* [FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions | Luke Cheeseman | 2018-02-22 | 3 | -0/+71
    - Fix for bug 36078.
    - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions.
    - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc.
    llvm-svn: 325788
* [InstCombine] add fmul multi-use test; NFC | Sanjay Patel | 2018-02-22 | 1 | -14/+33
    Also, rename tests to make their intent clearer.
    llvm-svn: 325785
* [SLPVectorizer][X86] Add load extend tests (PR36091) | Simon Pilgrim | 2018-02-22 | 2 | -0/+1696
    llvm-svn: 325772
* [InstCombine] add some random FMF to tests so we know it's not dropped; NFC | Sanjay Patel | 2018-02-21 | 1 | -8/+8
    llvm-svn: 325734
* [AArch64] fix IR names to not be 'tmp' because that gives the CHECK script problems | Sanjay Patel | 2018-02-21 | 1 | -40/+40
    llvm-svn: 325718
* [AArch64] add SLP test for matmul (PR36280); NFC | Sanjay Patel | 2018-02-21 | 1 | -0/+139
    This is a slight reduction of one of the benchmarks that suffered with D43079. Cost model changes should not cause this test to remain scalarized.
    llvm-svn: 325717
* [LV] Fix test checks, NFC | Alexey Bataev | 2018-02-21 | 1 | -76/+2363
    llvm-svn: 325699
* [SLP] Fix test checks, NFC. | Alexey Bataev | 2018-02-21 | 1 | -15/+30
    llvm-svn: 325689
* [SCEV] Temporarily disable loop versioning for the purpose | Silviu Baranga | 2018-02-21 | 3 | -4/+4
    of turning SCEVUnknowns of PHIs into AddRecExprs.
    This feature is now hidden behind the -scev-version-unknown flag.
    Fixes PR36032 and PR35432.
    llvm-svn: 325687
* [BDCE] Salvage debug info from dying insts | Vedant Kumar | 2018-02-21 | 1 | -0/+11
    This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section.
    llvm-svn: 325660
* revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280) | Sanjay Patel | 2018-02-21 | 7 | -123/+183
    There are too many perf regressions resulting from this, so we need to investigate (and add tests for) targets like ARM and AArch64 before trying to reinstate.
    llvm-svn: 325658
* [InstCombine] C / -X --> -C / X | Sanjay Patel | 2018-02-21 | 1 | -4/+2
    We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. This is similar to rL325648.
    llvm-svn: 325649
* [InstCombine] -X / C --> X / -C for FP | Sanjay Patel | 2018-02-20 | 1 | -10/+8
    We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR.
    llvm-svn: 325648
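    The two fdiv folds listed here (C / -X --> -C / X and -X / C --> X / -C) can be checked bit for bit in plain C. The sketch below leans on two IEEE 754 facts: negation is exact, and the sign of a quotient is the XOR of the operand signs. The helper names are invented for illustration and are not LLVM code.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    /* Raw bit pattern of a double, so that +0.0 and -0.0 compare as different. */
    static uint64_t bits_of(double d) {
        uint64_t u;
        memcpy(&u, &d, sizeof u);
        return u;
    }

    /* -X / C versus X / -C. volatile keeps the compiler from folding the check away. */
    static int neg_dividend_fold_matches(double x, double c) {
        volatile double before = -x / c;
        volatile double after = x / -c;
        return bits_of(before) == bits_of(after);
    }

    /* C / -X versus -C / X. */
    static int neg_divisor_fold_matches(double x, double c) {
        volatile double before = c / -x;
        volatile double after = -c / x;
        return bits_of(before) == bits_of(after);
    }

    /* Checks both folds on a few samples, including zero and an overflowing quotient.
       The constants are never zero, so no NaN (whose sign bit is unspecified) appears. */
    static void check_fdiv_negation_folds(void) {
        const double xs[] = {1.5, -3.0, 0.0, 1e308};
        const double cs[] = {3.0, -7.0, 0.1};
        for (size_t i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
            for (size_t j = 0; j < sizeof cs / sizeof cs[0]; ++j) {
                assert(neg_dividend_fold_matches(xs[i], cs[j]));
                assert(neg_divisor_fold_matches(xs[i], cs[j]));
            }
        }
    }
    ```

    Calling check_fdiv_negation_folds() from a test driver exercises both rewrites. Comparing bit patterns rather than using == matters because == treats +0.0 and -0.0 as equal.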
* [InstCombine] add tests for fdiv with negated op and constant op; NFC | Sanjay Patel | 2018-02-20 | 1 | -0/+44
    llvm-svn: 325644
* [PatternMatch] allow vector matches with m_FNeg | Sanjay Patel | 2018-02-20 | 2 | -10/+3
    llvm-svn: 325642
* [DSE] Don't DSE stores that subsequent memmove calls read from | Sanjoy Das | 2018-02-20 | 1 | -0/+51
    Summary:
    We used to remove the first memmove in cases like this:

      memmove(p, p+2, 8);
      memmove(p, p+2, 8);

    which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior.

    Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728.

    Reviewers: rsmith
    Subscribers: mcrosier, llvm-commits, jlebar
    Differential Revision: https://reviews.llvm.org/D43425
    llvm-svn: 325641
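    To see why the first memmove in that pair is not a dead store, the following C sketch runs both versions on a 12-byte buffer and compares the results. The second call reads bytes p[2..9], and the first call has already overwritten p[2..7], so dropping the first call changes what the second one copies. The buffer size and helper names are chosen for illustration only.

    ```c
    #include <assert.h>
    #include <string.h>

    enum { BUF_LEN = 12 };

    /* Fill buf with 0, 1, ..., BUF_LEN - 1. */
    static void fill_sequence(unsigned char *buf) {
        for (int i = 0; i < BUF_LEN; ++i)
            buf[i] = (unsigned char)i;
    }

    /* The original program: two identical overlapping moves. */
    static void run_both_moves(unsigned char *p) {
        memmove(p, p + 2, 8);
        memmove(p, p + 2, 8);
    }

    /* The program after the first move is wrongly treated as dead. */
    static void run_second_move_only(unsigned char *p) {
        memmove(p, p + 2, 8);
    }

    /* Returns nonzero when deleting the first memmove changes the final buffer. */
    static int removal_changes_result(void) {
        unsigned char a[BUF_LEN], b[BUF_LEN];
        fill_sequence(a);
        fill_sequence(b);
        run_both_moves(a);
        run_second_move_only(b);
        return memcmp(a, b, BUF_LEN) != 0;
    }

    /* Sanity check: the first move is observable, so it must be kept. */
    static void check_first_move_is_live(void) {
        assert(removal_changes_result());
    }
    ```

    Starting from 0..11, both moves leave the buffer as 4 5 6 7 8 9 8 9 8 9 10 11, while the second move alone leaves 2 3 4 5 6 7 8 9 8 9 10 11.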
* [InstCombine] auto-generate full checks; NFC | Sanjay Patel | 2018-02-20 | 1 | -41/+47
    llvm-svn: 325639
* [InstCombine] add test for vector -X/-Y; NFC | Sanjay Patel | 2018-02-20 | 1 | -4/+17
    m_FNeg doesn't match vector types.
    llvm-svn: 325637
* Fix broken test from r325630. | Benjamin Kramer | 2018-02-20 | 1 | -1/+1
    llvm-svn: 325634
* [MemoryBuiltins] Check nobuiltin status when identifying calls to free. | Benjamin Kramer | 2018-02-20 | 1 | -1/+21
    This is usually not a problem because this code's main purpose is eliminating unused new/delete pairs. We got deletes of nullptr or nobuiltin deletes of builtin new wrong though.
    llvm-svn: 325630
* [PatternMatch] enhance m_SignMask() to ignore undef elements in vectors | Sanjay Patel | 2018-02-20 | 1 | -6/+2
    llvm-svn: 325623
* [InstSimplify] add tests for m_SignMask with undef vector elements; NFC | Sanjay Patel | 2018-02-20 | 1 | -2/+28
    llvm-svn: 325622
* [LV] Fix test checks, NFC. | Alexey Bataev | 2018-02-20 | 2 | -140/+3506
    llvm-svn: 325617
* [SLP] Fix tests checks, NFC. | Alexey Bataev | 2018-02-20 | 5 | -74/+249
    llvm-svn: 325605
* [InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C) | Sanjay Patel | 2018-02-20 | 2 | -4/+16
    llvm-svn: 325590
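    The X/C --> X * (1/C) rewrite is not bit-identical in general, because 1/C is rounded before the multiply; this is why a reciprocal fold of this kind depends on relaxed floating-point flags. A minimal C sketch of the difference follows, assuming IEEE 754 double arithmetic. The helper name is hypothetical.

    ```c
    #include <assert.h>

    /* Returns nonzero when x / c and x * (1 / c) yield the same double.
       volatile forces each intermediate to be rounded to double and keeps
       the compiler from simplifying the comparison. */
    static int recip_fold_matches(double x, double c) {
        volatile double quotient = x / c;
        volatile double recip = 1.0 / c;
        volatile double product = x * recip;
        return quotient == product;
    }

    /* Power-of-two divisors have an exact reciprocal, so the fold is exact there.
       For c = 49 the rounded reciprocal is slightly too small and the product
       rounds to the double just below 1.0. */
    static void check_recip_fold(void) {
        assert(recip_fold_matches(3.0, 4.0));
        assert(!recip_fold_matches(49.0, 49.0));
    }
    ```

    The divisor 49 is a classic small case where the two forms disagree; 3, 5 and 10 happen to round back to the exact quotient when the dividend equals the divisor.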
* [InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math | Sanjay Patel | 2018-02-19 | 1 | -6/+29
    It's possible that we could allow this with either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math.

    The last test is changed to show a case that is expected to fold, but we need D43398.
    llvm-svn: 325533
* [InstCombine] move fdiv tests; NFC | Sanjay Patel | 2018-02-19 | 2 | -33/+37
    Also, use vector constants just to prove that already works.
    llvm-svn: 325530
* [TTI CostModel] change default cost of FP ops to 1 (PR36280) | Sanjay Patel | 2018-02-19 | 7 | -183/+123
    This change was mentioned at least as far back as:
    https://bugs.llvm.org/show_bug.cgi?id=26837#c26
    ...and I found a real program that is harmed by this:
    Himeno running on AMD Jaguar gets 6% slower with SLP vectorization:
    https://bugs.llvm.org/show_bug.cgi?id=36280
    ...but the change here appears to solve that bug only accidentally.

    The div/rem costs for x86 look very wrong in some cases, but that's already true, so we can fix those in follow-up patches. There's also evidence that more cost model changes are needed to solve SLP problems as shown in D42981, but that's an independent problem (though the solution may be adjusted after this change is made).

    Differential Revision: https://reviews.llvm.org/D43079
    llvm-svn: 325515
* [Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics | Ivan A. Kosarev | 2018-02-19 | 1 | -0/+53
    With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fall back to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata.

    Differential Revision: https://reviews.llvm.org/D41543
    llvm-svn: 325488
* [PatternMatch, InstSimplify] enhance m_AllOnes() to ignore undef elements in vectors | Sanjay Patel | 2018-02-18 | 2 | -4/+2
    Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not assume that because an operand constant matches that it's safe to return it as a result). So I'm making that change here too (that diff could be independent, but I'm not sure how to reveal it before the matcher change).

    This also seems like a good reason to *not* include matchers that capture the value. We don't want to encourage the potential misstep of propagating undef values when it's not allowed/intended. I didn't include the capture variant option here or in the related rL325437 (m_One), but it already exists for other constant matchers.
    llvm-svn: 325466
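    The InstSimplify pitfall described in this commit can be illustrated with a toy model in C. This is not LLVM's PatternMatch API: ToyVec, the lane count, and every function name are assumptions made for the sketch, as is the rule that at least one lane must be defined. The point is that once a matcher accepts undef lanes, returning the matched operand as the result would carry those undef lanes along, so a fold should build a fresh, fully defined constant instead.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    enum { LANES = 4 };

    /* Toy constant vector: each lane is either an 8-bit value or undef. */
    typedef struct {
        uint8_t value[LANES];
        bool is_undef[LANES];
    } ToyVec;

    /* Loosened matcher: every defined lane must be all-ones; undef lanes are ignored.
       Requiring at least one defined lane is an assumption of this sketch. */
    static bool toy_match_all_ones(const ToyVec *v) {
        bool saw_defined = false;
        for (int i = 0; i < LANES; ++i) {
            if (v->is_undef[i])
                continue;
            if (v->value[i] != 0xFF)
                return false;
            saw_defined = true;
        }
        return saw_defined;
    }

    /* True if any lane is undef. */
    static bool toy_has_undef(const ToyVec *v) {
        for (int i = 0; i < LANES; ++i)
            if (v->is_undef[i])
                return true;
        return false;
    }

    /* Safe result for a fold such as "X | all-ones": a fresh vector with every lane defined. */
    static ToyVec toy_fresh_all_ones(void) {
        ToyVec r;
        for (int i = 0; i < LANES; ++i) {
            r.value[i] = 0xFF;
            r.is_undef[i] = false;
        }
        return r;
    }

    /* The matched operand may contain undef lanes; the fresh constant never does. */
    static void check_fold_result_is_defined(void) {
        ToyVec operand = {{0xFF, 0x00, 0xFF, 0xFF}, {false, true, false, false}};
        assert(toy_match_all_ones(&operand));
        assert(toy_has_undef(&operand));
        ToyVec result = toy_fresh_all_ones();
        assert(!toy_has_undef(&result));
    }
    ```

    A matcher that also captured and handed back the operand would make it easy to return the version with the undef lane, which is the misstep the commit message warns about.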
* [InstSimplify] add tests with vector undef elts; NFC | Sanjay Patel | 2018-02-18 | 2 | -7/+18
    llvm-svn: 325465
* [PatternMatch] enhance m_One() to ignore undef elements in vectors | Sanjay Patel | 2018-02-17 | 3 | -9/+7
    llvm-svn: 325437
* [InstSimplify, InstCombine] add tests with vector undef elts; NFC | Sanjay Patel | 2018-02-17 | 2 | -6/+43
    These would fold if the m_One pattern matcher accounted for undef elts.
    llvm-svn: 325436
* [InstSimplify] add vector select tests with undef elts in condition; NFC | Sanjay Patel | 2018-02-17 | 1 | -0/+20
    llvm-svn: 325419
* [InstCombine] add FMF to better show current fdiv fold behavior; NFC | Sanjay Patel | 2018-02-16 | 1 | -4/+4
    llvm-svn: 325365
* [ThinLTO] Fix data race in test #2 | Eugene Leviant | 2018-02-16 | 1 | -1/+1
    Switched to the right option (-thinlto-threads)
    llvm-svn: 325362
* [ThinLTO] Fix data race in test | Eugene Leviant | 2018-02-16 | 1 | -1/+1
    llvm-svn: 325361
* [JumpThreading] PR36133 enable/disable DominatorTree for LVI analysis | Brian M. Rzycki | 2018-02-16 | 1 | -0/+44
    Summary:
    The LazyValueInfo pass caches a copy of the DominatorTree when available. Whenever there are pending DominatorTree updates within JumpThreading's DeferredDominance object we cannot use the cached DT for LVI analysis. This commit adds the new methods enableDT() and disableDT() to LVI. JumpThreading also sets the appropriate usage model before calling LVI analysis methods.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=36133

    Reviewers: sebpop, dberlin, kuhar
    Reviewed by: sebpop, kuhar
    Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov
    Differential Revision: https://reviews.llvm.org/D42717
    llvm-svn: 325356
* [Transforms] Propagate TBAA info in SROA | Ivan A. Kosarev | 2018-02-16 | 1 | -151/+274
    Now that we have the new TBAA metadata format that is capable of representing accesses to aggregates, we can propagate TBAA access tags from memory setting and transferring intrinsics to load and store instructions and vice versa.

    Since SROA produces lots of new loads and stores on optimized builds, this change significantly decreases the share of undecorated memory accesses on such builds.

    Differential Revision: https://reviews.llvm.org/D41563
    llvm-svn: 325329