summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] add tests to show type limitations of InsertRangeTest and callersSanjay Patel2016-08-303-3/+56
| | | | llvm-svn: 280175
* [LoopVectorizer] Predicate instructions in blocks with several incoming edgesMichael Kuperstein2016-08-302-4/+62
| | | | | | | | | | We don't need to limit predication to blocks that have a single incoming edge, we just need to use the right mask. This fixes PR30172. Differential Revision: https://reviews.llvm.org/D24009 llvm-svn: 280148
* IntrArgMemOnly is only defined (and current AA machinery only sanely ↵Daniel Berlin2016-08-301-0/+42
| | | | | | supports) pointer arguments, and these intrinsics have vector of pointer arguments. Remove ArgMemOnly until we either have the machinery, define a new attribute, or something similar llvm-svn: 280143
* [InstCombine] replace divide-by-constant checks with asserts; NFCSanjay Patel2016-08-301-1/+16
| | | | | | | These folds already have tests for scalar and vector types, except for the vector div-by-0 case, so I'm adding tests for that. llvm-svn: 280115
* [SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEndJames Molloy2016-08-301-0/+37
| | | | | | This was missing, meaning the metadata in sunk instructions was potentially bogus and could cause miscompiles. llvm-svn: 280072
* [ThinLTO] Indirect call promotion fixes for promoted local functionsTeresa Johnson2016-08-292-3/+19
| | | | | | | | | | | | | | | | | | | Summary: Fix a couple issues limiting the application of indirect call promotion in ThinLTO mode: - Invoke indirect call promotion before globalopt, since it may eliminate imported functions which appear unreferenced. - Invoke indirect call promotion with InLTO=true so that the PGOFuncName metadata is used to get the name for locals which would have been renamed during promotion. Reviewers: davidxl, mehdi_amini Subscribers: Prazek, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24004 llvm-svn: 280024
* [LV] Move insertelement sequence after scalar definitionsMatthew Simpson2016-08-292-16/+58
| | | | | | | | | | | | | | After r279649 when getting a vector value from VectorLoopValueMap, we create an insertelement sequence on-demand if the value has been scalarized instead of vectorized. We previously inserted this insertelement sequence before the value's first vector user. However, this insert location is problematic if that user is the phi node of a first-order recurrence. With this patch, we move the insertelement sequence after the last scalar instruction we created when scalarizing the value. Thus, the value's vector definition in the new loop will immediately follow its scalar definitions. This should fix PR30183. Reference: https://llvm.org/bugs/show_bug.cgi?id=30183 llvm-svn: 280001
* [SimplifyCFG] Hoisting invalidates metadataDavid Majnemer2016-08-291-0/+31
| | | | | | | | | We forgot to remove optimization metadata when performing hosting during FoldTwoEntryPHINode. This fixes PR29163. llvm-svn: 279980
* [StatepointsForGC] Rematerialize in the presence of PHIsAnna Thomas2016-08-291-0/+37
| | | | | | | | | | | | | | | | Summary: While walking the use chain for identifying rematerializable values in RS4GC, add the case where the current value and base value are the same PHI nodes. This will aid rematerialization of geps and casts instead of relocating. Reviewers: sanjoy, reames, igor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23920 llvm-svn: 279975
* [Constant] remove fdiv and frem from canTrap()Sanjay Patel2016-08-291-6/+3
| | | | | | | | | | | Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env. This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in the backend after: https://reviews.llvm.org/rL279970 llvm-svn: 279973
* [SimplifyCFG] rename test file, regenerate checks, and add testSanjay Patel2016-08-292-41/+70
| | | | | | | The fdiv test shows a problem similar to: https://reviews.llvm.org/rL279970 llvm-svn: 279972
* [Coroutines] Part 9: Add cleanup subfunction.Gor Nishanov2016-08-299-39/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [Coroutines] Part 9: Add cleanup subfunction. This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex3.ll) Intrinsic Changes: * coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine. * coro.id gets an extra parameter that points back to a coroutine function. This allows to check whether a coro.id describes the enclosing function or it belongs to a different function that was later inlined. CoroSplit now creates three subfunctions: # f$resume - resume logic # f$destroy - cleanup logic, followed by a deallocation code # f$cleanup - just the cleanup code CoroElide pass during devirtualization replaces coro.destroy with either f$destroy or f$cleanup depending whether heap elision is performed or not. Other fixes, improvements: * Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point. * Switched to using variable width suspend index field (no longer limited to 32 bit index field can be as little as i1 or as large as i<whatever-size_t-is>) Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23844 llvm-svn: 279971
* [InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat ↵Sanjay Patel2016-08-284-18/+8
| | | | | | constant vectors llvm-svn: 279937
* [Loop Vectorizer] Fixed memory confilict checks.Elena Demikhovsky2016-08-283-14/+14
| | | | | | | | | Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930
* [Inliner] Report when inlining fails because callee's def is unavailableAdam Nemet2016-08-261-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is obviously an interesting case because it may motivate code restructuring or LTO. Reporting this requires instantiation of ORE in the loop where the call sites are first gathered. I've checked compile-time overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in mcf and quickly tailing off. As before without -Rpass-with-hotness there is no overhead. Because this could be a pretty noisy diagnostics, it is currently qualified as 'verbose'. As of this patch, 'verbose' diagnostics are only emitted with -Rpass-with-hotness, i.e. when the output is expected to be filtered. Reviewers: eraman, chandlerc, davidxl, hfinkel Subscribers: tejohnson, Prazek, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D23415 llvm-svn: 279860
* [MemCpy] Add comments for r279769Tim Shen2016-08-251-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279778
* [MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the ↵Tim Shen2016-08-251-0/+38
| | | | | | | | | | | | | | | identity of two operands Summary: This fixes pr29105. The reason is that lifetime marks creates new aliasing pointers the original ones, but before this patch aliases were not checked in performMemCpyToMemSetOptzn. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279769
* GVN-hoist: fix hoistingFromAllPaths for loops (PR29034)Sebastian Pop2016-08-252-0/+173
| | | | | | | | | | | | | | | | It is invalid to hoist stores or loads if they are not executed on all paths from the hoisting point to the exit of the function. In the testcase, there are paths in the loop that do not execute the stores or the loads, and so hoisting them within the loop is unsafe. The problem is that the current implementation of hoistingFromAllPaths is incomplete: it walks all blocks dominated by the hoisting point, and does not return false when the loop contains a path on which the hoisted ld/st is not executed. Differential Revision: https://reviews.llvm.org/D23843 llvm-svn: 279732
* [Profile] Propagate branch metadata properly in instcombineXinliang David Li2016-08-251-0/+135
| | | | | | Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693
* [InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat ↵Sanjay Patel2016-08-244-22/+32
| | | | | | constant vectors llvm-svn: 279677
* [LV] Unify vector and scalar mapsMatthew Simpson2016-08-244-89/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649
* [SCCP] Don't delete side-effecting instructionsSanjoy Das2016-08-241-5/+15
| | | | | | | | I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647
* [Loop Vectorizer] Support predication of div/remGil Rapaport2016-08-243-27/+267
| | | | | | | | | | div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620
* [Coroutines] Part 8: Coroutine Frame Building algorithmGor Nishanov2016-08-243-1/+114
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds coroutine frame building algorithm. Now, simple coroutines such as ex0.ll and ex1.ll (first examples from docs\Coroutines.rst can be compiled). Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions. (https://reviews.llvm.org/D23461) 8. Coroutine Frame Building algorithm <= we are here 9. Add f.cleanup subfunction. 10+. The rest of the logic Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23586 llvm-svn: 279609
* [SLP] Avoid signed integer overflowMatthew Simpson2016-08-231-0/+4
| | | | | | | | | | | | | | | | | | | The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562
* [InstSimplify] allow icmp with constant folds for splat vectors, part 2Sanjay Patel2016-08-231-60/+20
| | | | | | | | | | | | Completes the m_APInt changes for simplifyICmpWithConstant(). Other commits in this series: https://reviews.llvm.org/rL279492 https://reviews.llvm.org/rL279530 https://reviews.llvm.org/rL279534 https://reviews.llvm.org/rL279538 llvm-svn: 279543
* [InstSimplify] allow icmp with constant folds for splat vectors, part 1Sanjay Patel2016-08-231-4/+2
| | | | llvm-svn: 279538
* [InstSimplify] add tests to show missing vector icmp foldsSanjay Patel2016-08-231-0/+238
| | | | llvm-svn: 279534
* [InstSimplify] move icmp with constant tests to another file; NFCSanjay Patel2016-08-232-165/+222
| | | | | | | | | | | ...because like the corresponding code, this is just too big to keep adding to. And the next step is to add a vector version of each of these tests to show missed folds. Also, auto-generate CHECK lines and add comments for the tests that correspond to the source code. llvm-svn: 279530
* [InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat ↵Sanjay Patel2016-08-221-4/+2
| | | | | | constant vectors llvm-svn: 279472
* [SimplifyCFG] Rewrite SinkThenElseCodeToEndJames Molloy2016-08-222-1/+231
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return *b += 3; else return *b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460
* [InstCombine] Allow sinking from unique predecessor with multiple edgesJun Bum Lim2016-08-221-0/+23
| | | | | | | | | | | | Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"James Molloy2016-08-222-231/+1
| | | | | | This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447
* [SimplifyCFG] Rewrite SinkThenElseCodeToEndJames Molloy2016-08-222-1/+231
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return *b += 3; else return *b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443
* Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on ↵Artur Pilipenko2016-08-221-1/+1
| | | | | | | | | | Non-negative Integers This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See https://reviews.llvm.org/D18777 for details. llvm-svn: 279433
* Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV ↵Artur Pilipenko2016-08-221-82/+0
| | | | | | | | | | is known to be non-negative This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432
* [PM] Port LoopDataPrefetch AArch64 tests to new pass managerBalaram Makam2016-08-224-0/+12
| | | | | | | | | | Reviewers: mcrosier, tejohnson Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23724 llvm-svn: 279431
* [InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat ↵Sanjay Patel2016-08-211-3/+2
| | | | | | | | | | | constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401
* remove FIXME comment; fixed by previous commitSanjay Patel2016-08-211-1/+0
| | | | llvm-svn: 279400
* [InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat ↵Sanjay Patel2016-08-211-2/+2
| | | | | | | | constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399
* [InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat ↵Sanjay Patel2016-08-211-3/+1
| | | | | | | | constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398
* Reapply "[SLP] Initialize VectorizedValue when gathering"Matthew Simpson2016-08-201-0/+87
| | | | | | | | | | | The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370
* Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot.Vitaly Buka2016-08-201-95/+0
| | | | | | | | This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363
* [Profile] add test with large countsXinliang David Li2016-08-202-0/+15
| | | | llvm-svn: 279361
* [InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat ↵Sanjay Patel2016-08-192-16/+20
| | | | | | | | | constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"Reid Kleckner2016-08-192-203/+1
| | | | | | | This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313
* Fix regression in InstCombine introduced by r278944Reid Kleckner2016-08-191-0/+13
| | | | | | | | | | | The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271
* [CloneFunction] Don't remove unrelated nodes from the CGSSCDavid Majnemer2016-08-191-2/+24
| | | | | | | | CGSCC use a WeakVH to track call sites. RAUW a call within a function can result in that WeakVH getting confused about whether or not the call site is still around. llvm-svn: 279268
* [InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat ↵Sanjay Patel2016-08-191-24/+8
| | | | | | constant vectors llvm-svn: 279266
* [InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectorsSanjay Patel2016-08-191-2/+6
| | | | | | | | | Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263
OpenPOWER on IntegriCloud