summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* WholeProgramDevirt: Check that VCP candidate functions are defined before ↵Peter Collingbourne2017-02-091-0/+32
| | | | | | | | evaluating them. This was crashing before. llvm-svn: 294666
* [InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectorsSanjay Patel2017-02-091-0/+14
| | | | | | | | | | This fold already existed for vectors but only when 'C1' was a splat constant (but 'C2' could be any constant). There were no tests for any vector constants, so I'm adding a test that shows non-splat constants for both operands. llvm-svn: 294650
* [LoadCombine] Fix combining of loads which span an aliasing store.Michael J. Spencer2017-02-091-0/+21
| | | | | | | | Fixes PR31517 Differential Revision: https://reviews.llvm.org/D28922 llvm-svn: 294632
* [InstCombine] use m_APInt to allow demanded bits analysis on splat constantsSanjay Patel2017-02-092-4/+3
| | | | llvm-svn: 294628
* [InstCombine] add test for demanded bits with splat vector constants; NFCSanjay Patel2017-02-091-0/+13
| | | | llvm-svn: 294625
* [JumpThreading] Thread through guardsSanjoy Das2017-02-091-0/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch allows JumpThreading also thread through guards. Virtually, guard(cond) is equivalent to the following construction: if (cond) { do something } else {deoptimize} Yet it is not explicitly converted into IFs before lowering. This patch enables early threading through guards in simple cases. Currently it covers the following situation: if (cond1) { // code A } else { // code B } // code C guard(cond2) // code D If there is implication cond1 => cond2 or !cond1 => cond2, we can transform this construction into the following: if (cond1) { // code A // code C } else { // code B // code C guard(cond2) } // code D Thus, removing the guard from one of execution branches. Patch by Max Kazantsev! Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29620 llvm-svn: 294617
* [InstCombine] add tests for icmp with add nsw; NFCSanjay Patel2017-02-091-0/+123
| | | | llvm-svn: 294601
* LowerTypeTests: Change a few vtable globals in tests to constants.Peter Collingbourne2017-02-097-14/+14
| | | | | | | | | It turns out that some of our negative tests were not in fact providing the test coverage we expected: they were passing because the vtables were failing an early check that they were constant. Fix this by changing the globals in these tests to constants. llvm-svn: 294550
* [InstCombine] add tests to show information-losing add nsw/nuw transforms; NFCSanjay Patel2017-02-083-14/+63
| | | | llvm-svn: 294524
* ThinLTOBitcodeWriter: Strip debug info from merged module.Peter Collingbourne2017-02-081-0/+7
| | | | | | | | | | This module will contain nothing but vtable definitions and (soon) available_externally function definitions, so there is no point in keeping debug info in the module. Differential Revision: https://reviews.llvm.org/D28913 llvm-svn: 294511
* [SLP] Additional test to check correct work of horizontal reductions,Alexey Bataev2017-02-081-0/+576
| | | | | | NFC. llvm-svn: 294505
* [Loop Vectorizer] Cost-based decision for vectorization form of memory ↵Elena Demikhovsky2017-02-083-1/+81
| | | | | | | | | | | | | | | | | | instruction. Making the cost model selecting between Interleave, GatherScatter or Scalar vectorization form of memory instruction. The right decision should be done for non-consecutive memory access instrcuctions that may have more than one vectorization solution. This patch includes the following changes: - Cost Model calculates the cost of Load/Store vector form and choose the better option between Widening, Interleave, GatherScactter and Scalarization. Cost Model keeps the widening decision. - Arrays of Uniform and Scalar values are moved from Legality to Cost Model. - Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loadis/Stores vectorization form. - Vectorization of memory instruction is performed according to the CM decision. Differential Revision: https://reviews.llvm.org/D27919 llvm-svn: 294503
* [InstCombine] add test for missed vector icmp fold; NFCSanjay Patel2017-02-082-10/+23
| | | | | | | Also, move the related existing scalar test to a renamed file where I'm planning to add more icmp-add tests. llvm-svn: 294487
* [InstCombineCalls] Unfold element atomic memcpy instructionIgor Laevsky2017-02-081-0/+92
| | | | | | Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294453
* [ArgPromote] Delete a test that makes no sense (any more).Chandler Carruth2017-02-081-23/+0
| | | | | | | | | This test is under 'ArgumentPromotion' but there are no arguments that get promoted in the test case, so there seems to be no point. Also, there are no assertions about the output at all, so this seems like something we should just delete given the low value. llvm-svn: 294428
* [ArgPromote] Clean up a crash test case by rinsing it through opt,Chandler Carruth2017-02-081-35/+46
| | | | | | | | | | | renaming things to at least have somewhat spelled out names, and even have meaningful names where I could guess at what they should be. Also add FileCheck assertions that we're actually doing what we set out to do for some of the tests, for example not promoting a type that would result in infinite promotion. llvm-svn: 294426
* [ArgPromote] Actually add FileCheck to a test that I actually updated toChandler Carruth2017-02-081-2/+1
| | | | | | | have nice CHECK patterns instead of relying on a coarse 'not grep' check. Sorry that I missed this the first time through. llvm-svn: 294422
* [ArgPromote] Actually run FileCheck on this test. The CHECK lines areChandler Carruth2017-02-081-1/+1
| | | | | | already there, just waiting to, well, be checked. =] llvm-svn: 294421
* LSR: Check atomic instruction pointer operandsMatt Arsenault2017-02-081-0/+87
| | | | llvm-svn: 294410
* [X86] Remove PCOMMIT instruction support since Intel has deprecated this ↵Craig Topper2017-02-081-2/+2
| | | | | | | | instruction with no plans to release products with it. Intel's documentation for the deprecation https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction llvm-svn: 294405
* [IRCE] Add a missing invariant checkSanjoy Das2017-02-071-0/+45
| | | | | | | | | | | | | | | | | | | | | | | Currently IRCE relies on the loops it transforms to be (semantically) of the form: for (i = START; i < END; i++) ... or for (i = START; i > END; i--) ... However, we were not verifying the presence of the START < END entry check (i.e. check before the first iteration). We were only verifying that the backedge was guarded by (i + 1) < END. Usually this would work "fine" since (especially in Java) most loops do actually have the START < END check, but of course that is not guaranteed. llvm-svn: 294375
* Add PredicateInfo utility and printing passDaniel Berlin2017-02-072-0/+668
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds a utility to build extended SSA (see "ABCD: eliminating array bounds checks on demand"), and an intrinsic to support it. This is then used to get functionality equivalent to propagateEquality in GVN, in NewGVN (without having to replace instructions as we go). It would work similarly in SCCP or other passes. This has been talked about a few times, so i built a real implementation and tried to productionize it. Copies are inserted for operands used in assumes and conditional branches that are based on comparisons (see below for more) Every use affected by the predicate is renamed to the appropriate intrinsic result. E.g. %cmp = icmp eq i32 %x, 50 br i1 %cmp, label %true, label %false true: ret i32 %x false: ret i32 1 will become %cmp = icmp eq i32, %x, 50 br i1 %cmp, label %true, label %false true: ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 } %x.0 = call @llvm.ssa_copy.i32(i32 %x) ret i32 %x.0 false: ret i23 1 (you can use -print-predicateinfo to get an annotated-with-predicateinfo dump) This enables us to easily determine what operations are affected by a given predicate, and how operations affected by a chain of predicates. Reviewers: davide, sanjoy Subscribers: mgorny, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29519 Update for review comments Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong Update for review comments llvm-svn: 294351
* [LV] Add new ARM/AArch64 interleaved access cost model tests (NFC)Matthew Simpson2017-02-072-0/+153
| | | | llvm-svn: 294342
* [LV] Simplify ARM/AArch64 interleaved access cost model tests (NFC)Matthew Simpson2017-02-072-89/+76
| | | | | | | | This patch removes unneeded instructions from the existing ARM/AArch64 interleaved access cost model tests. I'll be adding a similar set of tests in a follow-on patch to increase coverage. llvm-svn: 294336
* Fix my GVNHoist test case from r294317Reid Kleckner2017-02-071-3/+4
| | | | llvm-svn: 294319
* Revert "[GVNHoist] Merge DebugLoc metadata on hoisted instructions"Reid Kleckner2017-02-072-69/+82
| | | | | | | | | This reverts commit r294250. It caused PR31891. Add a test case that shows that inlinable calls retain location information with an accurate scope. llvm-svn: 294317
* Fix the samplepgo indirect call promotion bug: we should not promote a ↵Dehao Chen2017-02-062-0/+18
| | | | | | | | | | | | | | | | direct call. Summary: Checking CS.getCalledFunction() == nullptr does not necessary indicate indirect call. We also need to check if CS.getCalledValue() is not a constant. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29570 llvm-svn: 294260
* Merge DebugLoc on combined stores; in this case, when combining storesPaul Robinson2017-02-061-2/+2
| | | | | | | | from the end of two blocks, merge instead of arbitrarily picking one. Differential Revision: http://reviews.llvm.org/D29504 llvm-svn: 294251
* [GVNHoist] Merge DebugLoc metadata on hoisted instructionsTaewook Oh2017-02-061-0/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When instructions are hoisted, current implementation keeps DebugLoc metadata of the instruction that chosen as Repl (and its GEP operand if Repl is a load or a store). However, DebugLoc metadata should be updated to the 'merged' location across all hoisted instructions. See the following example code: ``` 1: typedef struct { 2: int a[10]; 3: } S1; 4: 5: extern S1 *s1[10]; 6: 7: void foo(int x, int y, int i) { 8: if (y) 9: s1[i]->a[i] = x + y; 10: else 11: s1[i]->a[i] = x; 12: } ``` Below is LLVM IR representation of the program before gvn-hoist: ``` %struct.S1 = type { [10 x i32] } @s1 = external local_unnamed_addr global [10 x %struct.S1*], align 16 define void @foo(i32 %x, i32 %y, i32 %i) !dbg !4 { entry: %tobool = icmp ne i32 %y, 0, !dbg !8 br i1 %tobool, label %if.then, label %if.else, !dbg !10 if.then: ; preds = %entry %add = add nsw i32 %x, %y, !dbg !11 %idxprom = sext i32 %i to i64, !dbg !12 %arrayidx = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !12 %0 = load %struct.S1*, %struct.S1** %arrayidx, align 8, !dbg !12, !tbaa !13 %a = getelementptr inbounds %struct.S1, %struct.S1* %0, i32 0, i32 0, !dbg !17 br label %if.end, !dbg !12 if.else: ; preds = %entry %idxprom3 = sext i32 %i to i64, !dbg !18 %arrayidx4 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom3, !dbg !18 %1 = load %struct.S1*, %struct.S1** %arrayidx4, align 8, !dbg !18, !tbaa !13 %a5 = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !19 br label %if.end if.end: ; preds = %if.else, %if.then %a5.sink = phi [10 x i32]* [ %a5, %if.else ], [ %a, %if.then ] %.sink = phi i32 [ %x, %if.else ], [ %add, %if.then ] %idxprom6 = sext i32 %i to i64 %arrayidx7 = getelementptr inbounds [10 x i32], [10 x i32]* %a5.sink, i64 0, i64 %idxprom6 store i32 %.sink, i32* %arrayidx7, align 4, !tbaa !20 ret void, !dbg !22 } ``` where ``` !11 = !DILocation(line: 9, column: 18, scope: !9) !12 = !DILocation(line: 9, column: 5, scope: !9) !18 = !DILocation(line: 11, column: 5, scope: !9) !19 = !DILocation(line: 11, column: 9, scope: !9) ``` . And below is after gvn-hoist: ``` define void @foo(i32 %x, i32 %y, i32 %i) !dbg !4 { entry: %tobool = icmp ne i32 %y, 0, !dbg !8 %idxprom = sext i32 %i to i64, !dbg !10 %0 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !10 %1 = load %struct.S1*, %struct.S1** %0, align 8, !dbg !10, !tbaa !11 br i1 %tobool, label %if.then, label %if.else, !dbg !15 if.then: ; preds = %entry %add = add nsw i32 %x, %y, !dbg !16 %arrayidx = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !10 %a = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !17 br label %if.end, !dbg !10 if.else: ; preds = %entry %arrayidx4 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !18 %a5 = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !19 br label %if.end if.end: ; preds = %if.else, %if.then %a5.sink = phi [10 x i32]* [ %a5, %if.else ], [ %a, %if.then ] %.sink = phi i32 [ %x, %if.else ], [ %add, %if.then ] %arrayidx7 = getelementptr inbounds [10 x i32], [10 x i32]* %a5.sink, i64 0, i64 %idxprom store i32 %.sink, i32* %arrayidx7, align 4, !tbaa !20 ret void, !dbg !22 } ``` As you see, loads and their GEPs have been hosited from if.then/if.else block to entry block. However, DebugLoc metadata of these new instructions are still same as the instructions in if.then block, as they are moved/cloned from if.then block. This may result incorrect stepping and imprecise sample profile result. Reviewers: majnemer, pcc, sebpop Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29377 llvm-svn: 294250
* [SLP] Revert "Allow using of extra values in horizontal reductions."Michael Kuperstein2017-02-061-114/+134
| | | | | | | | This breaks when one of the extra values is also a scalar that participates in the same vectorization tree which we'll end up reducing. llvm-svn: 294245
* IR: Consider two DISubprograms to be odr-equal if they have the same ↵Peter Collingbourne2017-02-061-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | template parameters. In ValueMapper we create new operands for MDNodes and rely on MDNode::replaceWithUniqued to create a new MDNode with the specified operands. However this doesn't always actually happen correctly for DISubprograms because when we uniquify the new node, we only odr-compare it with existing nodes (MDNodeSubsetEqualImpl<DISubprogram>::isDeclarationOfODRMember). Although the TemplateParameters field can refer to a distinct DICompileUnit via DITemplateTypeParameter::type -> DICompositeType::scope -> DISubprogram::unit, it is not currently included in the odr comparison. As a result, we can end up getting our original DISubprogram back, which means we will have a cloned module referring to the DICompileUnit in the original module, which causes a verification error. The fix I implemented was to consider TemplateParameters to be one of the odr-equal properties. But I'm a little uncomfortable with this. In general it seems unsound to rely on distinct MDNodes never being reachable from nodes which we only check odr-equality of. My only long term suggestion would be to separate odr-uniquing from full uniquing. Differential Revision: https://reviews.llvm.org/D29240 llvm-svn: 294240
* [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)Sanjay Patel2017-02-061-5/+28
| | | | | | | | | | | | This is a follow-up to D29395 where we try to be good citizens and let the user know that we've probably gone off the rails. This should allow us to resolve: https://llvm.org/bugs/show_bug.cgi?id=31809 Differential Revision: https://reviews.llvm.org/D29404 llvm-svn: 294208
* Fix the bug of samplepgo indirect call promption when type casting of the ↵Dehao Chen2017-02-061-7/+9
| | | | | | | | | | | | | | | | return value is needed. Summary: When type casting of the return value is needed, promoteIndirectCall will return the type casting instruction instead of the direct call. This patch changed to return the direct call instruction instead. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29569 llvm-svn: 294205
* [ArgPromote] Replace all the grep-based testing with precise FileCheckChandler Carruth2017-02-065-63/+117
| | | | | | | | | | tests. This also removes the use of instcombine as we can max the patterns produced by argument promotion directly with the more powerful tools in FileCheck. llvm-svn: 294174
* [IPCP] Don't propagate return value for naked functions.Davide Italiano2017-02-041-0/+1
| | | | | | This is pretty much the same change made in SCCP. llvm-svn: 294098
* [InstCombine] treat i1 as a special type in shouldChangeType()Sanjay Patel2017-02-031-3/+6
| | | | | | | | | | | | | | | | | | | | This patch is based on the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html Folding to i1 should always be desirable because that's better for value tracking and we have special folds for i1 types. I checked for other users of shouldChangeType() where this might have an effect, but we already handle the i1 case differently than other types in all of those cases. Side note: the default datalayout includes i1, so it seems we only find this gap in shouldChangeType + phi folding for the case when there is (1) an explicit datalayout without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming casted operands (as Björn mentioned). Differential Revision: https://reviews.llvm.org/D29336 llvm-svn: 294066
* [InstCombine] fix operand-complexity-based canonicalization (PR28296)Sanjay Patel2017-02-0314-28/+27
| | | | | | | | | | | | | | | | | | | The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) operators from arguments. Adding another level to the weighting scheme provides more structure and can help simplify the pattern matching in InstCombine and other places. I fixed regressions that would have shown up from this change in: rL290067 rL290127 But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests. Should fix: https://llvm.org/bugs/show_bug.cgi?id=28296 Differential Revision: https://reviews.llvm.org/D27933 llvm-svn: 294049
* [InstCombine] auto-generate better test checks; NFCSanjay Patel2017-02-031-16/+38
| | | | llvm-svn: 294040
* [InstCombine] auto-generate better test checks; NFCSanjay Patel2017-02-031-95/+116
| | | | llvm-svn: 294034
* AMDGPU: Don't unroll for private with dynamic allocasMatt Arsenault2017-02-031-0/+34
| | | | | | | This won't be elimnated, so this will just bloat code if/when these are ever used/supported. llvm-svn: 294030
* [SLP] Use SCEV to sort memory accesses.Michael Kuperstein2017-02-031-7/+120
| | | | | | | | | | This generalizes memory access sorting to use differences between SCEVs, instead of relying on constant offsets. That allows us to properly do SLP vectorization of non-sequentially ordered loads within loops bodies. Differential Revision: https://reviews.llvm.org/D29425 llvm-svn: 294027
* [SLP] Fix for PR31690: Allow using of extra values in horizontal reductions.Alexey Bataev2017-02-031-134/+114
| | | | | | | | | | | | | | | | | | | | | Currently LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also it supports a vectorization of instructions with some extra arguments, like: float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D28961 llvm-svn: 293994
* [AMDGPU] Unroll preferences improvementsStanislav Mekhanoshin2017-02-031-0/+120
| | | | | | | | | | | Exit loop analysis early if suitable private access found. Do not account for GEPs which are invariant to loop induction variable. Do not account for Allocas which are too big to fit into register file anyway. Add option for tuning: -amdgpu-unroll-threshold-private. Differential Revision: https://reviews.llvm.org/D29473 llvm-svn: 293991
* FunctionImport: Remove the -disable-force-link-odr flag and change ↵Peter Collingbourne2017-02-021-7/+3
| | | | | | | | | | importFunctions to never force link. This removes some functionality that was only being used by tests. Differential Revision: https://reviews.llvm.org/D29439 llvm-svn: 293919
* [ThinLTO] Resolve old FIXME for alias importing in testTeresa Johnson2017-02-021-2/+4
| | | | | | | This FIXME was added with r265941 and should have been resolved with r266517. llvm-svn: 293901
* [JumpThread] Enhance finding partial redundant loads by continuing scanning ↵Jun Bum Lim2017-02-021-0/+79
| | | | | | | | | | | | | | | | single predecessor Summary: While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor. Reviewers: mcrosier, rengolin, reames, davidxl, haicheng Reviewed By: rengolin Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D29200 llvm-svn: 293896
* [LICM] Hoist loads that are dominated by invariant.start intrinsic, and are ↵Anna Thomas2017-02-021-0/+171
| | | | | | | | | | | | | | | | | | invariant in the loop. Summary: We can hoist out loads that are dominated by invariant.start, to the preheader. We conservatively assume the load is variant, if we see a corresponding use of invariant.start (it could be an invariant.end or an escaping call). Reviewers: mkuper, sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29331 llvm-svn: 293887
* [LV] Also port failure remarks to new OptimizationRemarkEmitter APIAdam Nemet2017-02-021-0/+57
| | | | llvm-svn: 293866
* InferAddressSpaces: Handle more cases with constant select operandsMatt Arsenault2017-02-021-15/+114
| | | | llvm-svn: 293859
* [IPSCCP] Restore the old behaviour (pre r293799).Davide Italiano2017-02-021-21/+0
| | | | | | | It's not clear the change I made a good idea, and it definitely needs further discussion. Thanks to Eli for pointing out. llvm-svn: 293846
OpenPOWER on IntegriCloud