path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
* [SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. | Alexey Bataev | 2017-02-03 | 1 | -134/+114
  Currently LLVM supports vectorization of horizontal reduction instructions only when the initial value is 0. This patch adds support for vectorizing reductions with non-zero initial values. It also supports vectorizing instructions with some extra arguments, like:

  float f(float x[], int a, int b) {
    float p = a % b;
    p += x[0] + 3;
    for (int i = 1; i < 32; i++)
      p += x[i];
    return p;
  }

  The patch allows vectorization of this kind of horizontal reduction.

  Differential Revision: https://reviews.llvm.org/D28961
  llvm-svn: 293994
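As a rough illustration of the idea (not the code the SLP vectorizer emits), the sketch below rewrites the reduction from the commit message by hand. `f_scalar` mirrors the example above; `f_split` is a hypothetical hand-written variant that accumulates four partial sums and folds the extra values (`a % b` and the constant 3) in at the end. Reordering floating-point additions changes rounding in general, so the two forms are only guaranteed to agree when every intermediate sum is exactly representable.

```c
/* Scalar form, as in the commit message. */
float f_scalar(const float x[], int a, int b) {
    float p = a % b;
    p += x[0] + 3;
    for (int i = 1; i < 32; i++)
        p += x[i];
    return p;
}

/* Hand-split form (illustrative only): four partial sums over all 32
 * elements, then the extra values a % b and 3 added at the end. */
float f_split(const float x[], int a, int b) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 32; i += 4)
        for (int j = 0; j < 4; j++)
            lane[j] += x[i + j];
    float sum = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    return sum + (float)(a % b) + 3.0f;
}
```

With small integer-valued inputs both functions return the same value, because no rounding occurs in either summation order.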
* [AMDGPU] Unroll preferences improvements | Stanislav Mekhanoshin | 2017-02-03 | 1 | -0/+120
  Exit loop analysis early if suitable private access found.
  Do not account for GEPs which are invariant to loop induction variable.
  Do not account for Allocas which are too big to fit into register file anyway.
  Add option for tuning: -amdgpu-unroll-threshold-private.

  Differential Revision: https://reviews.llvm.org/D29473
  llvm-svn: 293991
* FunctionImport: Remove the -disable-force-link-odr flag and change importFunctions to never force link. | Peter Collingbourne | 2017-02-02 | 1 | -7/+3
  This removes some functionality that was only being used by tests.

  Differential Revision: https://reviews.llvm.org/D29439
  llvm-svn: 293919
* [ThinLTO] Resolve old FIXME for alias importing in test | Teresa Johnson | 2017-02-02 | 1 | -2/+4
  This FIXME was added with r265941 and should have been resolved with r266517.

  llvm-svn: 293901
* [JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor | Jun Bum Lim | 2017-02-02 | 1 | -0/+79
  Summary:
  While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor.

  Reviewers: mcrosier, rengolin, reames, davidxl, haicheng
  Reviewed By: rengolin
  Subscribers: zzheng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29200
  llvm-svn: 293896
* [LICM] Hoist loads that are dominated by invariant.start intrinsic, and are invariant in the loop. | Anna Thomas | 2017-02-02 | 1 | -0/+171
  Summary:
  We can hoist out loads that are dominated by invariant.start, to the preheader. We conservatively assume the load is variant, if we see a corresponding use of invariant.start (it could be an invariant.end or an escaping call).

  Reviewers: mkuper, sanjoy, reames
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29331
  llvm-svn: 293887
* [LV] Also port failure remarks to new OptimizationRemarkEmitter API | Adam Nemet | 2017-02-02 | 1 | -0/+57
  llvm-svn: 293866
* InferAddressSpaces: Handle more cases with constant select operands | Matt Arsenault | 2017-02-02 | 1 | -15/+114
  llvm-svn: 293859
* [IPSCCP] Restore the old behaviour (pre r293799). | Davide Italiano | 2017-02-02 | 1 | -21/+0
  It's not clear the change I made is a good idea, and it definitely needs further discussion. Thanks to Eli for pointing this out.

  llvm-svn: 293846
* [ValueTracking] remove a FIXME for something we don't want to do; NFC | Sanjay Patel | 2017-02-01 | 1 | -2/+1
  The comment was added with:
  https://reviews.llvm.org/rL293773
  ...but there would be a cost to implement this and possibly no payoff.

  llvm-svn: 293823
* fix typos; NFC | Sanjay Patel | 2017-02-01 | 1 | -4/+4
  llvm-svn: 293816
* [InstCombine] move folds for shift-shift pairs; NFCI | Sanjay Patel | 2017-02-01 | 1 | -0/+52
  Although this is 'no-functional-change-intended', I'm adding tests for shl-shl and lshr-lshr pairs because there is no existing test coverage for those folds.

  It seems like we should be able to remove some code from foldShiftedShift() at this point because we're handling those patterns on the general path.

  llvm-svn: 293814
* [SCCP] Make sure we get this case right without noinline. | Davide Italiano | 2017-02-01 | 1 | -1/+1
  Thanks to Hal for pointing out in the post-commit review of r293727.

  llvm-svn: 293801
* [IPSCCP] Don't propagate return values of functions marked as noinline. | Davide Italiano | 2017-02-01 | 1 | -0/+21
  This tries to address what Hal described (in the post-commit review of r293727) as a long-standing problem with noinline, where we end up de facto inlining trivial functions, e.g.

  __attribute__((noinline)) int patatino(void) { return 5; }

  because of return value propagation.

  llvm-svn: 293799
* [InstCombine] Allow InstCombine to merge adjacent guards | Sanjoy Das | 2017-02-01 | 1 | -8/+10
  Summary:
  If there are two adjacent guards with different conditions, we can remove one of them and include its condition into the condition of another one. This patch allows InstCombine to merge them by the following pattern:

  guard(a); guard(b) -> guard(a & b).

  Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy
  Reviewed By: sanjoy
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29378
  llvm-svn: 293778
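The pattern `guard(a); guard(b) -> guard(a & b)` can be modeled in plain C. This is a minimal sketch under a simplifying assumption: the hypothetical `guard` helper just reports whether execution continues, whereas a real guard deoptimizes on failure. Because the two guards are adjacent, nothing observable happens between them, so a single guard on the conjunction behaves the same.

```c
#include <stdbool.h>

/* Hypothetical model of a guard: returns true if execution continues,
 * false if the guard fails (where real code would deoptimize). */
static bool guard(bool cond) { return cond; }

/* Two adjacent guards: execution continues only if both pass. */
bool two_guards(bool a, bool b) {
    if (!guard(a)) return false;
    if (!guard(b)) return false;
    return true;
}

/* Merged form: a single guard on the conjunction of both conditions. */
bool merged_guard(bool a, bool b) {
    return guard(a & b);
}
```

Both functions agree on all four combinations of `a` and `b`.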
* [ValueTracking] avoid crashing from bad assumptions (PR31809) | Sanjay Patel | 2017-02-01 | 1 | -0/+36
  A program may contain llvm.assume info that disagrees with other analysis. This may be caused by UB in the program, so we must not crash because of that.

  As noted in the code comments:
  https://llvm.org/bugs/show_bug.cgi?id=31809
  ...we can do better, but this at least avoids the assert/crash in the bug report.

  Differential Revision: https://reviews.llvm.org/D29395
  llvm-svn: 293773
* [IPSCCP] Teach how to not propagate return values of naked functions. | Davide Italiano | 2017-02-01 | 1 | -0/+28
  Differential Revision: https://reviews.llvm.org/D29360
  llvm-svn: 293727
* InferAddressSpaces: Handle select | Matt Arsenault | 2017-02-01 | 1 | -0/+165
  This fails to handle some cases where one of the inputs is a constant; to be fixed in a later commit.

  llvm-svn: 293723
* InferAddressSpaces: Fix broken casting of constants | Matt Arsenault | 2017-01-31 | 1 | -2/+20
  llvm-svn: 293718
* Do not propagate DebugLoc across basic blocks | Taewook Oh | 2017-01-31 | 2 | -0/+155
  Summary:
  DebugLoc shouldn't be propagated across basic blocks, to prevent incorrect stepping and imprecise sample profile results. rL288903 addressed the wrong DebugLoc propagation issue by limiting the copy of DebugLoc when GVN removes a fully redundant load that is dominated by some other load. However, DebugLoc is still incorrectly propagated in the following example:

  ```
  1: extern int g;
  2:
  3: void foo(int x, int y, int z) {
  4:   if (x)
  5:     g = 0;
  6:   else
  7:     g = 1;
  8:
  9:   int i = 0;
  10:   for ( ; i < y ; i++)
  11:     if (i > z)
  12:       g++;
  13: }
  ```

  Below is the LLVM IR representation of the program before GVN:

  ```
  @g = external local_unnamed_addr global i32, align 4

  ; Function Attrs: nounwind uwtable
  define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 {
  entry:
    %not.tobool = icmp eq i32 %x, 0, !dbg !8
    %.sink = zext i1 %not.tobool to i32, !dbg !8
    store i32 %.sink, i32* @g, align 4, !tbaa !9
    %cmp8 = icmp sgt i32 %y, 0, !dbg !13
    br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17

  for.body.preheader: ; preds = %entry
    br label %for.body, !dbg !19

  for.body: ; preds = %for.body.preheader, %for.inc
    %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
    %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
    br i1 %cmp1, label %if.then2, label %for.inc, !dbg !21

  if.then2: ; preds = %for.body
    %0 = load i32, i32* @g, align 4, !dbg !22, !tbaa !9
    %inc = add nsw i32 %0, 1, !dbg !22
    store i32 %inc, i32* @g, align 4, !dbg !22, !tbaa !9
    br label %for.inc, !dbg !23

  for.inc: ; preds = %for.body, %if.then2
    %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
    %exitcond = icmp ne i32 %inc4, %y, !dbg !13
    br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17

  for.end.loopexit: ; preds = %for.inc
    br label %for.end, !dbg !26

  for.end: ; preds = %for.end.loopexit, %entry
    ret void, !dbg !26
  }
  ```

  where

  ```
  !21 = !DILocation(line: 11, column: 9, scope: !15)
  !22 = !DILocation(line: 12, column: 8, scope: !20)
  !23 = !DILocation(line: 12, column: 7, scope: !20)
  !24 = !DILocation(line: 10, column: 20, scope: !25)
  ```

  And below is after GVN:

  ```
  @g = external local_unnamed_addr global i32, align 4

  define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr !dbg !4 {
  entry:
    %not.tobool = icmp eq i32 %x, 0, !dbg !8
    %.sink = zext i1 %not.tobool to i32, !dbg !8
    store i32 %.sink, i32* @g, align 4, !tbaa !9
    %cmp8 = icmp sgt i32 %y, 0, !dbg !13
    br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17

  for.body.preheader: ; preds = %entry
    br label %for.body, !dbg !19

  for.body: ; preds = %for.inc, %for.body.preheader
    %0 = phi i32 [ %1, %for.inc ], [ %.sink, %for.body.preheader ], !dbg !21
    %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
    %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
    br i1 %cmp1, label %if.then2, label %for.inc, !dbg !22

  if.then2: ; preds = %for.body
    %inc = add nsw i32 %0, 1, !dbg !21
    store i32 %inc, i32* @g, align 4, !dbg !21, !tbaa !9
    br label %for.inc, !dbg !23

  for.inc: ; preds = %if.then2, %for.body
    %1 = phi i32 [ %inc, %if.then2 ], [ %0, %for.body ]
    %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
    %exitcond = icmp ne i32 %inc4, %y, !dbg !13
    br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17

  for.end.loopexit: ; preds = %for.inc
    br label %for.end, !dbg !26

  for.end: ; preds = %for.end.loopexit, %entry
    ret void, !dbg !26
  }
  ```

  As you see, GVN removes the load in the if.then2 block and creates a phi instruction in for.body for it. The problem is that the DebugLoc of the removed load instruction is propagated to the newly created phi instruction, which is wrong. rL288903 cannot handle this case because ValuesPerBlock.size() is not 1 in this example when the load is removed.

  Reviewers: aprantl, andreadb, wolfgangp
  Reviewed By: andreadb
  Subscribers: davide, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29254
  llvm-svn: 293688
* InterleaveAccessPass: Avoid constructing invalid shuffle masks | Matthias Braun | 2017-01-31 | 2 | -0/+36
  Fix a bug where we would construct shufflevector instructions addressing invalid elements.

  Differential Revision: https://reviews.llvm.org/D29313
  llvm-svn: 293673
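For context, a shufflevector mask selects lanes from the concatenation of its two input vectors, so for inputs of N elements each a defined mask index must lie in [0, 2N). The check below is a minimal sketch of that rule, not the pass's actual fix; the function name is hypothetical and -1 is used here by assumption to denote an undefined lane.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every mask element is either -1 (used here to denote an
 * undefined lane) or an index into the 2 * num_elts lanes formed by the two
 * shufflevector inputs. */
bool is_valid_shuffle_mask(const int *mask, size_t mask_len, size_t num_elts) {
    for (size_t i = 0; i < mask_len; i++) {
        if (mask[i] == -1) continue;
        if (mask[i] < 0 || (size_t)mask[i] >= 2 * num_elts) return false;
    }
    return true;
}
```

For two 4-element inputs, a stride-2 mask `{0, 2, 4, 6}` is valid, while a stride-3 mask `{0, 3, 6, 9}` addresses a lane that does not exist.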
* [Instcombine] Combine consecutive identical fences | Davide Italiano | 2017-01-31 | 1 | -0/+47
  Differential Revision: https://reviews.llvm.org/D29314
  llvm-svn: 293661
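The effect of combining consecutive identical fences can be sketched on a flat list. This is an illustration only: `struct fence` and both function names are hypothetical, and the real combine operates on adjacent IR instructions rather than on an array.

```c
#include <stddef.h>

/* Hypothetical fence descriptor: ordering and synchronization scope. */
struct fence { int ordering; int scope; };

static int same_fence(struct fence a, struct fence b) {
    return a.ordering == b.ordering && a.scope == b.scope;
}

/* Compacts `f` in place, dropping each fence that is identical to the one
 * immediately before it. Returns the new length. */
size_t drop_adjacent_duplicate_fences(struct fence *f, size_t n) {
    if (n == 0) return 0;
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (!same_fence(f[out - 1], f[i]))
            f[out++] = f[i];
    return out;
}
```

Only adjacent duplicates are dropped; an identical fence that reappears after a different one is kept.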
* Don't combine stores to a swifterror pointer operand to a different type | Arnold Schwaighofer | 2017-01-31 | 1 | -0/+19
  llvm-svn: 293658
* Explicitly promote indirect calls before sample profile annotation. | Dehao Chen | 2017-01-31 | 2 | -0/+43
  Summary:
  In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.

  Reviewers: xur, davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29040
  llvm-svn: 293657
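Indirect call promotion, in source-level terms, turns an indirect call into a pointer comparison followed by a direct call to the expected target, which can then be inlined. The sketch below shows that shape by hand; `handler_fn`, `hot_target` and `cold_target` are hypothetical names, and the choice of which target to promote would come from the profile.

```c
typedef int (*handler_fn)(int);

int hot_target(int x) { return x + 1; }   /* hypothetical frequent target */
int cold_target(int x) { return x * 2; }  /* hypothetical rare target */

/* Before promotion: a plain indirect call. */
int call_indirect(handler_fn fp, int x) {
    return fp(x);
}

/* After promotion: compare against the target seen in the profile, call it
 * directly (which makes it inlinable), and fall back to the indirect call. */
int call_promoted(handler_fn fp, int x) {
    if (fp == hot_target)
        return hot_target(x);
    return fp(x);
}
```

Both forms return the same result for any target; only the promoted form exposes a direct call site for the inliner.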
* [InstCombine] add test for possible zext-phi transform; NFC | Sanjay Patel | 2017-01-31 | 1 | -0/+29
  The datalayout doesn't include i1, so we don't do a potential shrink and sink transform. Example based on discussion here:
  http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html

  llvm-svn: 293656
* [InstCombine] Make sure that LHS and RHS have the same type in transformToIndexedCompare | Silviu Baranga | 2017-01-31 | 1 | -0/+17
  If they don't have the same type, the size of the constant index would need to be adjusted (and this wouldn't be always possible).

  Alternatively we could try the analysis with the initial RHS value, which would guarantee that the two sides have the same type. However it is unlikely that in practice this would pass our transformation requirements.

  Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).

  llvm-svn: 293629
* InferAddressSpaces: Handle icmp | Matt Arsenault | 2017-01-31 | 2 | -4/+145
  llvm-svn: 293593
* InferAddressSpaces: Support memory intrinsics | Matt Arsenault | 2017-01-31 | 3 | -1/+249
  llvm-svn: 293587
* InferAddressSpaces: Support atomics | Matt Arsenault | 2017-01-31 | 2 | -0/+78
  llvm-svn: 293584
* InferAddressSpaces: Don't replace volatile users | Matt Arsenault | 2017-01-31 | 1 | -0/+82
  llvm-svn: 293582
* AMDGPU: Implement hook for InferAddressSpaces | Matt Arsenault | 2017-01-31 | 4 | -0/+453
  For now just port some of the existing NVPTX tests, plus some from an old HSAIL optimization pass which did approximately the same thing.

  Don't enable the pass yet until more testing is done.

  llvm-svn: 293580
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -2/+1
  llvm-svn: 293570
* [InstCombine] add vector test for (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2); NFC | Sanjay Patel | 2017-01-30 | 1 | -0/+15
  llvm-svn: 293566
* [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -4/+12
  llvm-svn: 293562
* Revert r292979 which causes compile time failure. | Dehao Chen | 2017-01-30 | 2 | -39/+0
  llvm-svn: 293557
* [InstCombine] add tests for more shift-shift patterns; NFC | Sanjay Patel | 2017-01-30 | 1 | -0/+33
  llvm-svn: 293555
* LSR: Don't drop address space when type doesn't match | Matt Arsenault | 2017-01-30 | 1 | -0/+54
  For targets with different addressing modes in each address space, if this is dropped, querying isLegalAddressingMode later with this will give a nonsense result, breaking the isLegalUse assertions.

  This is a candidate for the 4.0 release branch.

  llvm-svn: 293542
* [InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -4/+2
  llvm-svn: 293524
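The fold in this subject line can be sanity-checked exhaustively on 8-bit values. The sketch below covers only the logical-shift (lshr) case and assumes C1 > C2; `exact` is modeled by testing only those inputs whose low C1 bits are zero, since otherwise the exact shift has no defined result. The function name is hypothetical.

```c
#include <stdint.h>

/* Counts 8-bit counterexamples to
 *   (x lshr exact c1) shl c2  ==  x lshr exact (c1 - c2)
 * for c1 > c2. "exact" is modeled by only testing x whose low c1 bits are 0. */
int count_shr_shl_mismatches(void) {
    int bad = 0;
    for (unsigned c1 = 1; c1 < 8; c1++)
        for (unsigned c2 = 0; c2 < c1; c2++)
            for (unsigned x = 0; x < 256; x++) {
                if (x & ((1u << c1) - 1)) continue;  /* shift would not be exact */
                uint8_t lhs = (uint8_t)((x >> c1) << c2);
                uint8_t rhs = (uint8_t)(x >> (c1 - c2));
                if (lhs != rhs) bad++;
            }
    return bad;
}
```

The count is zero: with the low C1 bits known to be zero, shifting right by C1 and back left by C2 loses nothing that a single right shift by C1 - C2 would keep.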
* [InstCombine] add vector splat tests for (X >>?exact C1) << C2 --> X >>?exact (C1-C2); NFC | Sanjay Patel | 2017-01-30 | 1 | -6/+36
  llvm-svn: 293517
* NewGVN: Instead of changeToUnreachable, insert an instruction SimplifyCFG will turn into unreachable when it runs | Daniel Berlin | 2017-01-30 | 2 | -0/+2
  llvm-svn: 293515
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -3/+3
  llvm-svn: 293507
* Update pr31758.ll for unreachable revert | Daniel Berlin | 2017-01-30 | 1 | -1/+1
  llvm-svn: 293502
* Revert "NewGVN: Make unreachable blocks be marked with unreachable" | Daniel Berlin | 2017-01-30 | 2 | -20/+20
  This reverts commit r293196.

  Besides making things look nicer, ATM, we'd like to preserve analysis more than we'd like to destroy the CFG. We'll probably revisit in the future.

  llvm-svn: 293501
* [InstCombine] fixed to propagate 'exact' on lshr | Sanjay Patel | 2017-01-30 | 1 | -1/+1
  The original shift is bigger, so this may qualify as 'obvious', but here's an attempt at an Alive-based proof:

  Name: exact
  Pre: (C1 u< C2)
  %a = shl i8 %x, C1
  %b = lshr exact i8 %a, C2
    =>
  %c = lshr exact i8 %x, C2 - C1
  %b = and i8 %c, ((1 << width(C1)) - 1) u>> C2

  Optimization is correct!

  llvm-svn: 293498
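The Alive result above can be cross-checked by brute force, since the proof is stated for i8. The sketch below is an independent check, not part of the commit: the function name is hypothetical, `exact` is modeled by skipping inputs where the source shift would drop set bits, and the mask `((1 << width(C1)) - 1) u>> C2` is taken to be `0xFF >> C2` for 8-bit values.

```c
#include <stdint.h>

/* Counts 8-bit counterexamples to the rewrite
 *   lshr exact (shl x, c1), c2  ->  and (lshr exact x, c2 - c1), (0xFF u>> c2)
 * under the precondition c1 u< c2. */
int count_exact_lshr_mismatches(void) {
    int bad = 0;
    for (unsigned c1 = 0; c1 < 8; c1++)
        for (unsigned c2 = c1 + 1; c2 < 8; c2++)
            for (unsigned x = 0; x < 256; x++) {
                uint8_t a = (uint8_t)(x << c1);
                if (a & ((1u << c2) - 1)) continue;  /* source lshr not exact */
                /* The new lshr must itself be exact, or the rewrite is wrong. */
                if (x & ((1u << (c2 - c1)) - 1)) { bad++; continue; }
                uint8_t before = (uint8_t)(a >> c2);
                uint8_t after = (uint8_t)((x >> (c2 - c1)) & (0xFFu >> c2));
                if (before != after) bad++;
            }
    return bad;
}
```

The count is zero, which agrees with the "Optimization is correct!" verdict: whenever the original lshr is exact, the narrower lshr on `%x` is exact too, and the masked result matches.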
* [InstCombine] add 'exact' to lshr to show that it got dropped; NFC | Sanjay Patel | 2017-01-30 | 1 | -1/+2
  llvm-svn: 293496
* [Inliner] Fold analysis remarks into missed remarks | Adam Nemet | 2017-01-30 | 2 | -4/+2
  This significantly reduces the noise level of these messages.

  llvm-svn: 293492
* [InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -4/+3
  llvm-svn: 293489
* [InstCombine] add tests for shift-shift patterns; NFC | Sanjay Patel | 2017-01-30 | 1 | -0/+57
  llvm-svn: 293487
* [InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats | Sanjay Patel | 2017-01-29 | 1 | -6/+2
  llvm-svn: 293435
* [InstCombine] add tests for shl(shr X, C1), C2 transforms; NFC | Sanjay Patel | 2017-01-29 | 1 | -4/+58
  llvm-svn: 293434