path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* NewGVN: Cleanup conditions to match reality | Daniel Berlin | 2017-01-31 | 1 | -13/+8
  llvm-svn: 293707
* NewGVN: Add basic support for symbolic comparison evaluation | Daniel Berlin | 2017-01-31 | 1 | -3/+20
  llvm-svn: 293706
* NewGVN: Formatting cleanup after lookupOperandLeader change | Daniel Berlin | 2017-01-31 | 1 | -6/+3
  llvm-svn: 293705
* NewGVN: Remove the unused two arguments from lookupOperandLeader. | Daniel Berlin | 2017-01-31 | 1 | -21/+17
  llvm-svn: 293704
* NewGVN: Cleanup header files we are using. | Daniel Berlin | 2017-01-31 | 1 | -5/+0
  llvm-svn: 293703
* [NewGVN] Preserve TargetLibraryInfo analysis. | Davide Italiano | 2017-01-31 | 1 | -0/+2
  We can maybe preserve more but this is a first step. Ack'ed by Danny on IRC.
  llvm-svn: 293694
* Do not propagate DebugLoc across basic blocks | Taewook Oh | 2017-01-31 | 1 | -2/+2
  Summary:
  DebugLoc shouldn't be propagated across basic blocks, to prevent incorrect stepping and
  imprecise sample profile results. rL288903 addressed the wrong DebugLoc propagation issue
  by limiting the copy of DebugLoc when GVN removes a fully redundant load that is dominated
  by some other load. However, DebugLoc is still incorrectly propagated in the following
  example:

  ```
  1: extern int g;
  2:
  3: void foo(int x, int y, int z) {
  4:   if (x)
  5:     g = 0;
  6:   else
  7:     g = 1;
  8:
  9:   int i = 0;
  10:  for ( ; i < y ; i++)
  11:    if (i > z)
  12:      g++;
  13: }
  ```

  Below is the LLVM IR representation of the program before GVN:

  ```
  @g = external local_unnamed_addr global i32, align 4

  ; Function Attrs: nounwind uwtable
  define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 {
  entry:
    %not.tobool = icmp eq i32 %x, 0, !dbg !8
    %.sink = zext i1 %not.tobool to i32, !dbg !8
    store i32 %.sink, i32* @g, align 4, !tbaa !9
    %cmp8 = icmp sgt i32 %y, 0, !dbg !13
    br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17

  for.body.preheader:                   ; preds = %entry
    br label %for.body, !dbg !19

  for.body:                             ; preds = %for.body.preheader, %for.inc
    %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
    %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
    br i1 %cmp1, label %if.then2, label %for.inc, !dbg !21

  if.then2:                             ; preds = %for.body
    %0 = load i32, i32* @g, align 4, !dbg !22, !tbaa !9
    %inc = add nsw i32 %0, 1, !dbg !22
    store i32 %inc, i32* @g, align 4, !dbg !22, !tbaa !9
    br label %for.inc, !dbg !23

  for.inc:                              ; preds = %for.body, %if.then2
    %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
    %exitcond = icmp ne i32 %inc4, %y, !dbg !13
    br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17

  for.end.loopexit:                     ; preds = %for.inc
    br label %for.end, !dbg !26

  for.end:                              ; preds = %for.end.loopexit, %entry
    ret void, !dbg !26
  }
  ```

  where

  ```
  !21 = !DILocation(line: 11, column: 9, scope: !15)
  !22 = !DILocation(line: 12, column: 8, scope: !20)
  !23 = !DILocation(line: 12, column: 7, scope: !20)
  !24 = !DILocation(line: 10, column: 20, scope: !25)
  ```

  And below is after GVN:

  ```
  @g = external local_unnamed_addr global i32, align 4

  define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr !dbg !4 {
  entry:
    %not.tobool = icmp eq i32 %x, 0, !dbg !8
    %.sink = zext i1 %not.tobool to i32, !dbg !8
    store i32 %.sink, i32* @g, align 4, !tbaa !9
    %cmp8 = icmp sgt i32 %y, 0, !dbg !13
    br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17

  for.body.preheader:                   ; preds = %entry
    br label %for.body, !dbg !19

  for.body:                             ; preds = %for.inc, %for.body.preheader
    %0 = phi i32 [ %1, %for.inc ], [ %.sink, %for.body.preheader ], !dbg !21
    %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
    %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
    br i1 %cmp1, label %if.then2, label %for.inc, !dbg !22

  if.then2:                             ; preds = %for.body
    %inc = add nsw i32 %0, 1, !dbg !21
    store i32 %inc, i32* @g, align 4, !dbg !21, !tbaa !9
    br label %for.inc, !dbg !23

  for.inc:                              ; preds = %if.then2, %for.body
    %1 = phi i32 [ %inc, %if.then2 ], [ %0, %for.body ]
    %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
    %exitcond = icmp ne i32 %inc4, %y, !dbg !13
    br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17

  for.end.loopexit:                     ; preds = %for.inc
    br label %for.end, !dbg !26

  for.end:                              ; preds = %for.end.loopexit, %entry
    ret void, !dbg !26
  }
  ```

  As you can see, GVN removes the load in the if.then2 block and creates a phi instruction
  for it in for.body. The problem is that the DebugLoc of the removed load instruction is
  propagated to the newly created phi instruction, which is wrong. rL288903 cannot handle
  this case because ValuesPerBlock.size() is not 1 in this example when the load is removed.

  Reviewers: aprantl, andreadb, wolfgangp
  Reviewed By: andreadb
  Subscribers: davide, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29254
  llvm-svn: 293688
* [Instcombine] Combine consecutive identical fences | Davide Italiano | 2017-01-31 | 2 | -0/+10
  Differential Revision: https://reviews.llvm.org/D29314
  llvm-svn: 293661
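As a quick illustration (hand-written, not taken from the patch; the function and stores are invented), the shape this fold targets is two identical back-to-back fences, where the second adds no ordering beyond the first:

```
define void @f(i32* %p) {
  store i32 1, i32* %p
  fence seq_cst
  fence seq_cst   ; identical to the previous fence, so one of the two is redundant
  store i32 2, i32* %p
  ret void
}
```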
* Don't combine stores to a swifterror pointer operand to a different type | Arnold Schwaighofer | 2017-01-31 | 1 | -1/+2
  llvm-svn: 293658
* Explicitly promote indirect calls before sample profile annotation. | Dehao Chen | 2017-01-31 | 1 | -5/+24
  Summary: In iterative sample PGO, where the profile is collected from a PGOed binary, we
  may see indirect call targets promoted and inlined in the profile. Before profile
  annotation, we need to make this happen in order to annotate correctly on IR. This patch
  explicitly promotes these indirect calls and inlines them before profile annotation.
  Reviewers: xur, davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29040
  llvm-svn: 293657
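For background, IR-level indirect call promotion rewrites a hot indirect call into a direct call guarded by a target check, which then becomes visible to the inliner. The sketch below is a hand-written illustration of that general shape (the names @caller and @hot_target are invented), not output of this patch:

```
declare i32 @hot_target(i32)

define i32 @caller(i32 (i32)* %fp, i32 %x) {
entry:
  ; compare the function pointer against the hottest target from the profile
  %is_hot = icmp eq i32 (i32)* %fp, @hot_target
  br i1 %is_hot, label %direct, label %indirect

direct:                                           ; direct call, now inlinable
  %r.direct = call i32 @hot_target(i32 %x)
  br label %merge

indirect:                                         ; fallback for all other targets
  %r.indirect = call i32 %fp(i32 %x)
  br label %merge

merge:
  %r = phi i32 [ %r.direct, %direct ], [ %r.indirect, %indirect ]
  ret i32 %r
}
```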
* fix formatting; NFC | Sanjay Patel | 2017-01-31 | 5 | -17/+17
  llvm-svn: 293652
* [InstCombine] Make sure that LHS and RHS have the same type in transformToIndexedCompare | Silviu Baranga | 2017-01-31 | 1 | -0/+4
  If they don't have the same type, the size of the constant index would need to be adjusted
  (and this wouldn't always be possible). Alternatively we could try the analysis with the
  initial RHS value, which would guarantee that the two sides have the same type. However,
  it is unlikely that in practice this would pass our transformation requirements.
  Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).
  llvm-svn: 293629
* [LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC) | Florian Hahn | 2017-01-31 | 1 | -14/+6
  Summary: rL293124 added the necessary infrastructure to properly add the cloned top level
  loop to LoopInfo, which means we do not have to do it manually in CloneLoopBlocks.
  @mkuper sorry for not pointing this out during my review of D29156, I just realized that
  today.
  Reviewers: mzolotukhin, chandlerc, mkuper
  Reviewed By: mkuper
  Subscribers: llvm-commits, mkuper
  Differential Revision: https://reviews.llvm.org/D29173
  llvm-svn: 293615
* InferAddressSpaces: Rename constant | Matt Arsenault | 2017-01-31 | 1 | -6/+6
  llvm-svn: 293594
* InferAddressSpaces: Handle icmp | Matt Arsenault | 2017-01-31 | 1 | -8/+64
  llvm-svn: 293593
* InferAddressSpaces: Support memory intrinsics | Matt Arsenault | 2017-01-31 | 1 | -14/+146
  llvm-svn: 293587
* InferAddressSpaces: Support atomics | Matt Arsenault | 2017-01-31 | 1 | -16/+44
  llvm-svn: 293584
* InferAddressSpaces: Don't replace volatile users | Matt Arsenault | 2017-01-31 | 1 | -2/+5
  llvm-svn: 293582
* NVPTX: Move InferAddressSpaces to generic code | Matt Arsenault | 2017-01-31 | 3 | -0/+612
  llvm-svn: 293579
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -54/+19
  llvm-svn: 293570
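A hand-written instance of this fold on a splat-constant vector, shown as a before/after sketch (the function name and constants are chosen for illustration and are not from the patch):

```
; before: shift left by 5 with no signed wrap, then arithmetic shift right by 3
define <2 x i8> @splat_shl_nsw_ashr(<2 x i8> %x) {
  %shl  = shl nsw <2 x i8> %x, <i8 5, i8 5>
  %ashr = ashr <2 x i8> %shl, <i8 3, i8 3>
  ret <2 x i8> %ashr
}

; after the fold, the two shifts collapse into one:
;   %ashr = shl nsw <2 x i8> %x, <i8 2, i8 2>
```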
* [ICP] Fix bool conversion warning and actually write out the reason instead of dropping it. | Benjamin Kramer | 2017-01-30 | 1 | -1/+1
  llvm-svn: 293564
* [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -23/+17
  llvm-svn: 293562
* Expose isLegalToPromote as a global helper function so that SamplePGO pass can call it for legality check. | Dehao Chen | 2017-01-30 | 1 | -46/+36
  Summary: SamplePGO needs to check if it is legal to promote a target before it actually
  promotes it.
  Reviewers: davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29306
  llvm-svn: 293559
* Revert r292979 which causes compile time failure. | Dehao Chen | 2017-01-30 | 1 | -19/+5
  llvm-svn: 293557
* LSR: Don't drop address space when type doesn't match | Matt Arsenault | 2017-01-30 | 1 | -4/+7
  For targets with different addressing modes in each address space, if this is dropped,
  querying isLegalAddressingMode later with it will give a nonsense result, breaking the
  isLegalUse assertions.
  This is a candidate for the 4.0 release branch.
  llvm-svn: 293542
* [InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -24/+22
  llvm-svn: 293524
* NewGVN: Instead of changeToUnreachable, insert an instruction SimplifyCFG will turn into unreachable when it runs | Daniel Berlin | 2017-01-30 | 1 | -4/+5
  llvm-svn: 293515
* [InstCombine] use auto with obvious type; NFC | Sanjay Patel | 2017-01-30 | 1 | -3/+3
  llvm-svn: 293508
* [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -20/+16
  llvm-svn: 293507
* Revert "NewGVN: Make unreachable blocks be marked with unreachable"Daniel Berlin2017-01-301-13/+18
| | | | | | | | | This reverts commit r293196 Besides making things look nicer, ATM, we'd like to preserve analysis more than we'd like to destroy the CFG. We'll probably revisit in the future llvm-svn: 293501
* [InstCombine] fixed to propagate 'exact' on lshr | Sanjay Patel | 2017-01-30 | 1 | -1/+1
  The original shift is bigger, so this may qualify as 'obvious', but here's an attempt at
  an Alive-based proof:

  Name: exact
  Pre: (C1 u< C2)
  %a = shl i8 %x, C1
  %b = lshr exact i8 %a, C2
    =>
  %c = lshr exact i8 %x, C2 - C1
  %b = and i8 %c, ((1 << width(C1)) - 1) u>> C2

  Optimization is correct!
  llvm-svn: 293498
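Plugging concrete constants into that proof (C1 = 2, C2 = 5, so the C1 u< C2 precondition holds) gives a small worked example; the @before and @after functions are illustrative only:

```
define i8 @before(i8 %x) {
  %a = shl i8 %x, 2
  %b = lshr exact i8 %a, 5
  ret i8 %b
}

; after the fold: shift the original value by C2 - C1 = 3 and mask with
; ((1 << 8) - 1) u>> 5 = 7; the new lshr keeps the 'exact' flag, which is
; what this fix propagates
define i8 @after(i8 %x) {
  %c = lshr exact i8 %x, 3
  %b = and i8 %c, 7
  ret i8 %b
}
```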
* [Coroutines] Add header guard to header that's missing one. | Benjamin Kramer | 2017-01-30 | 1 | -0/+5
  llvm-svn: 293494
* [Inliner] Fold analysis remarks into missed remarks | Adam Nemet | 2017-01-30 | 1 | -15/+12
  This significantly reduces the noise level of these messages.
  llvm-svn: 293492
* [Inliner] Fix a comment to match the code. NFC. | Haicheng Wu | 2017-01-30 | 1 | -2/+2
  TotalAltCost => TotalSecondaryCost
  Differential Revision: https://reviews.llvm.org/D29231
  llvm-svn: 293490
* [InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants | Sanjay Patel | 2017-01-30 | 1 | -25/+25
  llvm-svn: 293489
* Revert "[MemorySSA] Revert r293361 and r293363, as the tests fail under asan."Daniel Berlin2017-01-302-15/+34
| | | | | | | This reverts commit r293471, reapplying r293361 and r293363 with a fix for an out-of-bounds read. llvm-svn: 293474
* [MemorySSA] Revert r293361 and r293363, as the tests fail under asan.Sam McCall2017-01-302-27/+12
| | | | llvm-svn: 293471
* [LoopVectorize] Improve getVectorCallCost() getScalarizationOverhead() call. | Jonas Paulsson | 2017-01-30 | 1 | -19/+8
  By calling getScalarizationOverhead with the CallInst instead of the types of its
  arguments, we make sure that only unique call arguments are added to the scalarization
  cost. getScalarizationOverhead() is extended to handle calls by only passing on the
  actual call arguments (which is not all the operands). This also eliminates a wrapper
  function with the same name.
  review: Hal Finkel
  llvm-svn: 293459
* [MemorySSA] Correct an assertion by surrounding it with parentheses. | Davide Italiano | 2017-01-30 | 1 | -3/+2
  llvm-svn: 293453
* [InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats | Sanjay Patel | 2017-01-29 | 1 | -17/+17
  llvm-svn: 293435
* NewGVN: Fix where newline is printed in debug printing of memory equivalence | Daniel Berlin | 2017-01-29 | 1 | -1/+1
  llvm-svn: 293428
* [ArgPromote] Move static helpers to modern LLVM naming conventions while here. NFC. | Chandler Carruth | 2017-01-29 | 1 | -15/+15
  Simple refactoring while prepping a port to the new PM.
  Differential Revision: https://reviews.llvm.org/D29249
  llvm-svn: 293426
* [ArgPromote] Run clang-format to normalize remarkably idiosyncratic formatting that has evolved here over the past years prior to making somewhat invasive changes to thread new PM support through the business logic. | Chandler Carruth | 2017-01-29 | 1 | -112/+121
  Differential Revision: https://reviews.llvm.org/D29248
  llvm-svn: 293425
* [ArgPromote] Re-arrange the code in a more typical, logical way. | Chandler Carruth | 2017-01-29 | 1 | -561/+547
  This arranges the static helpers in an order where they are defined prior to their use, to
  avoid the need for forward declarations, and collects the core pass components at the
  bottom below their helpers. This also folds one trivial function into the pass itself.
  Factoring this 'runImpl' was an attempt to help porting to the new pass manager; however,
  in my attempt to begin this port in earnest, it turned out to not be a substantial help.
  I think it will be easier to factor things without it.
  This is an NFC change and does a minimal amount of edits overall. Subsequent NFC cleanups
  will normalize the formatting with clang-format and improve the basic doxygen commenting.
  Differential Revision: https://reviews.llvm.org/D29247
  llvm-svn: 293424
* Remove inclusion of SSAUpdater from several passes. | Davide Italiano | 2017-01-29 | 3 | -3/+1
  It is, in fact, unused. Found while reviewing Danny's new SSAUpdater and porting passes to
  it to see how the new API looks.
  llvm-svn: 293407
* [PM] MLSM has been enabled for a while. Reclaim a cl::opt. | Davide Italiano | 2017-01-28 | 1 | -8/+2
  llvm-svn: 293401
* [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way. | Mohammad Shahid | 2017-01-28 | 1 | -57/+120
  The jumbled scalar loads will be sorted while building the tree, and these accesses will
  be marked to generate a shufflevector with the proper mask after the vectorized load.
  Reviewers: hfinkel, mssimpso, mkuper
  Differential Revision: https://reviews.llvm.org/D26905
  Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad
  llvm-svn: 293386
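Roughly, a run of consecutive loads whose results are used in a scrambled order can still be vectorized as one wide load followed by a lane-reordering shufflevector. The function below is a hand-written sketch of that output shape (not actual SLP output; the access order a[1], a[0], a[3], a[2] is invented):

```
define <4 x i32> @jumbled(i32* %a) {
  %vp    = bitcast i32* %a to <4 x i32>*
  %wide  = load <4 x i32>, <4 x i32>* %vp, align 4
  ; restore the original (jumbled) scalar order with the shuffle mask
  %reord = shufflevector <4 x i32> %wide, <4 x i32> undef,
                         <4 x i32> <i32 1, i32 0, i32 3, i32 2>
  ret <4 x i32> %reord
}
```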
* [InstCombine] Merge DebugLoc when speculatively hoisting store instruction | Taewook Oh | 2017-01-28 | 1 | -8/+11
  Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged
  when hoisting store instructions as well. Not sure if just dropping debug locations would
  make more sense for this case, but as the branch instruction will have at least a
  different discriminator from the hoisted store instruction, I think there will be no
  difference in practice.
  Reviewers: aprantl, andreadb, danielcdh
  Reviewed By: aprantl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29062
  llvm-svn: 293372
* Use print() instead of dump() in code | Matthias Braun | 2017-01-28 | 1 | -5/+2
  llvm-svn: 293371
* MemorySSA: Allow movement to arbitrary places | Daniel Berlin | 2017-01-28 | 1 | -1/+7
  Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places
  Reviewers: davide, george.burgess.iv
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29239
  llvm-svn: 293363