| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 293707
|
| |
|
|
| |
llvm-svn: 293706
|
| |
|
|
| |
llvm-svn: 293705
|
| |
|
|
| |
llvm-svn: 293704
|
| |
|
|
| |
llvm-svn: 293703
|
| |
|
|
|
|
|
| |
We can maybe preserve more but this is a first step.
Ack'ed by Danny on IRC.
llvm-svn: 293694
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
DebugLoc shouldn't be propagated across basic blocks to prevent incorrect stepping and imprecise sample profile result. rL288903 addressed the wrong DebugLoc propagation issue by limiting the copy of DebugLoc when GVN removes a fully redundant load that is dominated by some other load. However, DebugLoc is still incorrectly propagated in the following example:
```
1: extern int g;
2:
3: void foo(int x, int y, int z) {
4: if (x)
5: g = 0;
6: else
7: g = 1;
8:
9: int i = 0;
10: for ( ; i < y ; i++)
11: if (i > z)
12: g++;
13: }
```
Below is LLVM IR representation of the program before GVN:
```
@g = external local_unnamed_addr global i32, align 4
; Function Attrs: nounwind uwtable
define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 {
entry:
%not.tobool = icmp eq i32 %x, 0, !dbg !8
%.sink = zext i1 %not.tobool to i32, !dbg !8
store i32 %.sink, i32* @g, align 4, !tbaa !9
%cmp8 = icmp sgt i32 %y, 0, !dbg !13
br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17
for.body.preheader: ; preds = %entry
br label %for.body, !dbg !19
for.body: ; preds = %for.body.preheader, %for.inc
%i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
%cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
br i1 %cmp1, label %if.then2, label %for.inc, !dbg !21
if.then2: ; preds = %for.body
%0 = load i32, i32* @g, align 4, !dbg !22, !tbaa !9
%inc = add nsw i32 %0, 1, !dbg !22
store i32 %inc, i32* @g, align 4, !dbg !22, !tbaa !9
br label %for.inc, !dbg !23
for.inc: ; preds = %for.body, %if.then2
%inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
%exitcond = icmp ne i32 %inc4, %y, !dbg !13
br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17
for.end.loopexit: ; preds = %for.inc
br label %for.end, !dbg !26
for.end: ; preds = %for.end.loopexit, %entry
ret void, !dbg !26
}
```
where
```
!21 = !DILocation(line: 11, column: 9, scope: !15)
!22 = !DILocation(line: 12, column: 8, scope: !20)
!23 = !DILocation(line: 12, column: 7, scope: !20)
!24 = !DILocation(line: 10, column: 20, scope: !25)
```
And below is after GVN:
```
@g = external local_unnamed_addr global i32, align 4
define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr !dbg !4 {
entry:
%not.tobool = icmp eq i32 %x, 0, !dbg !8
%.sink = zext i1 %not.tobool to i32, !dbg !8
store i32 %.sink, i32* @g, align 4, !tbaa !9
%cmp8 = icmp sgt i32 %y, 0, !dbg !13
br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17
for.body.preheader: ; preds = %entry
br label %for.body, !dbg !19
for.body: ; preds = %for.inc, %for.body.preheader
%0 = phi i32 [ %1, %for.inc ], [ %.sink, %for.body.preheader ], !dbg !21
%i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ]
%cmp1 = icmp sgt i32 %i.09, %z, !dbg !19
br i1 %cmp1, label %if.then2, label %for.inc, !dbg !22
if.then2: ; preds = %for.body
%inc = add nsw i32 %0, 1, !dbg !21
store i32 %inc, i32* @g, align 4, !dbg !21, !tbaa !9
br label %for.inc, !dbg !23
for.inc: ; preds = %if.then2, %for.body
%1 = phi i32 [ %inc, %if.then2 ], [ %0, %for.body ]
%inc4 = add nuw nsw i32 %i.09, 1, !dbg !24
%exitcond = icmp ne i32 %inc4, %y, !dbg !13
br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17
for.end.loopexit: ; preds = %for.inc
br label %for.end, !dbg !26
for.end: ; preds = %for.end.loopexit, %entry
ret void, !dbg !26
}
```
As you see, GVN removes the load in if.then2 block and creates a phi instruction in for.body for it. The problem is that DebugLoc of remove load instruction is propagated to the newly created phi instruction, which is wrong. rL288903 cannot handle this case because ValuesPerBlock.size() is not 1 in this example when the load is removed.
Reviewers: aprantl, andreadb, wolfgangp
Reviewed By: andreadb
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D29254
llvm-svn: 293688
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29314
llvm-svn: 293661
|
| |
|
|
| |
llvm-svn: 293658
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.
Reviewers: xur, davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29040
llvm-svn: 293657
|
| |
|
|
| |
llvm-svn: 293652
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
transformToIndexedCompare
If they don't have the same type, the size of the constant
index would need to be adjusted (and this wouldn't be always
possible).
Alternatively we could try the analysis with the initial
RHS value, which would guarantee that the two sides have
the same type. However it is unlikely that in practice this
would pass our transformation requirements.
Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).
llvm-svn: 293629
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
rL293124 added the necessary infrastructure to properly add the cloned
top level loop to LoopInfo, which means we do not have to do it manually
in CloneLoopBlocks.
@mkuper sorry for not pointing this out during my review of D29156, I just
realized that today.
Reviewers: mzolotukhin, chandlerc, mkuper
Reviewed By: mkuper
Subscribers: llvm-commits, mkuper
Differential Revision: https://reviews.llvm.org/D29173
llvm-svn: 293615
|
| |
|
|
| |
llvm-svn: 293594
|
| |
|
|
| |
llvm-svn: 293593
|
| |
|
|
| |
llvm-svn: 293587
|
| |
|
|
| |
llvm-svn: 293584
|
| |
|
|
| |
llvm-svn: 293582
|
| |
|
|
| |
llvm-svn: 293579
|
| |
|
|
|
|
| |
with splat constants
llvm-svn: 293570
|
| |
|
|
|
|
| |
of dropping it.
llvm-svn: 293564
|
| |
|
|
|
|
| |
constants
llvm-svn: 293562
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
can call it for legality check.
Summary: SamplePGO needs to check if it is legal to promote a target before it actually promotes it.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29306
llvm-svn: 293559
|
| |
|
|
| |
llvm-svn: 293557
|
| |
|
|
|
|
|
|
|
|
| |
For targets with different addressing modes in each address space,
if this is dropped querying isLegalAddressingMode later with this
will give a nonsense result, breaking the isLegalUse assertions.
This is a candidate for the 4.0 release branch.
llvm-svn: 293542
|
| |
|
|
|
|
| |
vectors with splat constants
llvm-svn: 293524
|
| |
|
|
|
|
| |
will turn into unreachable when it runs
llvm-svn: 293515
|
| |
|
|
| |
llvm-svn: 293508
|
| |
|
|
|
|
| |
with splat constants
llvm-svn: 293507
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r293196
Besides making things look nicer, ATM, we'd like to preserve analysis
more than we'd like to destroy the CFG. We'll probably revisit in the future
llvm-svn: 293501
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original shift is bigger, so this may qualify as 'obvious',
but here's an attempt at an Alive-based proof:
Name: exact
Pre: (C1 u< C2)
%a = shl i8 %x, C1
%b = lshr exact i8 %a, C2
=>
%c = lshr exact i8 %x, C2 - C1
%b = and i8 %c, ((1 << width(C1)) - 1) u>> C2
Optimization is correct!
llvm-svn: 293498
|
| |
|
|
| |
llvm-svn: 293494
|
| |
|
|
|
|
| |
This significantly reduces the noise level of these messages.
llvm-svn: 293492
|
| |
|
|
|
|
|
|
| |
TotalAltCost => TotalSecondaryCost
Differential Revision: https://reviews.llvm.org/D29231
llvm-svn: 293490
|
| |
|
|
| |
llvm-svn: 293489
|
| |
|
|
|
|
|
| |
This reverts commit r293471, reapplying r293361 and r293363 with a fix
for an out-of-bounds read.
llvm-svn: 293474
|
| |
|
|
| |
llvm-svn: 293471
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
By calling getScalarizationOverhead with the CallInst instead of the types of
its arguments, we make sure that only unique call arguments are added to the
scalarization cost.
getScalarizationOverhead() is extended to handle calls by only passing on the
actual call arguments (which is not all the operands).
This also eliminates a wrapper function with the same name.
review: Hal Finkel
llvm-svn: 293459
|
| |
|
|
| |
llvm-svn: 293453
|
| |
|
|
|
|
| |
with splats
llvm-svn: 293435
|
| |
|
|
| |
llvm-svn: 293428
|
| |
|
|
|
|
|
|
|
|
| |
here. NFC.
Simple refactoring while prepping a port to the new PM.
Differential Revision: https://reviews.llvm.org/D29249
llvm-svn: 293426
|
| |
|
|
|
|
|
|
|
|
| |
formatting that has evolved here over the past years prior to making
somewhat invasive changes to thread new PM support through the business
logic.
Differential Revision: https://reviews.llvm.org/D29248
llvm-svn: 293425
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This arranges the static helpers in an order where they are defined
prior to their use to avoid the need of forward declarations, and
collect the core pass components at the bottom below their helpers.
This also folds one trivial function into the pass itself. Factoring
this 'runImpl' was an attempt to help porting to the new pass manager,
however in my attempt to begin this port in earnest it turned out to not
be a substantial help. I think it will be easier to factor things
without it.
This is an NFC change and does a minimal amount of edits over all.
Subsequent NFC cleanups will normalize the formatting with clang-format
and improve the basic doxygen commenting.
Differential Revision: https://reviews.llvm.org/D29247
llvm-svn: 293424
|
| |
|
|
|
|
|
|
| |
It is, in fact, unused. Found while reviewing Danny's new
SSAUpdater and porting passes to it to see how the new API
looked like.
llvm-svn: 293407
|
| |
|
|
| |
llvm-svn: 293401
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
non-consecutive (jumbled) way.
The jumbled scalar loads will be sorted while building the tree and these accesses will be marked to generate shufflevector after the vectorized load with proper mask.
Reviewers: hfinkel, mssimpso, mkuper
Differential Revision: https://reviews.llvm.org/D26905
Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad
llvm-svn: 293386
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged when hoisting store instructions as well. Not sure if just dropping debug locations would make more sense for this case, but as the branch instruction will have at least different discriminator with the hoisted store instruction, I think there will be no difference in practice.
Reviewers: aprantl, andreadb, danielcdh
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29062
llvm-svn: 293372
|
| |
|
|
| |
llvm-svn: 293371
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places
Reviewers: davide, george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29239
llvm-svn: 293363
|