| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
This folds a simple loop tail into a loop latch. It covers the common (in fortran) case of postincrement loops. It's a "free" way to expose this type of loop to downstream loop optimizations that bail out on non-canonical loops (getLoopLatch is a heavily used check).
llvm-svn: 150439
|
|
|
|
|
|
| |
subsumed by extractvalue.
llvm-svn: 133247
|
|
|
|
|
|
|
| |
syntax and has been long obsolete. As usual, updating the tests is the nasty
part of this.
llvm-svn: 133242
|
|
|
|
|
|
| |
indirectbr.
llvm-svn: 129203
|
|
|
|
|
|
| |
that is modified inside loop.
llvm-svn: 125529
|
|
|
|
| |
llvm-svn: 123220
|
|
|
|
|
|
|
| |
neccesarily an uncond branch to the header. This fixes
PR8955 (the assertion tripping).
llvm-svn: 123219
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to be foldable into an uncond branch. When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.
Handle this case more aggressively. For example, previously on
phi-duplicate.ll we would get this:
define void @test(i32 %N, double* %G) nounwind ssp {
entry:
%cmp1 = icmp slt i64 1, 1000
br i1 %cmp1, label %bb.nph, label %for.end
bb.nph: ; preds = %entry
br label %for.body
for.body: ; preds = %bb.nph, %for.cond
%j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
%arrayidx = getelementptr inbounds double* %G, i64 %j.02
%tmp3 = load double* %arrayidx
%sub = sub i64 %j.02, 1
%arrayidx6 = getelementptr inbounds double* %G, i64 %sub
%tmp7 = load double* %arrayidx6
%add = fadd double %tmp3, %tmp7
%arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
store double %add, double* %arrayidx10
%inc = add nsw i64 %j.02, 1
br label %for.cond
for.cond: ; preds = %for.body
%cmp = icmp slt i64 %inc, 1000
br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
for.cond.for.end_crit_edge: ; preds = %for.cond
br label %for.end
for.end: ; preds = %for.cond.for.end_crit_edge, %entry
ret void
}
Now we get the much nicer:
define void @test(i32 %N, double* %G) nounwind ssp {
entry:
br label %for.body
for.body: ; preds = %entry, %for.body
%j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
%arrayidx = getelementptr inbounds double* %G, i64 %j.01
%tmp3 = load double* %arrayidx
%sub = sub i64 %j.01, 1
%arrayidx6 = getelementptr inbounds double* %G, i64 %sub
%tmp7 = load double* %arrayidx6
%add = fadd double %tmp3, %tmp7
%arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
store double %add, double* %arrayidx10
%inc = add nsw i64 %j.01, 1
%cmp = icmp slt i64 %inc, 1000
br i1 %cmp, label %for.body, label %for.end
for.end: ; preds = %for.body
ret void
}
With all of these recent changes, we are now able to compile:
void foo(char *X) {
for (int i = 0; i != 100; ++i)
for (int j = 0; j != 100; ++j)
X[j+i*100] = 0;
}
into a single memset of 10000 bytes. This series of changes
should also be helpful for other nested loop scenarios as well.
llvm-svn: 123079
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Rip out LoopRotate's domfrontier updating code. It isn't
needed now that LICM doesn't use DF and it is super complex
and gross.
2. Make DomTree updating code a lot simpler and faster. The
old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
SplitCriticalEdge instead of doing an overcomplex
reimplementation of it.
No behavior change, except for the name of the inserted preheader.
llvm-svn: 123072
|
|
|
|
|
|
|
|
|
|
|
|
| |
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.
Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.
llvm-svn: 123060
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the duplicated block instead of duplicating them.
Duplicating them into the end of the loop and the preheader
means that we got a phi node in the header of the loop,
which prevented LICM from hoisting them. GVN would
usually come around later and merge the duplicated
instructions so we'd get reasonable output... except that
anything dependent on the shoulda-been-hoisted value can't
be hoisted. In PR5319 (which this fixes), a memory value
didn't get promoted.
llvm-svn: 113134
|
|
|
|
|
|
|
| |
loop, making the resulting loop significantly less ugly. Also, zap
its trivial PHI nodes, since it's easy.
llvm-svn: 111255
|
|
|
|
| |
llvm-svn: 106954
|
|
|
|
| |
llvm-svn: 92740
|
|
|
|
|
|
|
|
|
|
| |
'GetValueInMiddleOfBlock' case, instead of inserting
duplicates.
A similar fix is almost certainly needed by the machine-level
SSAUpdate implementation.
llvm-svn: 91820
|
|
|
|
|
|
|
|
|
|
|
|
| |
it may be used in contexts where preheader insertion may have failed due
to an indirectbr.
Make LoopSimplify's LoopSimplify::SeparateNestedLoop properly fail in
the case that it would require splitting an indirectbr edge.
These fix PR5502.
llvm-svn: 89484
|
|
|
|
|
|
|
| |
-verify-dom-info and -verify-loop-info, which enable additional
(expensive) consistency checks.
llvm-svn: 85017
|
|
|
|
| |
llvm-svn: 83002
|
|
|
|
|
|
|
|
| |
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename.
llvm-svn: 81537
|
|
|
|
| |
llvm-svn: 81257
|
|
|
|
|
|
| |
of using llvm-as, now that opt supports this.
llvm-svn: 81226
|
|
|
|
| |
llvm-svn: 69867
|
|
|
|
|
|
| |
handling the flaw inherent in that assumption. :)
llvm-svn: 62984
|
|
|
|
| |
llvm-svn: 51349
|
|
|
|
|
|
| |
renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too.
llvm-svn: 51328
|
|
|
|
| |
llvm-svn: 50548
|
|
|
|
|
|
|
|
| |
of -std-compile-opts and is now failing because other passes are generating
IR that looks different to input of loop rotate. Devang, please
introduce a testcase that only runs loop rotate.
llvm-svn: 50136
|
|
|
|
| |
llvm-svn: 48762
|
|
|
|
|
|
|
|
|
| |
OnlyReadsMemoryFns tables are dead! We
get more, and more accurate, information
from gcc via the readnone and readonly
function attributes.
llvm-svn: 44288
|
|
|
|
| |
llvm-svn: 40629
|
|
|
|
| |
llvm-svn: 39768
|
|
|
|
| |
llvm-svn: 37801
|
|
|
|
| |
llvm-svn: 36323
|
|
|
|
|
|
|
|
| |
Remove && from the end of the lines to prevent tests from throwing run
lines into the background. Also, clean up places where the same command
is run multiple times by using a temporary file.
llvm-svn: 36142
|
|
|
|
|
|
| |
Upgrade to use new Tcl exec based test harness.
llvm-svn: 36062
|
|
|
|
| |
llvm-svn: 35919
|
|
|
|
|
|
|
|
| |
global variables that needed to be passed in. This makes it possible to
add new global variables with only a couple changes (Makefile and llvm-dg.exp)
instead of touching every single dg.exp file.
llvm-svn: 35918
|
|
|
|
| |
llvm-svn: 35865
|
|
|
|
| |
llvm-svn: 35849
|
|
llvm-svn: 35838
|