| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 123514
|
| |
|
|
|
|
| |
to use it.
llvm-svn: 123501
|
| |
|
|
| |
llvm-svn: 123480
|
| |
|
|
|
|
| |
bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp
llvm-svn: 123477
|
| |
|
|
| |
llvm-svn: 123457
|
| |
|
|
|
|
|
| |
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.
llvm-svn: 123456
|
| |
|
|
|
|
| |
and one that uses SSAUpdater (-scalarrepl-ssa)
llvm-svn: 123436
|
| |
|
| |
instead of DomTree/DomFrontier. This may be interesting for reducing compile
time. This is currently disabled, but seems to work just fine.
When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.
llvm-svn: 123434
|
| |
|
|
| |
llvm-svn: 123426
|
| |
|
| |
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example, if X is 2 then the top two bits must be equal, so the result
cannot be an arbitrary value. I fixed this in the constant folder as well. Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero. This is in accordance with the LangRef, but I must
admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.
llvm-svn: 123417
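To make the "undef >>a X" point concrete, here is a minimal sketch that models an 8-bit arithmetic shift right and enumerates every result of shifting by 2. The helper `ashr` and the 8-bit width are assumptions chosen for illustration; they are not taken from the LLVM sources.

```python
def ashr(value, amount, width=8):
    """Arithmetic shift right of a `width`-bit pattern, returned as unsigned."""
    mask = (1 << width) - 1
    value &= mask
    if value & (1 << (width - 1)):   # sign bit set: reinterpret as negative
        value -= 1 << width
    return (value >> amount) & mask  # Python's >> on ints is arithmetic


# Treat every 8-bit input as one possible value of undef.
reachable = {ashr(v, 2) for v in range(256)}

# A pattern whose top two bits differ can never be produced.
unreachable_example = 0b01000000
```

Only a quarter of the 8-bit patterns are reachable, so folding the shift to undef would allow results the shift can never produce.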
|
| |
|
|
| |
llvm-svn: 123396
|
| |
|
|
| |
llvm-svn: 123383
|
| |
|
| |
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations. Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type. The corresponding return
types declared in the arm_neon.h header have equivalent arrays. We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type. SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array. So, all that needs to be done is to check
for compatible arrays and homogeneous structs.
llvm-svn: 123381
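The "compatible arrays and homogeneous structs" check can be pictured with a small sketch. Types are modelled as plain strings, and the names `is_homogeneous_struct` and `struct_matches_array` are hypothetical; the real pass works on LLVM type objects and may apply further conditions.

```python
def is_homogeneous_struct(element_types):
    """True if the struct is non-empty and every element has the same type."""
    return len(element_types) > 0 and all(
        t == element_types[0] for t in element_types
    )


def struct_matches_array(struct_elements, array_type):
    """Compare a struct (list of element types) with an array (type, count)."""
    element_type, count = array_type
    return (
        is_homogeneous_struct(struct_elements)
        and len(struct_elements) == count
        and struct_elements[0] == element_type
    )


# Hypothetical NEON-like case: two vectors returned as a struct,
# while the header declares an equivalent two-element array.
neon_struct = ["<4 x i32>", "<4 x i32>"]
neon_array = ("<4 x i32>", 2)
```

Under this model, `struct_matches_array(neon_struct, neon_array)` holds, which is the situation where splitting into separate vectors is safe regardless of which of the two types is used for the access.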
|
| |
|
|
|
|
|
| |
SROA only splits up structs and arrays one level at a time, so padding can
only cause trouble if it is located in between the struct or array elements.
llvm-svn: 123380
|
| |
|
|
| |
llvm-svn: 123318
|
| |
|
|
| |
llvm-svn: 123302
|
| |
|
|
|
|
| |
of the bootstrap miscompare issue.
llvm-svn: 123299
|
| |
|
|
|
|
| |
the source of the bootstrap problem.
llvm-svn: 123298
|
| |
|
|
| |
llvm-svn: 123288
|
| |
|
|
|
|
|
|
|
| |
DT->changeImmediateDominator() trivially ignores identity updates, so there is
really no need for the uniqueing provided by SmallPtrSet.
I expect this to fix PR8954.
llvm-svn: 123286
|
| |
|
|
|
|
|
|
| |
dominators once at the beginning of GVN instead of once per iteration.
llvm-svn: 123278
|
| |
|
|
|
|
| |
x86-64 Linux.
llvm-svn: 123270
|
| |
|
|
| |
llvm-svn: 123248
|
| |
|
|
| |
llvm-svn: 123247
|
| |
|
|
|
|
|
|
| |
a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect.
No functional change.
llvm-svn: 123234
|
| |
|
|
|
|
|
| |
actually reached in the testcase in PR8954, but it's safe and good
practice.
llvm-svn: 123224
|
| |
|
|
|
|
| |
is floating around in the ether.
llvm-svn: 123223
|
| |
|
| |
phi nodes. It is called from MergeBlockIntoPredecessor which is
called from GVN, which claims to preserve these.
I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.
llvm-svn: 123222
|
| |
|
|
| |
llvm-svn: 123221
|
| |
|
|
|
|
|
| |
necessarily an uncond branch to the header. This fixes
PR8955 (the assertion tripping).
llvm-svn: 123219
|
| |
|
|
|
|
|
|
| |
determining which bits are demanded by
a comparison against a constant.
llvm-svn: 123203
|
| |
|
|
|
|
| |
intrinsics element dependencies. Reviewed by Nick.
llvm-svn: 123161
|
| |
|
|
| |
llvm-svn: 123149
|
| |
|
|
|
|
| |
back to life.
llvm-svn: 123146
|
| |
|
|
|
|
| |
buildbot stability.
llvm-svn: 123144
|
| |
|
|
|
|
|
|
| |
without informing memdep. This could cause nondeterministic weirdness
based on where instructions happen to get allocated, and will hopefully
breathe some life into some broken testers.
llvm-svn: 123124
|
| |
|
|
| |
llvm-svn: 123121
|
| |
|
|
| |
llvm-svn: 123117
|
| |
|
|
|
|
| |
that have the bit set.
llvm-svn: 123104
|
| |
|
|
|
|
|
| |
updating memdep when fusing stores together. This fixes the crash optimizing
the bullet benchmark.
llvm-svn: 123091
|
| |
|
|
| |
llvm-svn: 123090
|
| |
|
| |
larger memsets. Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:
_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
        pushq   %rax
        movl    $4096, %esi             ## imm = 0x1000
        callq   ___bzero
        popq    %rax
        ret
llvm-svn: 123089
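As a rough illustration of merging memsets into a larger one, the sketch below coalesces fills that are exactly adjacent and write the same byte value. The function `merge_adjacent_fills` and the (offset, size, value) tuples are assumptions made for this example. It ignores intervening instructions and aliasing, which an actual optimization pass has to account for.

```python
def merge_adjacent_fills(fills):
    """Merge (offset, size, byte_value) fills that touch and share a value."""
    merged = []
    for offset, size, value in sorted(fills):
        if merged:
            prev_offset, prev_size, prev_value = merged[-1]
            # Merge only when the previous fill ends exactly where this starts.
            if prev_value == value and prev_offset + prev_size == offset:
                merged[-1] = (prev_offset, prev_size + size, value)
                continue
        merged.append((offset, size, value))
    return merged


# Hypothetical input: two 2048-byte zero fills over neighbouring memory.
two_halves = [(2048, 2048, 0), (0, 2048, 0)]
```

With this input the result is a single 4096-byte fill starting at offset 0; fills separated by a gap, or writing different values, are left alone.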
|
| |
|
|
|
|
| |
P and P+1 are relative to the same base pointer.
llvm-svn: 123087
|
| |
|
|
|
|
| |
memset into a single larger memset.
llvm-svn: 123086
|
| |
|
|
|
|
|
| |
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.
llvm-svn: 123081
|
| |
|
| |
to be foldable into an uncond branch. When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.
Handle this case more aggressively. For example, previously on
phi-duplicate.ll we would get this:
define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end
bb.nph: ; preds = %entry
  br label %for.body
for.body: ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond
for.cond: ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
for.cond.for.end_crit_edge: ; preds = %for.cond
  br label %for.end
for.end: ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}
Now we get the much nicer:
define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body
for.body: ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end
for.end: ; preds = %for.body
  ret void
}
With all of these recent changes, we are now able to compile:
void foo(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j+i*100] = 0;
}
into a single memset of 10000 bytes. This series of changes
should also be helpful for other nested loop scenarios.
llvm-svn: 123079
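Why the nested loop above is equivalent to one 10000-byte memset can be checked with a small sketch: the index expression j+i*100 visits every offset from 0 to 9999 exactly once and in order. The function names and the use of a bytearray as a stand-in for the char buffer are assumptions for illustration only.

```python
def nested_loop_indices(rows=100, cols=100):
    """Offsets written by the nested loop, in the order they are written."""
    return [j + i * cols for i in range(rows) for j in range(cols)]


def zero_with_loops(buf, rows=100, cols=100):
    """Byte-at-a-time zeroing, mirroring the nested C loop."""
    for i in range(rows):
        for j in range(cols):
            buf[j + i * cols] = 0
    return buf


def zero_with_single_fill(buf, length):
    """One contiguous fill, standing in for a single memset call."""
    buf[0:length] = bytes(length)
    return buf
```

Both functions leave a buffer in the same state, including any bytes past offset 9999, which neither of them touches.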
|
| |
|
|
|
|
|
| |
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.
llvm-svn: 123077
|
| |
|
| |
that it was leaving in loops after rotation (between the original latch
block and the original header).
With this change, it is possible for rotated loops to have just a single
basic block, which is useful.
llvm-svn: 123075
|
| |
|
|
|
|
| |
loop info.
llvm-svn: 123074
|
| |
|
|
| |
llvm-svn: 123073
|