- llvm-svn: 123152
- PathV2::fs::exists.
  llvm-svn: 123151
- llvm-svn: 123149
- back to life.
  llvm-svn: 123146
- llvm-svn: 123145
- buildbot stability.
  llvm-svn: 123144
- llvm-svn: 123142
- llvm-svn: 123141
- llvm-svn: 123139
- NUW AddRecs much more aggressively. We now get a trip count
  for @test2 in nsw.ll.
  llvm-svn: 123138
- llvm-svn: 123136
- perform rounding other than truncation in the IR. Common C code
  for this really turns into an LLVM intrinsic call that blocks a
  lot of further optimizations.
  llvm-svn: 123135
- a + {b,+,stride} into {a+b,+,stride} (because a is LIV),
  then the resultant AddRec is NUW/NSW if the client says it is.
  llvm-svn: 123133
- void f(int* begin, int* end) { std::fill(begin, end, 0); }
  which turns into a != exit expression where one pointer is
  strided and (thanks to step #1) known not to overflow, and the
  other is loop invariant.
  The observation here is that, though the IV is strided by 4 in
  this case, the IV *has* to become equal to the end value. It
  cannot "miss" the end value by stepping over it, because if it
  did, the strided IV expression would eventually wrap around.
  Handle this by turning A != B into "A-B != 0", where the A-B
  part is known to be NUW.
  llvm-svn: 123131
- when no virtual registers have been allocated.
  It was only used to resize IndexedMaps, so provide an
  IndexedMap::resize() method such that
    Map.grow(MRI.getLastVirtReg());
  can be replaced with the simpler
    Map.resize(MRI.getNumVirtRegs());
  This works correctly when no virtuals are allocated, and it
  bypasses the to/from index conversions.
  llvm-svn: 123130
- llvm-svn: 123129
- physical register numbers.
  This makes the hack used in LiveInterval official, and lets
  LiveInterval be oblivious of stack slots.
  The isPhysicalRegister() and isVirtualRegister() predicates
  don't know about this, so when a variable may contain a stack
  slot, isStackSlot() should always be tested first.
  llvm-svn: 123128
- llvm-svn: 123126
- without informing memdep. This could cause nondeterministic
  weirdness based on where instructions happen to get allocated,
  and will hopefully breathe some life into some broken testers.
  llvm-svn: 123124
- llvm-svn: 123123
- llvm-svn: 123121
- llvm-svn: 123117
- llvm-svn: 123116
- llvm-svn: 123115
- llvm-svn: 123114
- Also, switch to a clearer 'sink' function with its declaration
  to avoid any confusion about 'g'. Thanks for the suggestion,
  Frits.
  llvm-svn: 123113
- llvm-svn: 123112
- llvm-svn: 123111
- of using a Location class with the same information.
  When making a copy of a MachineOperand that was already stored
  in a MachineInstr, it is necessary to clear the parent pointer
  on the copy; otherwise the register use-def lists become
  inconsistent.
  Add MachineOperand::clearParent() to do that. An alternative
  would be a custom MachineOperand copy constructor that cleared
  ParentMI. I didn't want to do that because of the performance
  impact.
  llvm-svn: 123109
- llvm-svn: 123108
- without a TRI instance.
  Print virtual registers numbered from 0 instead of the
  arbitrary FirstVirtualRegister. The first virtual register is
  printed as %vreg0. TRI::NoRegister is printed as %noreg.
  llvm-svn: 123107
- llvm-svn: 123106
- with GEP instructions are always NUW, because PHIs cannot wrap
  the end of the address space.
  llvm-svn: 123105
- that have the bit set.
  llvm-svn: 123104
- llvm-svn: 123103
- llvm-svn: 123102
- depending on TRI::FirstVirtualRegister.
  Also use TRI::printReg instead of printing virtual registers
  directly.
  llvm-svn: 123101
- virtual registers.
  llvm-svn: 123100
- Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions
  to allow iteration over virtual registers without depending on
  the representation of virtual register numbers.
  llvm-svn: 123098
- TargetRegisterInfo::FirstVirtualRegister.
  llvm-svn: 123096
- llvm-svn: 123093
- updating memdep when fusing stores together. This fixes the
  crash optimizing the bullet benchmark.
  llvm-svn: 123091
- llvm-svn: 123090
- larger memsets. Among other things, this fixes rdar://8760394
  and allows us to handle "Example 2" from
  http://blog.regehr.org/archives/320, compiling it into a single
  4096-byte memset:

    _mad_synth_mute:          ## @mad_synth_mute
    ## BB#0:                  ## %entry
        pushq %rax
        movl  $4096, %esi     ## imm = 0x1000
        callq ___bzero
        popq  %rax
        ret

  llvm-svn: 123089
- P and P+1 are relative to the same base pointer.
  llvm-svn: 123087
- memset into a single larger memset.
  llvm-svn: 123086
- Split memset formation logic out into its own
  "tryMergingIntoMemset" helper function.
  llvm-svn: 123081
- to be foldable into an uncond branch. When this happens, we can
  make a much simpler CFG for the loop, which is important for
  nested loop cases where we want the outer loop to be
  aggressively optimized.
  Handle this case more aggressively. For example, previously on
  phi-duplicate.ll we would get this:

    define void @test(i32 %N, double* %G) nounwind ssp {
    entry:
      %cmp1 = icmp slt i64 1, 1000
      br i1 %cmp1, label %bb.nph, label %for.end
    bb.nph:                                ; preds = %entry
      br label %for.body
    for.body:                              ; preds = %bb.nph, %for.cond
      %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
      %arrayidx = getelementptr inbounds double* %G, i64 %j.02
      %tmp3 = load double* %arrayidx
      %sub = sub i64 %j.02, 1
      %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
      %tmp7 = load double* %arrayidx6
      %add = fadd double %tmp3, %tmp7
      %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
      store double %add, double* %arrayidx10
      %inc = add nsw i64 %j.02, 1
      br label %for.cond
    for.cond:                              ; preds = %for.body
      %cmp = icmp slt i64 %inc, 1000
      br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
    for.cond.for.end_crit_edge:            ; preds = %for.cond
      br label %for.end
    for.end:                               ; preds = %for.cond.for.end_crit_edge, %entry
      ret void
    }

  Now we get the much nicer:

    define void @test(i32 %N, double* %G) nounwind ssp {
    entry:
      br label %for.body
    for.body:                              ; preds = %entry, %for.body
      %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
      %arrayidx = getelementptr inbounds double* %G, i64 %j.01
      %tmp3 = load double* %arrayidx
      %sub = sub i64 %j.01, 1
      %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
      %tmp7 = load double* %arrayidx6
      %add = fadd double %tmp3, %tmp7
      %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
      store double %add, double* %arrayidx10
      %inc = add nsw i64 %j.01, 1
      %cmp = icmp slt i64 %inc, 1000
      br i1 %cmp, label %for.body, label %for.end
    for.end:                               ; preds = %for.body
      ret void
    }

  With all of these recent changes, we are now able to compile:

    void foo(char *X) {
      for (int i = 0; i != 100; ++i)
        for (int j = 0; j != 100; ++j)
          X[j+i*100] = 0;
    }

  into a single memset of 10000 bytes. This series of changes
  should also be helpful for other nested loop scenarios.
  llvm-svn: 123079
- llvm-svn: 123078
- moving the OrigHeader block anymore: we just merge it away
  anyway, so its code layout doesn't matter.
  llvm-svn: 123077