LLVM scheduler commit log (most recent first)

* … (llvm-svn: 32320)
* …is 'unsigned'. (llvm-svn: 32279)
* … (llvm-svn: 31453)
* … (llvm-svn: 29911)
* …This reduces SelectionDAG time on kc++ from 5.43s to 4.98s (9%). More
  significantly, this speeds up the default PPC scheduler from ~1571ms to
  1063ms, a 33% speedup. (llvm-svn: 29743)
* … (llvm-svn: 29471)
* …2. Added an argument to the instruction scheduler creators so the
  creators can do special things.
  3. Repaired target hazard code.
  4. Misc. More to follow. (llvm-svn: 29450)
* … (llvm-svn: 29434)
* … (llvm-svn: 29220)
* … (llvm-svn: 28973)
* …non-deterministic. Returns NULL when it's empty! (llvm-svn: 28560)
* …TargetData.h. This should make recompiles a bit faster with my current
  TargetData tinkering. (llvm-svn: 28238)
* …separate file. Added an initial implementation of a top-down register
  pressure reduction list scheduler. (llvm-svn: 28226)
* … (llvm-svn: 28212)
* …the distance between the def and another use is much longer). This is
  under option control for now: "-sched-lower-defnuse". (llvm-svn: 28201)
* … (llvm-svn: 28117)
* …scheduler can go into a "vertical mode" (i.e. traversing up the
  two-address chain, etc.) when the register pressure is low. This does seem
  to reduce the number of spills in the cases I've looked at, but on x86
  there is no guarantee the performance of the code improves. It can be
  turned on with the -sched-vertically option. (llvm-svn: 28108)
* …the heuristic to further reduce spills for several test cases. (Note, it
  may not necessarily translate to a runtime win!) (llvm-svn: 28076)
* … (llvm-svn: 28035)
* …up the schedule. This helps code that looks like this:
    loads ...
    computations (first set) ...
    stores (first set) ...
    loads ...
    computations (second set) ...
    stores (second set) ...
  Without this change, the stores and computations are more likely to
  interleave:
    loads ...
    loads ...
    computations (first set) ...
    computations (second set) ...
    computations (first set) ...
    stores (first set) ...
    computations (second set) ...
    stores (second set) ...
  This can increase the number of spills if we are unlucky. (llvm-svn: 28033)
* … (llvm-svn: 28030)
* …performance regressions. (llvm-svn: 28029)
* …instructions to be emitted. Don't add one to the latency of a completed
  instruction if the latency of the op is 0. (llvm-svn: 26718)
* …predecessor to finish before they can start. (llvm-svn: 26717)
* …operands have all issued, but whose results are not yet available. This
  allows us to compile:
    int G;
    int test(int A, int B, int* P) {
      return (G+A)*(B+1);
    }
  to:
    _test:
            lis r2, ha16(L_G$non_lazy_ptr)
            addi r4, r4, 1
            lwz r2, lo16(L_G$non_lazy_ptr)(r2)
            lwz r2, 0(r2)
            add r2, r2, r3
            mullw r3, r2, r4
            blr
  instead of this, which has a stall between the lis/lwz:
    _test:
            lis r2, ha16(L_G$non_lazy_ptr)
            lwz r2, lo16(L_G$non_lazy_ptr)(r2)
            addi r4, r4, 1
            lwz r2, 0(r2)
            add r2, r2, r3
            mullw r3, r2, r4
            blr
  (llvm-svn: 26716)
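The idea above, letting instructions whose operands have issued sit in a pending set until their results become available so independent work can fill the stall, can be sketched as a toy cycle-accurate list scheduler. This is an illustrative sketch only, not LLVM's implementation; the `Node` struct, its fields, and `schedule` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy dependence-graph node; the field names are hypothetical.
struct Node {
    int latency;             // cycles until this node's result is available
    std::vector<int> users;  // indices of nodes that consume the result
    int pendingPreds;        // data dependences not yet satisfied
};

// Cycle-accurate list scheduling with a "pending" set: an issued node's
// users become ready only once its latency has elapsed, so independent
// work can be issued into the stall instead.
std::vector<int> schedule(std::vector<Node>& nodes) {
    using Event = std::pair<int, int>;  // (cycle result is available, node)
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> pending;
    std::vector<int> ready, order;
    for (int i = 0; i < (int)nodes.size(); ++i)
        if (nodes[i].pendingPreds == 0) ready.push_back(i);

    int cycle = 0;
    while (order.size() < nodes.size()) {
        // Results whose latency has elapsed release their users.
        while (!pending.empty() && pending.top().first <= cycle) {
            int done = pending.top().second;
            pending.pop();
            for (int u : nodes[done].users)
                if (--nodes[u].pendingPreds == 0) ready.push_back(u);
        }
        if (!ready.empty()) {
            int n = ready.back();
            ready.pop_back();
            order.push_back(n);
            pending.push({cycle + nodes[n].latency, n});
        }
        ++cycle;  // advance even when nothing is ready (a stall)
    }
    return order;
}
```

A real scheduler would pick from `ready` by priority rather than taking the back element; the point here is only the pending/ready split.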
* …which cycle it lands on. (llvm-svn: 26714)
* … (llvm-svn: 26713)
* …is together, and direction-independent code is together. (llvm-svn: 26712)
* …merge succs/chainsuccs -> succs set. This has no functionality change,
  simplifies the code, and reduces the size of SUnits. (llvm-svn: 26711)
* … (llvm-svn: 26690)
* … (llvm-svn: 26687)
* … (llvm-svn: 26686)
* … (llvm-svn: 26684)
* … (llvm-svn: 26683)
* … (llvm-svn: 26682)
* …keep track of a sense of "mobility", i.e. how many other nodes scheduling
  one node will free up. For something like this:
    float testadd(float *X, float *Y, float *Z, float *W, float *V) {
      return (*X+*Y)*(*Z+*W)+*V;
    }
  this makes us schedule *X then *Y, not *X then *Z. The former allows us to
  issue the add; the latter only lets us issue other loads. This turns the
  above code from this:
    _testadd:
            lfs f0, 0(r3)
            lfs f1, 0(r6)
            lfs f2, 0(r4)
            lfs f3, 0(r5)
            fadds f0, f0, f2
            fadds f1, f3, f1
            lfs f2, 0(r7)
            fmadds f1, f0, f1, f2
            blr
  into this:
    _testadd:
            lfs f0, 0(r6)
            lfs f1, 0(r5)
            fadds f0, f1, f0
            lfs f1, 0(r4)
            lfs f2, 0(r3)
            fadds f1, f2, f1
            lfs f2, 0(r7)
            fmadds f1, f1, f0, f2
            blr
  (llvm-svn: 26680)
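A minimal sketch of the mobility idea described above: among the ready nodes, prefer the one whose scheduling makes the most other nodes ready (e.g. the second operand of an add, so the add itself frees up). The `SUnit` layout and the `mobility`/`pickNode` helpers here are hypothetical, not LLVM's actual data structures.

```cpp
#include <cassert>
#include <vector>

// Toy scheduling unit; fields are illustrative, not LLVM's SUnit layout.
struct SUnit {
    std::vector<int> succs;  // nodes that depend on this one
    int numPredsLeft;        // predecessors not yet scheduled
};

// How many successors of `n` would become ready if `n` were scheduled now,
// i.e. how many nodes `n` is the last missing predecessor of.
int mobility(const std::vector<SUnit>& units, int n) {
    int freed = 0;
    for (int s : units[n].succs)
        if (units[s].numPredsLeft == 1) ++freed;
    return freed;
}

// Among the ready nodes, prefer the one that frees up the most work.
int pickNode(const std::vector<SUnit>& units, const std::vector<int>& ready) {
    int best = ready[0];
    for (int n : ready)
        if (mobility(units, n) > mobility(units, best)) best = n;
    return best;
}
```

In the testadd example, once *Y has been scheduled, *X has mobility 1 (it releases the first add) while *Z has mobility 0 (its add still waits on *W), so *X is picked.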
* …priority impls that want to be notified when a node is scheduled.
  (llvm-svn: 26678)
* … (llvm-svn: 26676)
* … (llvm-svn: 26646)
* … (llvm-svn: 26637)
* …latency priority function. (llvm-svn: 26636)
* …Only enable this with -use-sched-latencies; I'll enable it by default
  with a clean nightly tester run tonight. PPC is the only target that
  provides latency info currently. (llvm-svn: 26634)
* … (llvm-svn: 26632)
* … (llvm-svn: 26631)
* …class, sever its implementation from the interface. Now we can provide
  new implementations of the same interface (priority computation) without
  touching the scheduler itself. (llvm-svn: 26630)
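The interface split described above can be sketched like this: the scheduler depends only on an abstract priority-queue interface, so a new priority policy is just a new subclass. The names below loosely follow the commit's SchedulingPriorityQueue idea but are illustrative, not LLVM's actual API.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Abstract priority interface the scheduler programs against.
struct SchedulingPriorityQueue {
    virtual ~SchedulingPriorityQueue() = default;
    virtual void push(int node) = 0;
    virtual int pop() = 0;
    virtual bool empty() const = 0;
};

// One interchangeable policy: lowest node number first (a stand-in for,
// say, a latency-based priority).
struct LowestFirstQueue : SchedulingPriorityQueue {
    std::priority_queue<int, std::vector<int>, std::greater<int>> q;
    void push(int node) override { q.push(node); }
    int pop() override {
        int n = q.top();
        q.pop();
        return n;
    }
    bool empty() const override { return q.empty(); }
};

// The scheduler side only ever sees the interface, so swapping the
// priority computation never touches this code.
std::vector<int> drain(SchedulingPriorityQueue& pq) {
    std::vector<int> order;
    while (!pq.empty()) order.push_back(pq.pop());
    return order;
}
```

Swapping in a different priority computation means writing another subclass and handing it to the same `drain`-style scheduler loop.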
* …of the ScheduleDAGList class into a new SchedulingPriorityQueue class.
  (llvm-svn: 26613)
* … (llvm-svn: 26612)
* … (llvm-svn: 26611)
* … (llvm-svn: 26609)
* … (llvm-svn: 26608)