| Commit message | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 26779
|
| |
|
|
| |
llvm-svn: 26777
|
| |
|
|
| |
llvm-svn: 26776
|
| |
|
|
|
|
|
|
|
| |
offset,
and the offset lands at a field boundary in the old type, construct a new type,
copying the fields masked by the offset from the old type, and unify with that.
llvm-svn: 26775
|
| |
|
|
| |
llvm-svn: 26774
|
| |
|
|
| |
llvm-svn: 26764
|
| |
|
|
| |
llvm-svn: 26763
|
| |
|
|
| |
llvm-svn: 26762
|
| |
|
|
|
|
|
| |
2. Allow for user defined debug descriptors.
3. Allow for user augmented fields on debug descriptors.
llvm-svn: 26760
|
| |
|
|
| |
llvm-svn: 26758
|
| |
|
|
|
|
|
|
| |
A*A*B + A*A*C --> A*(A*B+A*C) --> A*(A*(B+C))
This implements Reassociate/mul-factor3.ll
llvm-svn: 26757
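As a rough illustration of the factoring above (not the Reassociate pass itself), the two forms can be written out in C and compared. The function names are hypothetical, and unsigned 32-bit arithmetic is assumed so that overflow wraps and the identity holds for every input.

```c
#include <stdint.h>

/* Unfactored form from the message: A*A*B + A*A*C
 * (four multiplies and one add as written). */
uint32_t mul_unfactored(uint32_t a, uint32_t b, uint32_t c) {
    return a * a * b + a * a * c;
}

/* Fully factored form: A*(A*(B+C))
 * (two multiplies and one add). */
uint32_t mul_factored(uint32_t a, uint32_t b, uint32_t c) {
    return a * (a * (b + c));
}
```

Both functions compute the same value modulo 2^32; the factored one simply does less work.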
|
| |
|
|
| |
llvm-svn: 26755
|
| |
|
|
| |
llvm-svn: 26754
|
| |
|
|
|
|
|
| |
(X<<1)+(Y<<1) -> (X+Y)<<1. This implements
Transforms/Reassociate/shift-factor.ll
llvm-svn: 26753
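The shift case can be checked the same way. This is only a sketch with hypothetical function names, again assuming unsigned 32-bit values so that bits shifted out of the top are discarded identically in both forms.

```c
#include <stdint.h>

/* Before: two shifts and one add. */
uint32_t shift_unfactored(uint32_t x, uint32_t y) {
    return (x << 1) + (y << 1);
}

/* After: one add and one shift. */
uint32_t shift_factored(uint32_t x, uint32_t y) {
    return (x + y) << 1;
}
```

A shift left by one is a multiply by two, so this is the same distributive rewrite applied to a shift.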
|
| |
|
|
| |
llvm-svn: 26748
|
| |
|
|
|
|
| |
2. Remove the declaration of llvm.dbg.declare.
llvm-svn: 26745
|
| |
|
|
| |
llvm-svn: 26743
|
| |
|
|
| |
llvm-svn: 26742
|
| |
|
|
| |
llvm-svn: 26741
|
| |
|
|
| |
llvm-svn: 26740
|
| |
|
|
|
|
| |
transformation decisions.
llvm-svn: 26738
|
| |
|
|
| |
llvm-svn: 26737
|
| |
|
|
| |
registers, and update it on entry to each function, then restore it on exit.
This compiles:
void func(vfloat *a, vfloat *b, vfloat *c) {
        *a = *b * *c + *c;
}
to this:
_func:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r5
        lvx v1, 0, r4
        vmaddfp v0, v1, v0, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr
GCC produces this (which has additional stack accesses):
_func:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc000
        mtspr 256,r0
        lvx v0,0,r5
        lvx v1,0,r4
        lwz r12,-4(r1)
        vmaddfp v0,v0,v1,v0
        stvx v0,0,r3
        mtspr 256,r12
        blr
llvm-svn: 26733
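To make the constant in the listing easier to read: SPR 256 is VRSAVE, and the `oris` with 49152 (0xC000) sets the two most significant bits, which matches v0 and v1 being the vector registers the function uses. The sketch below shows how such a mask could be built. The helper names are hypothetical and this is not the code generator's implementation; it assumes the usual AltiVec convention that the bit for vN is the Nth bit counted from the most significant end.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a VRSAVE-style mask from a list of vector register numbers.
 * Assumption: v0 maps to the most significant bit, v31 to the least. */
uint32_t vrsave_mask(const unsigned *regs, size_t n) {
    uint32_t mask = 0;
    for (size_t i = 0; i < n; ++i) {
        if (regs[i] < 32)
            mask |= UINT32_C(0x80000000) >> regs[i];
    }
    return mask;
}

/* Value written on function entry: the caller's bits plus our own.
 * On exit the saved caller value is written back unchanged. */
uint32_t vrsave_on_entry(uint32_t saved, uint32_t used_mask) {
    return saved | used_mask;
}
```

For registers v0 and v1 the mask is 0xC0000000, whose upper halfword is the 49152 seen in the `oris` instruction.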
|
| |
|
|
| |
llvm-svn: 26731
|
| |
|
|
|
|
| |
$(Echo) instead of @echo
llvm-svn: 26730
|
| |
|
|
| |
llvm-svn: 26729
|
| |
|
|
| |
llvm-svn: 26728
|
| |
|
|
|
|
|
|
| |
Regression/CodeGen/PowerPC/and_add.ll
a case that occurs with dynamic allocas of constant size.
llvm-svn: 26727
|
| |
|
|
| |
llvm-svn: 26725
|
| |
|
|
| |
llvm-svn: 26724
|
| |
|
|
|
|
|
|
|
| |
a select and FABS/FNEG.
This speeds up a trivial (aka stupid) copysign benchmark I wrote from 6.73s
to 2.64s, woo.
llvm-svn: 26723
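The start of this message is cut off, but it appears to describe expressing copysign as a select between FABS and FNEG of FABS. A minimal C sketch of that idea follows. The function name is hypothetical, and `fabs` and `signbit` from `<math.h>` stand in for the FABS node and the sign test.

```c
#include <math.h>

/* copysign(x, y) as a select:
 *   sign of y set   -> FNEG(FABS(x))
 *   sign of y clear -> FABS(x)        */
double copysign_via_select(double x, double y) {
    double mag = fabs(x);
    return signbit(y) ? -mag : mag;
}
```

Using `signbit` rather than `y < 0` keeps the result correct when `y` is negative zero.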
|
| |
|
|
| |
llvm-svn: 26722
|
| |
|
|
| |
llvm-svn: 26721
|
| |
|
|
| |
llvm-svn: 26720
|
| |
|
|
| |
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
type instead of a table in the .cpp file. Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
accurately.
llvm-svn: 26719
|
| |
|
|
| |
instructions
to be emitted.
Don't add one to the latency of a completed instruction if the latency of the
op is 0.
llvm-svn: 26718
|
| |
|
|
|
|
| |
predecessor to finish before they can start.
llvm-svn: 26717
|
| |
|
|
| |
operands have all issued, but whose results are not yet available. This
allows us to compile:
int G;
int test(int A, int B, int* P) {
        return (G+A)*(B+1);
}
to:
_test:
        lis r2, ha16(L_G$non_lazy_ptr)
        addi r4, r4, 1
        lwz r2, lo16(L_G$non_lazy_ptr)(r2)
        lwz r2, 0(r2)
        add r2, r2, r3
        mullw r3, r2, r4
        blr
instead of this, which has a stall between the lis/lwz:
_test:
        lis r2, ha16(L_G$non_lazy_ptr)
        lwz r2, lo16(L_G$non_lazy_ptr)(r2)
        addi r4, r4, 1
        lwz r2, 0(r2)
        add r2, r2, r3
        mullw r3, r2, r4
        blr
llvm-svn: 26716
|
| |
|
|
|
|
| |
which cycle it lands on.
llvm-svn: 26714
|
| |
|
|
| |
llvm-svn: 26713
|
| |
|
|
|
|
| |
is together, and direction independent code is together.
llvm-svn: 26712
|
| |
|
|
|
|
|
|
|
| |
merge succs/chainsuccs -> succs set
This has no functionality change, simplifies the code, and reduces the size
of sunits.
llvm-svn: 26711
|
| |
|
|
| |
llvm-svn: 26710
|
| |
|
|
| |
llvm-svn: 26709
|
| |
|
|
| |
llvm-svn: 26708
|
| |
|
|
|
|
|
|
|
|
| |
set construction, rather than intersecting various std::sets. This reduces
the memory usage for the testcase in PR681 from 496MB to 26MB of RAM on my
Darwin system, and reduces the runtime from 32.8 to 0.8 seconds on a
2.5GHz G5. This also enables future code sharing between Dom and PostDom
now that they share near-identical implementations.
llvm-svn: 26707
|
| |
|
|
| |
llvm-svn: 26705
|
| |
|
|
|
|
| |
off the result string at the first null terminator.
llvm-svn: 26704
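The message is truncated, so the function it changes is not known here. As a generic sketch of cutting a result off at the first null terminator, one could compute the kept length like this (the helper name is hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* Return how many bytes of buf[0..len) precede the first '\0'.
 * If the buffer contains no '\0', the whole length is kept. */
size_t length_to_first_nul(const char *buf, size_t len) {
    const char *nul = memchr(buf, '\0', len);
    return nul ? (size_t)(nul - buf) : len;
}
```

A caller would then truncate its result string to the returned length.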
|
| |
|
|
| |
llvm-svn: 26703
|
| |
|
|
| |
llvm-svn: 26701
|