| Commit message | Author | Age | Files | Lines |
... | |
|
llvm-svn: 26900
|
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
|
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
|
llvm-svn: 26793
|
llvm-svn: 26758
|
llvm-svn: 26742
|
registers, and update it on entry to each function, then restore it on exit.
This compiles:
  void func(vfloat *a, vfloat *b, vfloat *c) {
    *a = *b * *c + *c;
  }
to this:
  _func:
    mfspr r2, 256
    oris r6, r2, 49152
    mtspr 256, r6
    lvx v0, 0, r5
    lvx v1, 0, r4
    vmaddfp v0, v1, v0, v0
    stvx v0, 0, r3
    mtspr 256, r2
    blr
GCC produces this (which has additional stack accesses):
  _func:
    mfspr r0,256
    stw r0,-4(r1)
    oris r0,r0,0xc000
    mtspr 256,r0
    lvx v0,0,r5
    lvx v1,0,r4
    lwz r12,-4(r1)
    vmaddfp v0,v0,v1,v0
    stvx v0,0,r3
    mtspr 256,r12
    blr
llvm-svn: 26733
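The `oris` immediate in the listing above (49152, i.e. 0xc000) can be checked by hand: SPR 256 is VRSAVE, and its most significant bit stands for v0, the next for v1, and so on. The following is a minimal sketch of that arithmetic, not the backend's code; `vrsaveMask` and `orisImmediate` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper (not LLVM's implementation): build a VRSAVE mask
// from the vector registers a function uses. The most significant bit
// of the 32-bit mask corresponds to v0, the least significant to v31.
std::uint32_t vrsaveMask(std::initializer_list<unsigned> usedVRs) {
  std::uint32_t mask = 0;
  for (unsigned reg : usedVRs) {
    assert(reg < 32 && "AltiVec registers are v0..v31");
    mask |= 0x80000000u >> reg;
  }
  return mask;
}

// `oris` ORs a 16-bit immediate into the upper half of a register, so it
// can only set the bits for v0..v15. This returns that upper half.
std::uint16_t orisImmediate(std::uint32_t mask) {
  return static_cast<std::uint16_t>(mask >> 16);
}
```

For the function above, which uses v0 and v1, `vrsaveMask({0, 1})` is 0xC0000000 and `orisImmediate` of that is 0xC000, which matches both the decimal 49152 in the first listing and the `0xc000` in the GCC listing. Registers v16 and up would need the lower half of the mask set as well, which this sketch does not model.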
|
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
type instead of a table in the .cpp file. Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
accurately.
llvm-svn: 26719
|
llvm-svn: 26608
|
flushes
llvm-svn: 26587
|
llvm-svn: 26450
|
llvm-svn: 26348
|
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.
llvm-svn: 26255
|
We do not want to emit "Loop: ... brcond Out; br Loop", as it adds an extra
instruction in the loop. Instead, invert the condition and emit
"Loop: ... br!cond Loop; br Out".
Generalize the fix by moving it from PPCDAGToDAGISel to SelectionDAGLowering.
llvm-svn: 26231
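The branch inversion described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions and is not how SelectionDAGLowering represents branches; `BranchPair` and `preferBackEdge` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model of a block terminator: a conditional branch followed by an
// unconditional branch. Hypothetical type, for illustration only.
struct BranchPair {
  bool negated;            // true: the conditional branch tests !cond
  std::string condTarget;  // target of the conditional branch
  std::string fallTarget;  // target of the trailing unconditional branch
};

// If the unconditional branch is the one that jumps back to the loop
// header, swap the targets and invert the condition, so that the loop
// back edge is taken by the conditional branch itself.
BranchPair preferBackEdge(BranchPair b, const std::string &loopHeader) {
  assert(!loopHeader.empty() && "need a loop header label");
  if (b.fallTarget == loopHeader && b.condTarget != loopHeader) {
    std::swap(b.condTarget, b.fallTarget);
    b.negated = !b.negated;
  }
  return b;
}
```

Given `{false, "Out", "Loop"}` and the header `"Loop"`, the result is `{true, "Loop", "Out"}`: the "brcond Out; br Loop" shape becomes "br!cond Loop; br Out", so the unconditional branch is reached only when the loop exits.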
|
llvm-svn: 26085
|
  SDOperand Select(SDOperand N);
to
  void Select(SDOperand &Result, SDOperand N);
llvm-svn: 26067
|
llvm-svn: 26010
|
llvm-svn: 25997
|
llvm-svn: 25717
|
llvm-svn: 25515
|
llvm-svn: 25334
|
llvm-svn: 25238
|
llvm-svn: 25237
|
The PPC backend was generating random shift counts in this case, due to an
uninitialized variable.
llvm-svn: 25114
|
constant offsets from statics into the address arithmetic.
llvm-svn: 24999
|
llvm-svn: 24874
|
us to load and store vectors directly at a pointer (offset of zero) by
using r0 as the base register. This also requires some asm printer work
to satisfy the darwin assembler.
For
  void %foo(<4 x float> * %a) {
  entry:
    %tmp1 = load <4 x float> * %a;
    %tmp2 = add <4 x float> %tmp1, %tmp1
    store <4 x float> %tmp2, <4 x float> *%a
    ret void
  }
We now produce:
  _foo:
    lvx v0, 0, r3
    vaddfp v0, v0, v0
    stvx v0, 0, r3
    blr
Instead of:
  _foo:
    li r2, 0
    lvx v0, r2, r3
    vaddfp v0, v0, v0
    stvx v0, r2, r3
    blr
llvm-svn: 24872
|
llvm-svn: 24871
|
llvm-svn: 24834
|
llvm-svn: 24720
|
from the DAGToDAG cpp file. This adds pattern support for vector and
scalar fma, which passes test/Regression/CodeGen/PowerPC/fma.ll, and
does the right thing in the presence of -disable-excess-fp-precision.
Allows us to match:
  void %foo(<4 x float> * %a) {
  entry:
    %tmp1 = load <4 x float> * %a;
    %tmp2 = mul <4 x float> %tmp1, %tmp1
    %tmp3 = add <4 x float> %tmp2, %tmp1
    store <4 x float> %tmp3, <4 x float> *%a
    ret void
  }
As:
  _foo:
    li r2, 0
    lvx v0, r2, r3
    vmaddfp v0, v0, v0, v0
    stvx v0, r2, r3
    blr
Or, with llc -disable-excess-fp-precision,
  _foo:
    li r2, 0
    lvx v0, r2, r3
    vxor v1, v1, v1
    vmaddfp v1, v0, v0, v1
    vaddfp v0, v1, v0
    stvx v0, r2, r3
    blr
llvm-svn: 24719
|
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.
llvm-svn: 24657
|
llvm-svn: 24627
|
llvm-svn: 24592
|
improvements.
llvm-svn: 24591
|
llvm-svn: 24590
|
llvm-svn: 24566
|
llvm-svn: 24561
|
llvm-svn: 24558
|
llvm-svn: 24549
|
changes allow us to generate the following code:
  _foo:
    li r2, 0
    lvx v0, r2, r3
    vaddfp v0, v0, v0
    stvx v0, r2, r3
    blr
for this llvm:
  void %foo(<4 x float>* %a) {
  entry:
    %tmp1 = load <4 x float>* %a
    %tmp2 = add <4 x float> %tmp1, %tmp1
    store <4 x float> %tmp2, <4 x float>* %a
    ret void
  }
llvm-svn: 24534
|
of some code. This exposes the implicit load from the stubs to the DAG, allowing
them to be optimized by the dag combiner. It also moves darwin specific stuff
out of the isel into the legalizer, and allows more to be moved to the .td file.
llvm-svn: 24397
|
llvm-svn: 24396
|
on Darwin to remove smarts from the isel. This is currently disabled by
default (uncomment the setOperationAction for ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes, and bigger patterns
need to be added to the .td file. However, we can currently emit stuff like
this: :)
  li r2, lo16(L_x$non_lazy_ptr)
  lis r3, ha16(L_x$non_lazy_ptr)
  lwzx r2, r3, r2
The obvious improvements will follow.
llvm-svn: 24390
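The `lo16`/`ha16` pair in the sequence above works because `li` sign-extends its 16-bit immediate, so the high half has to be adjusted upward whenever the low half has its top bit set. The following is a small sketch of that arithmetic for 32-bit addresses; the C++ functions mirror the assembler operators by name, and `rebuildAddress` and `checkRoundTrip` are hypothetical helpers added for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Low 16 bits of a 32-bit address, as the assembler's lo16() yields them.
std::uint16_t lo16(std::uint32_t addr) {
  return static_cast<std::uint16_t>(addr & 0xFFFFu);
}

// "High adjusted" 16 bits: bumped by one when the low half will later be
// sign-extended to a negative value.
std::uint16_t ha16(std::uint32_t addr) {
  return static_cast<std::uint16_t>((addr + 0x8000u) >> 16);
}

// Models `li` (sign-extend the low half), `lis` (shift the high half
// left by 16), and the base + index sum computed by `lwzx`.
std::uint32_t rebuildAddress(std::uint16_t ha, std::uint16_t lo) {
  std::uint32_t low = static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int16_t>(lo)));
  std::uint32_t high = static_cast<std::uint32_t>(ha) << 16;
  return high + low;  // wraps modulo 2^32, like the hardware add
}

// Self-check: splitting and recombining must reproduce the address.
void checkRoundTrip(std::uint32_t addr) {
  assert(rebuildAddress(ha16(addr), lo16(addr)) == addr);
}
```

For example, the address 0x12348000 splits into `lo16` = 0x8000 and `ha16` = 0x1235 (not 0x1234), and the sign-extended low half then subtracts the extra 0x10000 back out when the two registers are added.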
|
instead of a globaladdress. This has no effect on the generated code at all.
llvm-svn: 24386
|
which branches to an absolute address. This is required to support objc
direct dispatch.
llvm-svn: 24370
|
llvm-svn: 24073
|
tracked as PR642
llvm-svn: 24068
|
llvm-svn: 24067
|
llvm-svn: 23991
|