llvm-svn: 22619

llvm-svn: 22617

llvm-svn: 22595

Patch contributed by Jim Laskey!
llvm-svn: 22594

llvm-svn: 22588

have to write arguments to the stack
llvm-svn: 22536

llvm-svn: 22535

For the following code:
double %ext(int %A.0__, long %A.1__) {
%A_addr = alloca %typedef.DComplex ; <%typedef.DComplex*> [#uses=2]
%tmp.1 = cast %typedef.DComplex* %A_addr to int* ; <int*> [#uses=1]
store int %A.0__, int* %tmp.1
%tmp.2 = getelementptr %typedef.DComplex* %A_addr, int 0, uint 1 ; <double*> [#uses=2]
%tmp.3 = cast double* %tmp.2 to long* ; <long*> [#uses=1]
store long %A.1__, long* %tmp.3
%tmp.5 = load double* %tmp.2 ; <double> [#uses=1]
ret double %tmp.5
}
We now generate:
_ext:
.LBB_ext_0: ;
stw r3, -12(r1)
stw r4, -8(r1)
stw r5, -4(r1)
lfd f1, -8(r1)
blr
Instead of:
_ext:
.LBB_ext_0: ;
stw r3, -12(r1)
addi r2, r1, -12
stw r4, 4(r2)
stw r5, 8(r2)
lfd f1, 4(r2)
blr
This also fires hundreds of times on MultiSource.
llvm-svn: 22533

llvm-svn: 22530

llvm-svn: 22523

the need to build PIC.
llvm-svn: 22512

llvm-svn: 22507

Remove the LoadHiAddr pseudo-instruction.
Optimize stores to and loads from statics.
Force JIT to use new non-PIC codepaths.
llvm-svn: 22494

8-byte align doubles.
llvm-svn: 22486

automatically generated from a target description.
llvm-svn: 22470

This is the last MVTSDNode.
This allows us to eliminate a bunch of special case code for handling
MVTSDNodes.
Also, remove some uses of dyn_cast that should really be cast (which is
cheaper in a release build).
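A minimal sketch of the dyn_cast/cast distinction referred to above, using LLVM's casting utilities on IR types (modern include paths; the surrounding function names are illustrative only):

#include "llvm/IR/Instruction.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"

using namespace llvm;

// cast<> assumes the dynamic type is already known: it asserts in debug
// builds and is a plain pointer conversion in a release build.
Instruction *knownInstruction(Value *V) {
  return cast<Instruction>(V);
}

// dyn_cast<> performs the type check unconditionally and returns null on
// failure, so it costs a runtime test even in a release build.
Instruction *maybeInstruction(Value *V) {
  return dyn_cast<Instruction>(V);
}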
llvm-svn: 22368

llvm-svn: 22366

1. Pass Value*'s into lowering methods so that the proper pointers can be
added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
4. Now that we have Value*'s available in the lowering methods, pass them
into any load/stores from the valist that are emitted
llvm-svn: 22339

llvm-svn: 22335

is at least overloading the right virtual methods. The implementations
are currently wrong though. This fixes Ptrdist/bc, but not other programs
(e.g. siod).
llvm-svn: 22326

avoids dereferencing the end() iterator when selecting the fallthrough
block. This requires an ilist change.
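The hazard being avoided, sketched in plain C++ rather than the actual ilist code (the names here are hypothetical): peeking at the candidate fallthrough block must not dereference end().

#include <iterator>
#include <list>

struct Block { int Id; };

// Returns the block after It, or null when It is the last element; the
// check prevents dereferencing Blocks.end(), which is undefined behaviour.
Block *nextBlockOrNull(std::list<Block> &Blocks,
                       std::list<Block>::iterator It) {
  auto Next = std::next(It);
  return Next != Blocks.end() ? &*Next : nullptr;
}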
llvm-svn: 22212

not generate unnecessary register copies. This improves compile time by
2-5% depending on the test.
llvm-svn: 22210

regularly in "normal" code, but for things like software graphics, they
make a big difference.
For the following code:
unsigned short Trans16Bit(unsigned srcA,unsigned srcB,unsigned alpha)
{
unsigned tmpA,tmpB,mixed;
tmpA = ((srcA & 0x03E0) << 15) | (srcA & 0x7C1F);
tmpB = ((srcB & 0x03E0) << 15) | (srcB & 0x7C1F);
mixed = (tmpA * alpha) + (tmpB * (32 - alpha));
return ((mixed >> 5) & 0x7C1F) | ((mixed >> 20) & 0x03E0);
}
We now generate:
_Trans16Bit:
.LBB_Trans16Bit_0: ; entry
andi. r2, r4, 31775
rlwimi r2, r4, 15, 7, 11
subfic r4, r5, 32
mullw r2, r2, r4
andi. r4, r3, 31775
rlwimi r4, r3, 15, 7, 11
mullw r3, r4, r5
add r2, r2, r3
srwi r3, r2, 5
andi. r3, r3, 31775
rlwimi r3, r2, 12, 22, 26
blr
Instead of:
_Trans16Bit:
.LBB_Trans16Bit_0: ; entry
slwi r2, r4, 15
rlwinm r2, r2, 0, 7, 11
andi. r4, r4, 31775
or r2, r2, r4
subfic r4, r5, 32
mullw r2, r2, r4
slwi r4, r3, 15
rlwinm r4, r4, 0, 7, 11
andi. r3, r3, 31775
or r3, r4, r3
mullw r3, r3, r5
add r2, r2, r3
srwi r3, r2, 5
andi. r3, r3, 31775
srwi r2, r2, 20
rlwimi r3, r2, 0, 22, 26
blr
llvm-svn: 22201

llvm-svn: 22064

llvm-svn: 21977

llvm-svn: 21958

CodeGen/Generic/print-arith-fp.ll
llvm-svn: 21939

llvm-svn: 21915

llvm-svn: 21899

llvm-svn: 21884

1. Teach LegalizeDAG how to better legalize CTTZ if the target doesn't have
CTPOP, but does have CTLZ (see the sketch after this list)
2. Teach PPC32 how to do sub x, const -> add x, -const for valid consts
3. Teach PPC32 how to do and (xor a, -1) b -> andc b, a
4. Teach PPC32 that ISD::CTLZ -> PPC::CNTLZW
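A sketch of the expansion behind point 1, written as plain C++ for a 32-bit value; this shows the standard identity, not the LegalizeDAG code itself. When only ctlz is available, cttz is recovered by isolating the bits below the lowest set bit.

#include <cstdint>

// Portable stand-in for the target's count-leading-zeros instruction
// (e.g. cntlzw on PPC32); returns 32 for an input of 0.
static unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// cttz(x) == 32 - ctlz(~x & (x - 1)):
// ~x & (x - 1) keeps exactly the bits below the lowest set bit of x, so
// counting its leading zeros pins down where that lowest bit sits. For
// x == 0 the mask is all ones and the result is 32, as expected.
static unsigned cttz32(uint32_t X) {
  return 32 - ctlz32(~X & (X - 1));
}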
llvm-svn: 21880

possible,
include and (srl) and the inverses (shl and) etc.
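For reference, a small C++ model of the rotate-and-mask instruction these patterns target, plus one folded instance; this is a sketch of the general idea, not the selector's matching code.

#include <cstdint>

// Reference model of PPC rlwinm: rotate left by SH, then AND with the
// contiguous mask covering bits MB..ME (IBM numbering, bit 0 = MSB).
// Assumes 0 < SH < 32 and MB <= ME.
static uint32_t rlwinm(uint32_t X, unsigned SH, unsigned MB, unsigned ME) {
  uint32_t Rot = (X << SH) | (X >> (32 - SH));
  uint32_t Mask = (0xFFFFFFFFu >> MB) & (0xFFFFFFFFu << (31 - ME));
  return Rot & Mask;
}

// Example folding: a shift-right followed by an AND collapses into one
// rotate-and-mask, e.g. (x >> 6) & 0x3 == rlwinm(x, 26, 30, 31).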
llvm-svn: 21820

llvm-svn: 21693

population (ctpop). Generic lowering is implemented; however, only promotion
is implemented for SelectionDAG at the moment.
More coming soon.
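A sketch of what a generic ctpop expansion typically looks like for a 32-bit integer (the familiar parallel bit-summing sequence, shown for illustration rather than copied from the legalizer):

#include <cstdint>

// Counts set bits by summing adjacent 1-, 2- and 4-bit fields in parallel,
// then adding the four per-byte counts with a multiply.
static unsigned ctpop32(uint32_t X) {
  X = X - ((X >> 1) & 0x55555555u);                  // pairwise sums
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u);  // 4-bit sums
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;                  // per-byte sums
  return (X * 0x01010101u) >> 24;                    // total in the top byte
}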
llvm-svn: 21676

llvm-svn: 21633

enables one to use alias analysis in the backends.
(TRUNC)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand, a
SrcValueSDNode, which contains the Value*. Note that if the operation is
introduced by the backend, it will still have the operand, but the Value*
will be null.
llvm-svn: 21599

llvm-svn: 21452

llvm-svn: 21425

llvm-svn: 21413

int %bar(float %a, float %b, float %c, float %d) {
entry:
%tmp.1 = setlt float %a, %d
%tmp.2 = setlt float %b, %d
%or = or bool %tmp.1, %tmp.2
%tmp.3 = setgt float %c, %d
%tmp.4 = or bool %or, %tmp.3
%tmp.5 = and bool %tmp.4, true
%retval = cast bool %tmp.5 to int
ret int %retval
}
We now emit:
_bar:
.LBB_bar_0: ; entry
fcmpu cr0, f1, f4
fcmpu cr1, f2, f4
cror 0, 0, 4
fcmpu cr1, f3, f4
cror 28, 0, 5
mfcr r2
rlwinm r3, r2, 29, 31, 31
blr
Instead of:
_bar:
.LBB_bar_0: ; entry
fcmpu cr7, f1, f4
mfcr r2
rlwinm r2, r2, 29, 31, 31
fcmpu cr7, f2, f4
mfcr r3
rlwinm r3, r3, 29, 31, 31
or r2, r2, r3
fcmpu cr7, f3, f4
mfcr r3
rlwinm r3, r3, 30, 31, 31
or r3, r2, r3
blr
llvm-svn: 21321

register. Added support in the .td file for the g5-specific variant
of cr -> gpr moves that executes faster, but we currently don't
generate it.
llvm-svn: 21314

Add new ppc beta option related to using condition registers
Make the pattern isel control flag (-enable-pattern-isel) global and tristate (see the sketch below):
0 == off
1 == on
2 == target default
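A minimal sketch of how such a tristate flag is commonly declared with LLVM's cl::opt machinery; the declaration and helper below are assumptions for illustration, not the actual option definition.

#include "llvm/Support/CommandLine.h"

using namespace llvm;

// Hypothetical declaration: 0 = off, 1 = on, 2 = use the target's default.
static cl::opt<unsigned>
EnablePatternISel("enable-pattern-isel",
                  cl::desc("Pattern instruction selector "
                           "(0=off, 1=on, 2=target default)"),
                  cl::init(2));

// Hypothetical helper a target could use to resolve the tristate.
static bool usePatternISel(bool TargetDefault) {
  return EnablePatternISel == 2 ? TargetDefault : EnablePatternISel != 0;
}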
llvm-svn: 21309

This can generate considerably shorter code, reducing the size of crafty
by almost 1%. Also fix the printing of mcrf. The code is currently
disabled until it gets a bit more testing, but should work as-is.
llvm-svn: 21298

now gone. Next step is to get rid of the remaining ones and then start
allocating bools to CRs where appropriate.
llvm-svn: 21294

where it is safe to do so.
llvm-svn: 21293

Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel
Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
always produces zero or one.
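A plain-C++ illustration of the first transform, assuming the usual trick of materialising the comparison as an arithmetic shift (an assumption about how the expansion proceeds, not a quote of the code):

#include <cstdint>

// select (a < 0) ? b : 0 rewritten without a select:
// an arithmetic shift of a by 31 gives all ones when a is negative and
// zero otherwise, and ANDing that mask with b picks b or 0.
static int32_t selectNegOrZero(int32_t A, int32_t B) {
  int32_t Mask = A >> 31;   // assumes arithmetic right shift of negatives
  return Mask & B;
}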
llvm-svn: 21291

llvm-svn: 21271

andi instructions instead of rlwinm instructions for zero extend, but they
seem like they would take the same time.
llvm-svn: 21268

Make llvm undef values generate ISD::UNDEF nodes
llvm-svn: 21261

Remove dead setcc op, 0 sequences
Coming later: generalization of op, imm
llvm-svn: 21260