| Commit message (Collapse) | Author | Age | Files | Lines | 
| | 
| 
| 
|  | 
llvm-svn: 23210
 | 
| | 
| 
| 
|  | 
llvm-svn: 23209
 | 
| | 
| 
| 
|  | 
llvm-svn: 23208
 | 
| | 
| 
| 
|  | 
llvm-svn: 23207
 | 
| | 
| 
| 
|  | 
llvm-svn: 23206
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
instead of ZERO_EXTEND to eliminate extraneous extensions.  This eliminates
dead zero extensions on formal arguments and other cases on PPC, implementing
the newly tightened up test/Regression/CodeGen/PowerPC/small-arguments.ll test.
llvm-svn: 23205
 | 
| | 
| 
| 
|  | 
llvm-svn: 23204
 | 
| | 
| 
| 
|  | 
llvm-svn: 23203
 | 
| | 
| 
| 
|  | 
llvm-svn: 23202
 | 
| | 
| 
| 
| 
| 
|  | 
the observation that it only has to handle i1 -> i64 and i64 -> i1.
llvm-svn: 23201
 | 
| | 
| 
| 
| 
| 
| 
|  | 
the results of calls to functions returning small values are properly
sign/zero extended.
llvm-svn: 23198
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
over to DAGCombiner.cpp
1. Don't assume that SetCC returns i1 when folding (xor (setcc) constant)
2. Don't duplicate code in folding AND with AssertZext that is handled by
   MaskedValueIsZero
llvm-svn: 23196
 | 
| | 
| 
| 
| 
| 
|  | 
left to do).
llvm-svn: 23195
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
measurements.  This improves the performance of 'treeadd' by about 20% with the dag
isel, restoring it to the pattern-isel level (which happens to get the alignment right).
llvm-svn: 23194
 | 
| | 
| 
| 
| 
| 
| 
|  | 
platforms.  This reduces executable size and makes shark realize the actual
bounds of functions instead of showing each MBB as a function :)
llvm-svn: 23193
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
2. Propagate feature "string" to all targets.
3. Implement use of SubtargetFeatures in PowerPCTargetSubtarget.
llvm-svn: 23192
 | 
| | 
| 
| 
| 
| 
| 
|  | 
is to manage processor specific attributes from the command line.  See examples
of use in llc/lli and PowerPCTargetSubtarget.
llvm-svn: 23191
 | 
| | 
| 
| 
|  | 
directly out of R1 (without using a CopyFromReg, which uses a chain), multiple
allocas were getting CSE'd together, producing bogus code.  For this:
int %foo(bool %X, int %A, int %B) {
        br bool %X, label %T, label %F
F:
        %G = alloca int
        %H = alloca int
        store int %A, int* %G
        store int %B, int* %H
        %R = load int* %G
        ret int %R
T:
        ret int 0
}
We were generating:
_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        li r2, 16
        subf r2, r2, r1   ;; One alloca
        or r1, r2, r2
        or r3, r1, r1
        or r1, r2, r2
        or r2, r1, r1
        stw r4, 0(r3)
        stw r5, 0(r2)
        lwz r3, 0(r3)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
Now we generate:
_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        or r2, r1, r1
        li r3, 16
        subf r2, r3, r2  ;; Alloca 1
        or r1, r2, r2
        or r2, r1, r1
        or r6, r1, r1
        subf r3, r3, r6  ;; Alloca 2
        or r1, r3, r3
        or r3, r1, r1
        stw r4, 0(r2)
        stw r5, 0(r3)
        lwz r3, 0(r2)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
This fixes Povray and SPASS with the dag isel, the last two failing cases.
Tomorrow we will hopefully turn it on by default! :)
llvm-svn: 23190
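The failure mode described above (two nodes that read the stack pointer being merged because nothing distinguishes them) can be illustrated with a toy value-numbering table. This is only a sketch under stated assumptions: `struct Node`, `get_node`, and the integer "chain token" are hypothetical names made up for the illustration and are not the SelectionDAG data structures or API.

```c
#include <assert.h>

/* Hypothetical node key: an opcode, a register operand, and a chain token.
 * A chain token of 0 stands for "no chain operand". */
struct Node { int opcode; int reg; int chain; };

enum { MAX_NODES = 16 };
static struct Node node_table[MAX_NODES];
static int num_nodes = 0;

/* Returns the index of an existing node with the same key, or creates a new
 * one. Reusing an identical node is the CSE step. */
static int get_node(int opcode, int reg, int chain) {
    for (int i = 0; i < num_nodes; ++i) {
        if (node_table[i].opcode == opcode &&
            node_table[i].reg == reg &&
            node_table[i].chain == chain)
            return i;
    }
    assert(num_nodes < MAX_NODES);
    node_table[num_nodes].opcode = opcode;
    node_table[num_nodes].reg = reg;
    node_table[num_nodes].chain = chain;
    return num_nodes++;
}
```

In this model, two calls such as `get_node(1, 1, 0)` return the same index, so both allocas would see one stack-pointer read. Giving each read its own chain token, as in `get_node(1, 1, 1)` and `get_node(1, 1, 2)`, keeps the nodes distinct, which is the role the commit attributes to the chain on CopyFromReg.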
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
could cause a miscompile.  Fixing this didn't fix the two programs that fail
though.  :(
This also changes the implementation to follow the pattern selector more
closely, causing us to select 0 to li instead of lis.
llvm-svn: 23189
 | 
| | 
| 
| 
| 
| 
|  | 
and selecting early prevents folding immediates into the cmpw* instructions
llvm-svn: 23188
 | 
| | 
| 
| 
|  | 
llvm-svn: 23186
 | 
| | 
| 
| 
| 
| 
|  | 
statement in visit().
llvm-svn: 23185
 | 
| | 
| 
| 
| 
| 
| 
|  | 
be mostly functional.  It currently has all folds from SelectionDAG.cpp
that do not involve a condition code.
llvm-svn: 23184
 | 
| | 
| 
| 
|  | 
llvm-svn: 23181
 | 
| | 
| 
| 
| 
| 
|  | 
getting them out of the business of making stack slots.
llvm-svn: 23180
 | 
| | 
| 
| 
|  | 
llvm-svn: 23179
 | 
| | 
| 
| 
|  | 
llvm-svn: 23178
 | 
| | 
| 
| 
|  | 
llvm-svn: 23177
 | 
| | 
| 
| 
| 
| 
|  | 
the ops to dag optimization.
llvm-svn: 23176
 | 
| | 
| 
| 
|  | 
llvm-svn: 23173
 | 
| | 
| 
| 
|  | 
llvm-svn: 23171
 | 
| | 
| 
| 
|  | 
llvm-svn: 23170
 | 
| | 
| 
| 
|  | 
llvm-svn: 23169
 | 
| | 
| 
| 
|  | 
llvm-svn: 23168
 | 
| | 
| 
| 
| 
| 
|  | 
fixes crafty and probably others.
llvm-svn: 23167
 | 
| | 
| 
| 
|  | 
llvm-svn: 23166
 | 
| | 
| 
| 
| 
| 
|  | 
case in MaskedValueIsZero was wrong.
llvm-svn: 23165
 | 
| | 
| 
| 
| 
| 
|  | 
MaskedValueIsZero.
llvm-svn: 23164
 | 
| | 
| 
| 
| 
| 
|  | 
ugly hacks
llvm-svn: 23162
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
like this, it is a requirement on PPC, which can have an f32 value in r3 at one point in a function and an f64 value in r3 at another point.  :(
This fixes compilation of mesa
llvm-svn: 23161
 | 
| | 
| 
| 
|  | 
llvm-svn: 23159
 | 
| | 
| 
| 
| 
| 
|  | 
This fixes PR621 and Regression/CodeGen/X86/2005-08-30-RegAllocAliasProblem.ll
llvm-svn: 23158
 | 
| | 
| 
| 
|  | 
Remove code (last hunk) that miscompiled immediate and's, such as
  and uint %tmp.30, 4294958079
into
 andi. r8, r8, 56319
 andis. r8, r8, 65535
instead of:
 li r9, -9217
 and r8, r8, r9
The first always generates zero.
This fixes espresso.
llvm-svn: 23155
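A small C model shows why the two-instruction sequence above always yields zero. It is a sketch only: the helper names are hypothetical, and the functions model just the value computed by `andi.`, `andis.`, and `li`, ignoring the CR0 update that the record forms also perform.

```c
#include <assert.h>
#include <stdint.h>

/* Models andi.: AND with a zero-extended 16-bit immediate. */
static uint32_t ppc_andi(uint32_t r, uint16_t imm) {
    return r & (uint32_t)imm;
}

/* Models andis.: AND with the 16-bit immediate shifted left by 16. */
static uint32_t ppc_andis(uint32_t r, uint16_t imm) {
    return r & ((uint32_t)imm << 16);
}

/* The miscompiled sequence: andi. then andis. on the same register. */
static uint32_t and_split_wrong(uint32_t x) {
    uint32_t r = ppc_andi(x, 56319);  /* keeps only bits in 0x0000DBFF */
    return ppc_andis(r, 65535);       /* keeps only bits in 0xFFFF0000 */
}

/* The correct sequence: li r9, -9217 ; and r8, r8, r9. */
static uint32_t and_materialized(uint32_t x) {
    /* li sign-extends its immediate, so -9217 becomes 0xFFFFDBFF. */
    uint32_t mask = (uint32_t)(int32_t)-9217;
    return x & mask;
}
```

The first `andi.` clears the upper 16 bits and the following `andis.` clears the lower 16 bits, so no bit survives. Materializing the mask gives 0xFFFFDBFF, which equals 4294958079, the constant in the original `and`.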
 | 
| | 
| 
| 
| 
| 
|  | 
fixes fourinarow
llvm-svn: 23153
 | 
| | 
| 
| 
| 
| 
|  | 
fixes fhourstones
llvm-svn: 23152
 | 
| | 
| 
| 
| 
| 
|  | 
to SHIFT_PARTS nodes
llvm-svn: 23151
 | 
| | 
| 
| 
|  | 
llvm-svn: 23150
 | 
| | 
| 
| 
| 
| 
|  | 
at least tends to expose problems elsewhere.
llvm-svn: 23149
 | 
| | 
| 
| 
|  | 
llvm-svn: 23148
 | 
| | 
| 
| 
|  | 
them.  This allows for elimination of redundant extends in the entry
blocks of functions on PowerPC.
Add support for i32 x i32 -> i64 multiplies, by recognizing when the inputs
to ISD::MUL in ExpandOp are actually just extended i32 values and not real
i64 values.  This allows us to codegen
int mulhs(int a, int b) { return ((long long)a * b) >> 32; }
as:
_mulhs:
        mulhw r3, r4, r3
        blr
instead of:
_mulhs:
        mulhwu r2, r4, r3
        srawi r5, r3, 31
        mullw r5, r4, r5
        add r2, r2, r5
        srawi r4, r4, 31
        mullw r3, r4, r3
        add r3, r2, r3
        blr
with a similar improvement on x86.
llvm-svn: 23147
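The two instruction sequences above compute the same value, which can be checked with a C model. This is a sketch with hypothetical function names: `mulhs_direct` models the single `mulhw`, and `mulhs_expanded` follows the longer sequence step by step. Both return the raw 32-bit pattern as `uint32_t` so that no implementation-defined signed shift or conversion is involved.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of the signed 64-bit product, as a bit pattern (mulhw). */
static uint32_t mulhs_direct(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return (uint32_t)((uint64_t)product >> 32);
}

/* The generic expansion: unsigned high multiply plus sign corrections. */
static uint32_t mulhs_expanded(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t hi = (uint32_t)(((uint64_t)ua * ub) >> 32); /* mulhwu */
    uint32_t sa = (a < 0) ? 0xFFFFFFFFu : 0u;            /* srawi a, 31 */
    uint32_t sb = (b < 0) ? 0xFFFFFFFFu : 0u;            /* srawi b, 31 */
    hi += ub * sa;                                       /* mullw, add */
    hi += sb * ua;                                       /* mullw, add */
    return hi;
}
```

The correction terms subtract `b` when `a` is negative and `a` when `b` is negative (modulo 2^32), which is what turns the unsigned high word into the signed one. Recognizing that both inputs are sign-extended i32 values lets the code generator skip the corrections and emit `mulhw` directly.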
 |