| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 24208
|
|
|
|
| |
llvm-svn: 24207
|
|
|
|
|
|
|
| |
a bunch of other things) but is currently ignored by the code
generator.
llvm-svn: 24206
|
|
|
|
| |
llvm-svn: 24203
|
|
|
|
| |
llvm-svn: 24200
|
|
|
|
| |
llvm-svn: 24199
|
|
|
|
| |
llvm-svn: 24198
|
|
|
|
| |
llvm-svn: 24197
|
|
|
|
|
|
|
|
|
| |
Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it preserved
in the bytecode representation. That's coming up next.
llvm-svn: 24196
|
|
|
|
| |
llvm-svn: 24195
|
|
|
|
|
|
| |
that has been sitting in my inbox since May 18. :)
llvm-svn: 24194
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a few times in crafty:
OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1]
NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0]
OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1]
NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0]
OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1]
NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0]
OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1]
NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0]
Which all turn into shrs.
llvm-svn: 24190
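Why the change from `div int` to `div uint` pays off can be sketched in a few lines. This is only an illustration, not the instcombine code: the helper names are hypothetical, values are modelled as 32-bit, and signed division is assumed to round toward zero.

```python
def sdiv_trunc(a: int, b: int) -> int:
    """Signed division that rounds toward zero (assumed `div int` behaviour)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def udiv_as_shift(a: int, log2_divisor: int) -> int:
    """Unsigned 32-bit division by 2**log2_divisor, done as a logical right shift."""
    return (a & 0xFFFFFFFF) >> log2_divisor


# For non-negative dividends the signed and unsigned results agree,
# so the cheaper shift can stand in for the divide.
for x in (0, 7, 8, 9, 1000, 2**31 - 1):
    assert sdiv_trunc(x, 8) == udiv_as_shift(x, 3)

# For a negative dividend they differ, which is why the sign must be known.
assert sdiv_trunc(-9, 8) == -1
assert udiv_as_shift(-9, 3) != -1
```

The rewrite is therefore only safe when the dividend is known to be non-negative.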
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8 times in vortex, allowing the srems to be turned into shrs:
OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1]
NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0]
OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1]
NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0]
OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1]
NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0]
OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1]
NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0]
OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2]
NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0]
OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2]
NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0]
OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1]
NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0]
OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1]
NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0]
It also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.
llvm-svn: 24189
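The difference in payoff between power-of-two and odd divisors can be sketched as follows. This is an illustration under assumptions, not the optimizer's code: the helper names are hypothetical, values are modelled as 32-bit, and the power-of-two case is written here as a bit mask, which is one common cheap form for an unsigned remainder.

```python
def srem_trunc(a: int, b: int) -> int:
    """Signed remainder whose sign follows the dividend (assumed `rem int` behaviour)."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def urem_pow2(a: int, divisor: int) -> int:
    """Unsigned 32-bit remainder by a power of two, done with a bit mask."""
    assert divisor > 0 and divisor & (divisor - 1) == 0, "divisor must be a power of two"
    return (a & 0xFFFFFFFF) & (divisor - 1)


# Non-negative dividends: the signed and unsigned remainders agree.
for x in (0, 5, 16, 37, 1023, 2**31 - 1):
    assert srem_trunc(x, 16) == urem_pow2(x, 16)

# Negative dividends: the results differ, so the sign must be known first.
assert srem_trunc(-5, 4) == -1
assert urem_pow2(-5, 4) == 3
```

Divisors such as 1009 and 61 are not powers of two, so no single mask or shift replaces the remainder there.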
|
|
|
|
| |
llvm-svn: 24188
|
|
|
|
| |
llvm-svn: 24187
|
|
|
|
|
|
|
|
|
| |
out failed (e.g. methcall) - now the code compiles, though it's not quite
right just yet (tm) ;)
would fix this but it's 3am! :O
llvm-svn: 24186
|
|
|
|
| |
llvm-svn: 24183
|
|
|
|
| |
llvm-svn: 24182
|
|
|
|
| |
llvm-svn: 24180
|
|
|
|
| |
llvm-svn: 24175
|
|
|
|
| |
llvm-svn: 24164
|
|
|
|
| |
llvm-svn: 24161
|
|
|
|
|
|
| |
XCode's indenting.
llvm-svn: 24159
|
|
|
|
|
|
| |
/Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158
|
|
|
|
|
|
| |
This fixes PR641
llvm-svn: 24154
|
|
|
|
|
|
| |
though)
llvm-svn: 24152
|
|
|
|
| |
llvm-svn: 24151
|
|
|
|
| |
llvm-svn: 24150
|
|
|
|
|
|
| |
selecting ints to IA64, and a few other ia64 bits and pieces
llvm-svn: 24147
|
|
|
|
|
|
| |
stop pretending -0.0 and -1.0 are machine constants
llvm-svn: 24146
|
|
|
|
|
|
| |
may fix PR652. Thanks to Andrew for tracking down the problem.
llvm-svn: 24145
|
|
|
|
| |
llvm-svn: 24139
|
|
|
|
|
|
| |
need to send chris, jim and sampo a box of fish each
llvm-svn: 24135
|
|
|
|
|
|
| |
one sometimes needs to pass FP args in both FP *and* integer registers.
llvm-svn: 24134
|
|
|
|
|
|
| |
clever little tablegen!
llvm-svn: 24133
|
|
|
|
| |
llvm-svn: 24132
|
|
|
|
| |
llvm-svn: 24131
|
|
|
|
| |
llvm-svn: 24130
|
|
|
|
|
|
| |
not compiling a whole program at a time :)
llvm-svn: 24129
|
|
|
|
| |
llvm-svn: 24124
|
|
|
|
|
|
|
| |
bad cases. This fixes Markus's second testcase in PR639, and should
seal it for good.
llvm-svn: 24123
|
|
|
|
|
|
|
|
| |
2. Iterate operands and not uses (performance.)
3. Some long pending comment changes.
llvm-svn: 24119
|
|
|
|
| |
llvm-svn: 24118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a special case hack for X86, make the hack more general: if an incoming argument
register is not used in any block other than the entry block, don't copy it to
a vreg. This helps us compile code like this:
%struct.foo = type { int, int, [0 x ubyte] }
int %test(%struct.foo* %X) {
%tmp1 = getelementptr %struct.foo* %X, int 0, uint 2, int 100
%tmp = load ubyte* %tmp1 ; <ubyte> [#uses=1]
%tmp2 = cast ubyte %tmp to int ; <int> [#uses=1]
ret int %tmp2
}
to:
_test:
lbz r3, 108(r3)
blr
instead of:
_test:
lbz r2, 108(r3)
or r3, r2, r2
blr
The (dead) copy emitted to copy r3 into a vreg for extra-block uses was
increasing the live range of r3 past the load, preventing the coalescing.
This implements CodeGen/PowerPC/reg-coallesce-simple.ll
llvm-svn: 24115
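The displacement 108 in `lbz r3, 108(r3)` is the byte offset that the getelementptr folds to. A small sketch of that arithmetic, assuming a 4-byte `int`, a 1-byte `ubyte`, and no padding between fields (the helper name is hypothetical):

```python
INT_SIZE = 4    # assumed size of `int` in bytes
UBYTE_SIZE = 1  # assumed size of `ubyte` in bytes


def field_offset(field_sizes, index):
    """Byte offset of field `index` in a struct with these field sizes (no padding assumed)."""
    return sum(field_sizes[:index])


# %struct.foo = { int, int, [0 x ubyte] }: the zero-length array adds no size.
foo_fields = [INT_SIZE, INT_SIZE, 0 * UBYTE_SIZE]

# getelementptr %struct.foo* %X, int 0, uint 2, int 100
#   int 0   -> element 0 of the pointer: 0 * sizeof(struct)
#   uint 2  -> third field: skip the two ints
#   int 100 -> element 100 of the ubyte array
offset = 0 * sum(foo_fields) + field_offset(foo_fields, 2) + 100 * UBYTE_SIZE
assert offset == 108
```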
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generating results in vregs that will need them. In the case of something
like this: CopyToReg((add X, Y), reg1024), we no longer emit code like
this:
reg1025 = add X, Y
reg1024 = reg1025
Instead, we emit:
reg1024 = add X, Y
Whoa! :)
llvm-svn: 24111
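The idea of producing a result directly in the register that needs it can be sketched with a toy emitter. Everything here is hypothetical (the tuple-based expression format, the function names, the textual output); it is not the selection DAG code, only an illustration of passing the destination down to the node that computes the value.

```python
from itertools import count


def select(node, out, fresh, dest=None):
    """Emit code for `node`; write the result to `dest` if given, else to a fresh vreg."""
    if isinstance(node, str):  # leaf: already a named value
        return node
    op, lhs, rhs = node
    a = select(lhs, out, fresh)
    b = select(rhs, out, fresh)
    reg = dest if dest is not None else f"reg{next(fresh)}"
    out.append(f"{reg} = {op} {a}, {b}")
    return reg


def copy_to_reg_naive(node, dest, out, fresh):
    """Compute into a temporary, then copy the temporary into `dest`."""
    tmp = select(node, out, fresh)
    out.append(f"{dest} = {tmp}")


def copy_to_reg_direct(node, dest, out, fresh):
    """Ask the computing node to write straight into `dest`."""
    if isinstance(node, str):
        out.append(f"{dest} = {node}")  # nothing to compute, a plain copy is still needed
    else:
        select(node, out, fresh, dest=dest)


naive, direct = [], []
copy_to_reg_naive(("add", "X", "Y"), "reg1024", naive, count(1025))
copy_to_reg_direct(("add", "X", "Y"), "reg1024", direct, count(1025))
assert naive == ["reg1025 = add X, Y", "reg1024 = reg1025"]
assert direct == ["reg1024 = add X, Y"]
```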
|
|
|
|
| |
llvm-svn: 24110
|
|
|
|
| |
llvm-svn: 24109
|
|
|
|
| |
llvm-svn: 24107
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements test/Regression/CodeGen/PowerPC/mul-neg-power-2.ll,
producing:
_foo:
slwi r2, r3, 1
subfic r3, r2, 63
blr
instead of:
_foo:
mulli r2, r3, -2
addi r3, r2, 63
blr
llvm-svn: 24106
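The two instruction sequences above compute the same value, which can be checked with a short model. This sketch assumes 32-bit registers with wraparound and reads `subfic r3, r2, 63` as `r3 = 63 - r2`; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers with wraparound


def with_multiply(x: int) -> int:
    """mulli r2, r3, -2 ; addi r3, r2, 63"""
    return (x * -2 + 63) & MASK


def with_shift(x: int) -> int:
    """slwi r2, r3, 1 ; subfic r3, r2, 63"""
    shifted = (x << 1) & MASK
    return (63 - shifted) & MASK


# Multiplying by -2 and adding equals shifting left by one and subtracting.
for x in (0, 1, 5, -5, 2**31 - 1, -(2**31)):
    assert with_multiply(x) == with_shift(x)
```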
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When inserting code for an addrec expression with a non-unit stride, be
more careful where we insert the multiply. In particular, insert the multiply
in the outermost loop we can, instead of the requested insertion point.
This lets LSR see the multiply in the right loop and strength-reduce it when
it gets there, where before it missed it.
This happens quite a bit in the test suite, for example, eliminating 2
multiplies in art, 3 in ammp, 4 in apsi, reducing from 1050 multiplies to
910 muls in galgel (!), from 877 to 859 in applu, and 36 to 30 in bzip2.
This speeds up galgel from 16.45s to 16.01s, applu from 14.21s to 13.94s and
fourinarow from 66.67s to 63.48s.
This implements Transforms/LoopStrengthReduce/nested-reduce.ll
llvm-svn: 24102
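The effect of placing the multiply in the outer loop can be sketched with a toy nested loop. This is not the SCEV expander or LSR code; the function names and the multiply counter are hypothetical, and the sketch only shows why a multiply in the outer loop is cheaper and easier to reduce to a running addition.

```python
def multiply_in_inner_loop(n_outer, n_inner, stride):
    """i * stride is recomputed on every inner iteration."""
    seen, muls = [], 0
    for i in range(n_outer):
        for j in range(n_inner):
            seen.append(i * stride + j)
            muls += 1
    return seen, muls


def multiply_in_outer_loop(n_outer, n_inner, stride):
    """The multiply is placed in the outermost loop where its operands are available."""
    seen, muls = [], 0
    for i in range(n_outer):
        base = i * stride
        muls += 1
        for j in range(n_inner):
            seen.append(base + j)
    return seen, muls


def strength_reduced(n_outer, n_inner, stride):
    """With the multiply in the outer loop, it reduces to an addition per outer iteration."""
    seen, muls = [], 0
    base = 0
    for _ in range(n_outer):
        for j in range(n_inner):
            seen.append(base + j)
        base += stride
    return seen, muls


inner = multiply_in_inner_loop(3, 4, 10)
outer = multiply_in_outer_loop(3, 4, 10)
reduced = strength_reduced(3, 4, 10)
assert inner[0] == outer[0] == reduced[0]
assert (inner[1], outer[1], reduced[1]) == (12, 3, 0)
```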
|