llvm-svn: 26411
Make this code more powerful by using ComputeMaskedBits instead of looking
for an AND operand. This lets us fold this:
int %test23(int %a) {
%tmp.1 = and int %a, 1
%tmp.2 = seteq int %tmp.1, 0
%tmp.3 = cast bool %tmp.2 to int ;; xor tmp1, 1
ret int %tmp.3
}
into: xor (and a, 1), 1
llvm-svn: 26396
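A standalone way to sanity-check the fold above (a hedged sketch in C++, not from the tree; the function names are made up): ((a & 1) == 0) viewed as an int has the same value as (a & 1) ^ 1, because the and leaves only the low bit.

#include <cassert>

// before: seteq (and a, 1), 0, cast to int.  after: xor (and a, 1), 1.
int before(int a) { return ((a & 1) == 0) ? 1 : 0; }
int after(int a)  { return (a & 1) ^ 1; }

int main() {
  for (int a = -1000; a <= 1000; ++a)
    assert(before(a) == after(a));
  return 0;
}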
and (A-B) == A -> B == 0
llvm-svn: 26394
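The identity is exact under wraparound arithmetic: A - B == A holds precisely when B == 0. A minimal hedged check (unsigned, so overflow is defined):

#include <cassert>
#include <cstdint>

// (A - B) == A  <=>  B == 0, under modular arithmetic.
int main() {
  for (uint32_t a = 0; a < 300; ++a)
    for (uint32_t b = 0; b < 300; ++b)
      assert(((a - b) == a) == (b == 0));
  return 0;
}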
InstCombine/or.ll:test23.
llvm-svn: 26385
in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))".
We now compile this loop:
LBB1_1: ; no_exit
add r6, r2, r3
subf r3, r2, r3
cmpwi cr0, r2, 0
addi r7, r5, 4
lwz r2, 0(r5)
addi r4, r4, 1
blt cr0, LBB1_4 ; no_exit
LBB1_3: ; no_exit
mr r3, r6
LBB1_4: ; no_exit
cmpwi cr0, r4, 16
mr r5, r7
bne cr0, LBB1_1 ; no_exit
into this instead:
LBB1_1: ; no_exit
srawi r6, r2, 31
add r2, r2, r6
xor r6, r2, r6
addi r7, r5, 4
lwz r2, 0(r5)
addi r4, r4, 1
add r3, r3, r6
cmpwi cr0, r4, 16
mr r5, r7
bne cr0, LBB1_1 ; no_exit
llvm-svn: 26356
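Restating the transform in C-like terms (a hedged sketch outside LLVM; the function names are made up): c ? (x + y) : (x - y) is rewritten to x + (c ? y : -y), which is what lets the loop above collapse to the srawi/add/xor (integer-abs style) form.

#include <cassert>

// Before: both results are computed and one is selected.
int before(bool c, int x, int y) { return c ? (x + y) : (x - y); }
// After: the add is shared; only the select's operand changes sign.
int after(bool c, int x, int y)  { return x + (c ? y : -y); }

int main() {
  for (int x = -20; x <= 20; ++x)
    for (int y = -20; y <= 20; ++y) {
      assert(before(true, x, y) == after(true, x, y));
      assert(before(false, x, y) == after(false, x, y));
    }
  return 0;
}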
llvm-svn: 26287
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.
llvm-svn: 26238
llvm-svn: 26155
for a sign extension.
This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc.
llvm-svn: 26152
of the input. This fixes the mediabench/gsm/toast failure last night.
llvm-svn: 26138
llvm-svn: 26135
llvm-svn: 26134
1. Teach GetConstantInType to handle boolean constants.
2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits.
Testcase here: set.ll:test22
3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast
between the shift and the and. More aggressive bit folding for other reasons
was turning signed shr's into unsigned shr's, leaving the noop cast in
the way.
llvm-svn: 26131
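For point 3, the underlying identity on an unsigned value is that ((x >> c1) & c2) == 0 tests the same bits as (x & (c2 << c1)) == 0, so the compare can be phrased on x directly even when a noop cast sits between the shift and the and. A small hedged check with fixed constants (not the actual test):

#include <cassert>
#include <cstdint>

// ((x >> 3) & 0x5) == 0 tests the same bits as (x & (0x5 << 3)) == 0.
bool before(uint32_t x) { return ((x >> 3) & 0x5u) == 0; }
bool after(uint32_t x)  { return (x & (0x5u << 3)) == 0; }

int main() {
  for (uint32_t x = 0; x < (1u << 16); ++x)
    assert(before(x) == after(x));
  return 0;
}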
This allows us to simplify on conditions where bits are not known, but they
are not demanded either! This also fixes a couple of bugs in
ComputeMaskedBits that were exposed during this work.
In the future, swaths of instcombine should be removed, as this code
subsumes a bunch of ad-hockery.
llvm-svn: 26122
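The demanded-bits idea, sketched informally (this is not the real API): if a value's users only look at some of its bits, operations that only affect the other bits can be dropped even though nothing is known about them. For example, when only the low byte is demanded, an or against 0xFFFFFF00 is dead:

#include <cassert>
#include <cstdint>

// Only the low 8 bits are demanded by the final mask, so an or that only
// touches the high 24 bits contributes nothing and can be removed.
uint8_t before(uint32_t x) { return (uint8_t)((x | 0xFFFFFF00u) & 0xFFu); }
uint8_t after(uint32_t x)  { return (uint8_t)(x & 0xFFu); }

int main() {
  for (uint32_t x = 0; x < (1u << 16); ++x)
    assert(before(x) == after(x));
  assert(before(0xDEADBEEFu) == after(0xDEADBEEFu));
  return 0;
}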
llvm-svn: 26088
1. Teach it new tricks: in particular how to propagate through signed shr and sexts.
2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero.
3. Teach instcombine (AND X, C) to fold when we know all C bits of X.
This implements Regression/Transforms/InstCombine/bittest.ll, and allows
future things to be simplified.
llvm-svn: 26087
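A rough sketch of points 2 and 3 with made-up names (not the actual ComputeMaskedBits signature): carrying both a known-zero and a known-one mask lets (AND X, C) fold to a constant once every bit selected by C is known in X.

#include <cstdint>

// Hypothetical known-bits record: a bit may be known 0, known 1, or unknown.
struct Known {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1
};

// Known bits of (a & b): a result bit is 0 if it is 0 in either input,
// and 1 only if it is 1 in both inputs.
Known knownAnd(Known a, Known b) {
  return {a.Zero | b.Zero, a.One & b.One};
}

// Point 3: if every bit selected by mask C is known in X, then (X & C)
// is a constant and the and can be folded away.
bool foldableAnd(Known x, uint32_t c, uint32_t &result) {
  if (((x.Zero | x.One) & c) != c)
    return false;      // some bit of X under the mask is still unknown
  result = x.One & c;  // the known-one bits under the mask
  return true;
}

int main() {
  Known x = {0xA, 0x5}; // low nibble of X proven to be 0b0101
  uint32_t folded;
  bool ok = foldableAnd(x, 0xF, folded); // all four mask bits are known
  return (ok && folded == 0x5) ? 0 : 1;
}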
optimization where we reduce the number of bits in AND masks when possible.
llvm-svn: 26056
instruction onto the worklist (in case they are now dead).
Add a really trivial local DSE implementation to help out bitfield code.
We now fold this:
struct S {
unsigned char a : 1, b : 1, c : 1, d : 2, e : 3;
S();
};
S::S() : a(0), b(0), c(1), d(0), e(6) {}
to this:
void %_ZN1SC1Ev(%struct.S* %this) {
entry:
%tmp.1 = getelementptr %struct.S* %this, int 0, uint 0
store ubyte 38, ubyte* %tmp.1
ret void
}
much earlier (in gccas instead of only in gccld after DSE runs).
llvm-svn: 26050
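The "really trivial local DSE" can be pictured as one pass over a single block that drops a store when the very next memory operation on the same pointer is another store, which is exactly what collapses the chain of bitfield initializations above into one store. A toy hedged sketch with made-up types (not LLVM's IR classes):

#include <string>
#include <vector>

// Toy instruction: either a store to a named pointer, or something that
// may read memory. Purely illustrative.
struct Inst {
  enum Kind { Store, MayRead } kind;
  std::string ptr; // only meaningful for Store
};

// A store is dead if it is immediately overwritten by another store to the
// same pointer with no possible read in between.
void trivialLocalDSE(std::vector<Inst> &block) {
  std::vector<Inst> out;
  for (const Inst &i : block) {
    if (!out.empty() && out.back().kind == Inst::Store &&
        i.kind == Inst::Store && i.ptr == out.back().ptr)
      out.pop_back(); // previous store is overwritten and never read
    out.push_back(i);
  }
  block = out;
}

int main() {
  std::vector<Inst> b = {{Inst::Store, "p"}, {Inst::Store, "p"}, {Inst::MayRead, ""}};
  trivialLocalDSE(b);
  return b.size() == 2 ? 0 : 1; // first store to p is removed
}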
llvm-svn: 26045
llvm-svn: 26040
is just as efficient as MVIZ and is also more general.
Fix a few minor bugs introduced in recent patches.
llvm-svn: 26036
mask. This allows the code to be simpler and more efficient.
Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive.
llvm-svn: 26035
llvm-svn: 26034
'demanded bits', inspired by Nate's work in the dag combiner. This isn't
complete, but needs two unrelated instcombiner changes to continue.
llvm-svn: 26033
Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only].
Tested with: rem.ll:test5, div.ll:test10
llvm-svn: 26003
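For unsigned values this is just adding the two power-of-two exponents: dividing by (1 << C2) << N is dividing by 1 << (C2 + N), i.e. shifting right by N + C2. A quick hedged check with C2 fixed at 3:

#include <cassert>
#include <cstdint>

// A udiv ((1 << C2) << N)  ==  A >> (N + C2), for unsigned A.
int main() {
  const uint32_t C2 = 3; // the divisor base is 1 << 3 == 8
  for (uint32_t n = 0; n < 8; ++n)
    for (uint32_t a = 0; a < 5000; a += 7)
      assert(a / ((1u << C2) << n) == a >> (n + C2));
  return 0;
}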
llvm-svn: 25514
need the float->double part.
llvm-svn: 25452
llvm-svn: 25363
llvm-svn: 25299
llvm-svn: 25294
llvm-svn: 25292
llvm-svn: 25137
llvm-svn: 25130
the shifts.
This allows us to fold this (which is the 'integer add a constant' sequence
from cozmic's scheme compiler):
int %x(uint %anf-temporary776) {
%anf-temporary777 = shr uint %anf-temporary776, ubyte 1
%anf-temporary800 = cast uint %anf-temporary777 to int
%anf-temporary804 = shl int %anf-temporary800, ubyte 1
%anf-temporary805 = add int %anf-temporary804, -2
%anf-temporary806 = or int %anf-temporary805, 1
ret int %anf-temporary806
}
into this:
int %x(uint %anf-temporary776) {
%anf-temporary776 = cast uint %anf-temporary776 to int
%anf-temporary776.mask1 = add int %anf-temporary776, -2
%anf-temporary805 = or int %anf-temporary776.mask1, 1
ret int %anf-temporary805
}
Note that instcombine already knew how to eliminate the AND that the two
shifts fold into. This is tested by InstCombine/shift.ll:test26
-Chris
llvm-svn: 25128
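In C-like terms (unsigned, so the wraparound in the add is defined), the before and after functions are roughly the ones below: the shr/shl pair only clears the low bit, and that masking is swallowed by the following add/or, which is the AND elimination mentioned above. A hedged illustration, not the actual test:

#include <cassert>
#include <cstdint>

// Before: ((a >> 1) << 1) clears the low bit, then the add and or run on it.
// After:  the shift pair is gone; only the add and or remain.
uint32_t before(uint32_t a) { return (((a >> 1) << 1) - 2u) | 1u; }
uint32_t after(uint32_t a)  { return (a - 2u) | 1u; }

int main() {
  for (uint32_t a = 0; a < (1u << 16); ++a)
    assert(before(a) == after(a));
  return 0;
}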
llvm-svn: 25126
functionality changes.
llvm-svn: 25125
Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it preserved
in the bytecode representation. That's coming up next.
llvm-svn: 24196
a few times in crafty:
OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1]
NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0]
OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1]
NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0]
OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1]
NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0]
OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1]
NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0]
Which all turn into shrs.
llvm-svn: 24190
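The reason this is safe: when both operands are known non-negative (their sign bits are known zero), signed and unsigned division give the same result, and the unsigned division by 8 is then a plain right shift. A hedged illustration:

#include <cassert>
#include <cstdint>

// For non-negative x, "div int x, 8" == "div uint x, 8" == x >> 3.
int main() {
  for (int32_t x = 0; x < 100000; x += 13) {
    assert(x / 8 == (int32_t)((uint32_t)x / 8u));
    assert((uint32_t)x / 8u == (uint32_t)x >> 3);
  }
  return 0;
}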
8 times in vortex, allowing the srems to be turned into shrs:
OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1]
NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0]
OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1]
NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0]
OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1]
NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0]
OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1]
NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0]
OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2]
NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0]
OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2]
NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0]
OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1]
NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0]
OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1]
NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0]
It also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.
llvm-svn: 24189
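Same reasoning for remainder: with both operands known non-negative, srem and urem agree, and an unsigned remainder by a power of two is a bit mask. A hedged illustration with divisor 16:

#include <cassert>
#include <cstdint>

// For non-negative x, "rem int x, 16" == "rem uint x, 16" == x & 15.
int main() {
  for (int32_t x = 0; x < 100000; x += 7) {
    assert(x % 16 == (int32_t)((uint32_t)x % 16u));
    assert((uint32_t)x % 16u == ((uint32_t)x & 15u));
  }
  return 0;
}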
/Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158
bad cases. This fixes Markus's second testcase in PR639, and should
seal it for good.
llvm-svn: 24123
This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1)
llvm-svn: 24081
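The size arithmetic behind the example, assuming the 4-byte int of the original code: a byte count of 4*x + 4 is exactly x + 1 ints, so the untyped allocation can become a typed allocation of x + 1 elements. A trivial hedged check:

#include <cassert>
#include <cstdint>

int main() {
  const uint32_t IntSize = 4; // assumed size of int in the example
  for (uint32_t x = 0; x < 1000; ++x)
    assert(4 * x + 4 == (x + 1) * IntSize);
  return 0;
}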
change.
llvm-svn: 24076
llvm-svn: 24056
PR640
llvm-svn: 24046
llvm-svn: 24033
into: malloc int, (2*X)
llvm-svn: 24032
(malloc [25 x int]) directly without having to convert to
(malloc [100 x sbyte]) first.
llvm-svn: 24031
llvm-svn: 24030
fixes a very slow compile in PR639.
llvm-svn: 24011