| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3773
llvm-svn: 208827
|
|
|
|
|
|
|
|
|
| |
This reverts commit r204912, and follow-up commit r204948.
This introduced a performance regression, and the fix is not completely
clear yet.
llvm-svn: 205010
|
|
|
|
|
|
|
|
|
| |
Fixes a miscompile introduced in r204912. It would miscompile code like
(unsigned)(a + -49) <= 5U. The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.
llvm-svn: 204948
|
|
|
|
|
|
|
|
|
|
| |
Transform:
icmp X+Cst2, Cst
into:
icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.
llvm-svn: 204912
|
|
|
|
|
|
| |
This is common in bitfield code.
llvm-svn: 194925
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several architectures use the same instruction to perform both a comparison and
a subtract. The instruction selection framework does not allow to consider
different basic blocks to expose such fusion opportunities.
Therefore, these instructions are “merged” by CSE at MI IR level.
To increase the likelihood of CSE to apply in such situation, we reorder the
operands of the comparison, when they have the same complexity, so that they
matches the order of the most frequent subtract.
E.g.,
icmp A, B
...
sub B, A
<rdar://problem/14514580>
llvm-svn: 190352
|
|
|
|
| |
llvm-svn: 188926
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
easier debugging. No functionality change.
This conversion was done with the following bash script:
find test/Transforms -name "*.ll" | \
while read NAME; do
echo "$NAME"
if ! grep -q "^; *RUN: *llc" $NAME; then
TEMP=`mktemp -t temp`
cp $NAME $TEMP
sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
while read FUNC; do
sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)define\([^@]*\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3define\4@$FUNC(/g" $TEMP
done
mv $TEMP $NAME
fi
done
llvm-svn: 186269
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
functionality change.
This update was done with the following bash script:
find test/Transforms -name "*.ll" | \
while read NAME; do
echo "$NAME"
if ! grep -q "^; *RUN: *llc" $NAME; then
TEMP=`mktemp -t temp`
cp $NAME $TEMP
sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
while read FUNC; do
sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP
done
mv $TEMP $NAME
fi
done
llvm-svn: 186268
|
|
|
|
|
|
|
|
|
|
| |
The following transforms are valid if -C is a power of 2:
(icmp ugt (xor X, C), ~C) -> (icmp ult X, C)
(icmp ult (xor X, C), -C) -> (icmp uge X, C)
These are nice, they get rid of the xor.
llvm-svn: 185915
|
|
|
|
|
|
| |
Tests were added in r185910 somehow.
llvm-svn: 185912
|
|
|
|
| |
llvm-svn: 185910
|
|
|
|
|
|
|
|
|
| |
C1-X <u C2 -> (X|(C2-1)) == C1
C1-X >u C2 -> (X|C2) == C1
X-C1 <u C2 -> (X & -C2) == C1
X-C1 >u C2 -> (X & ~C2) == C1
llvm-svn: 185909
|
|
|
|
|
|
|
|
|
|
|
| |
Back in r179493 we determined that two transforms collided with each
other. The fix back then was to reorder the transforms so that the
preferred transform would give it a try and then we would try the
secondary transform. However, it was noted that the best approach would
canonicalize one transform into the other, removing the collision and
allowing us to optimize IR given to us in that form.
llvm-svn: 185808
|
|
|
|
| |
llvm-svn: 185737
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This transform allows us to turn IR that looks like:
%1 = icmp eq i64 %b, 0
%2 = icmp ult i64 %a, %b
%3 = or i1 %1, %2
ret i1 %3
into:
%0 = add i64 %b, -1
%1 = icmp uge i64 %0, %a
ret i1 %1
which means we go from lowering:
cmpq %rsi, %rdi
setb %cl
testq %rsi, %rsi
sete %al
orb %cl, %al
ret
to lowering:
decq %rsi
cmpq %rdi, %rsi
setae %al
ret
llvm-svn: 185677
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We may, after other optimizations, find ourselves with IR that looks
like:
%shl = shl i32 1, %y
%cmp = icmp ult i32 %shl, 32
Instead, we should just compare the shift count:
%cmp = icmp ult i32 %y, 5
llvm-svn: 185242
|
|
|
|
| |
llvm-svn: 183433
|
|
|
|
|
|
|
| |
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.
llvm-svn: 181518
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.
Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
%shr = lshr i32 %X, 4
%and = and i32 %shr, 15
%add = add i32 %and, -14
%tobool = icmp ne i32 %add, 0
into:
%and = and i32 %X, 240
%tobool = icmp ne i32 %and, 224
llvm-svn: 179493
|
|
|
|
|
|
|
|
| |
The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0
llvm-svn: 179386
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allows LLVM to optimize sequences like the following:
%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y
into:
%cmp = icmp sge i32 %x, %y
as well as:
%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2
into:
%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x
llvm-svn: 179316
|
|
|
|
| |
llvm-svn: 177863
|
|
|
|
|
|
|
|
| |
This simplification happens at 2 places :
- using the nsw attribute when the shl / mul is used by a sign test
- when the shl / mul is compared for (in)equality to zero
llvm-svn: 177856
|
|
|
|
|
|
|
|
|
| |
It enables to work with a smaller constant, which is target friendly for those which can compare to immediates.
It also avoids inserting a shift in favor of a trunc, which can be free on some targets.
This used to work until LLVM-3.1, but regressed with the 3.2 release.
llvm-svn: 175270
|
|
|
|
|
|
| |
Also add an assert to avoid confusion in the code where is known that C1 <= C2.
llvm-svn: 171310
|
|
|
|
|
|
|
| |
if C1 and C2 differ only with one bit.
Fixes PR14708.
llvm-svn: 171270
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.
Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.
Patch by Kevin Schoedel
llvm-svn: 170576
|
|
|
|
|
|
|
|
|
| |
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.
This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.
llvm-svn: 170128
|
|
|
|
| |
llvm-svn: 170020
|
|
|
|
|
|
|
|
|
| |
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was. There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.
llvm-svn: 168181
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.
stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us
where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.
llvm-svn: 158297
|
|
|
|
|
|
| |
'gep null' when the icmp predicate is unsigned (or is signed without inbounds).
llvm-svn: 151467
|
|
|
|
|
|
| |
MultiSource/Applications/lua.
llvm-svn: 151463
|
|
|
|
|
|
|
| |
by using llvm::isIdentifiedObject. Also teach it to handle GEPs that have
the same base pointer and constant operands. Fixes PR11238!
llvm-svn: 151449
|
|
|
|
|
|
|
|
|
|
|
| |
of the indices.
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.
Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).
llvm-svn: 151055
|
|
|
|
| |
llvm-svn: 150979
|
|
|
|
|
|
|
|
| |
pointer but use different types, expand the offset calculation and to the compare on the offset if profitable.
This came up in SmallVector code.
llvm-svn: 150962
|
|
|
|
|
|
| |
width.
llvm-svn: 149151
|
|
|
|
|
|
|
| |
Unfortunately I also had to disable constant-pool-sharing.ll the code it tests has been
updated to use the IL logic.
llvm-svn: 149148
|
|
|
|
|
|
| |
ConstantVector's to integer type.
llvm-svn: 149110
|
|
|
|
| |
llvm-svn: 147580
|
|
|
|
|
|
| |
style issues and confusing comment
llvm-svn: 145618
|
|
|
|
| |
llvm-svn: 145570
|
|
|
|
|
|
|
|
| |
compare when the AND has more than one use.
This can pessimize code, inequalities are generally more expensive.
llvm-svn: 134379
|
|
|
|
|
|
| |
The backend already knew this trick.
llvm-svn: 132915
|
|
|
|
|
|
| |
"zext" and the "and" have one use.
llvm-svn: 132897
|
|
|
|
|
|
| |
side of the icmp is an exact shift.
llvm-svn: 130954
|
|
|
|
|
|
|
|
|
|
| |
when X has multiple uses. This is useful for exposing secondary optimizations,
but the X86 backend isn't ready for this when X has a single use. For example,
this can disable load folding.
This is inching towards resolving PR6627.
llvm-svn: 130238
|
|
|
|
| |
llvm-svn: 127366
|