| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
benchmarks, and that it can be simplified to X/Y. (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case). This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too. This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too. It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.
llvm-svn: 124487
|
|
|
|
| |
llvm-svn: 124312
|
|
|
|
|
|
| |
which is more efficient than countPopulation - use it.
llvm-svn: 124283
|
|
|
|
|
|
|
| |
doesn't return immediately after then the insert position in UniqueSCEVs will
be out of date. No test because this is a memory corruption issue. Fixes PR9051!
llvm-svn: 124282
|
|
|
|
|
|
|
|
|
| |
a few loops accordingly. Should be no functional change.
This is a step for more accurate cost/benefit analysis of devirt/inlining
bonuses.
llvm-svn: 124275
|
|
|
|
| |
llvm-svn: 124260
|
|
|
|
| |
llvm-svn: 124188
|
|
|
|
| |
llvm-svn: 124184
|
|
|
|
|
|
|
|
|
|
|
| |
optimized code are:
(non-negative number)+(power-of-two) != 0 -> true
and
(x | 1) != 0 -> true
Instcombine knows about the second one of course, but only does it if X|1
has only one use. These fire thousands of times in the testsuite.
llvm-svn: 124183
|
|
|
|
|
|
| |
rather than interspersed. No functional change.
llvm-svn: 124168
|
|
|
|
|
|
|
|
|
|
|
| |
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.
Also, update several off GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.
llvm-svn: 124134
|
|
|
|
| |
llvm-svn: 124132
|
|
|
|
| |
llvm-svn: 124126
|
|
|
|
|
|
| |
robust against smarter optimizations, using the power of FileCheck.
llvm-svn: 124081
|
|
|
|
|
|
|
|
|
|
| |
clang's -Wuninitialized-experimental warning.
While these don't look like real bugs, clang's
-Wuninitialized-experimental analysis is stricter
than GCC's, and these fixes have the benefit
of being general nice cleanups.
llvm-svn: 124073
|
|
|
|
| |
llvm-svn: 124062
|
|
|
|
|
|
| |
"make check" alone.
llvm-svn: 124046
|
|
|
|
|
|
| |
that we can change from indirect to direct.
llvm-svn: 124045
|
|
|
|
|
|
|
|
| |
target function.
Fixes part of rdar://8546196
llvm-svn: 124044
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0.
This patch adds this transform and some related logic to InstructionSimplify
and removes some of the logic from instcombine (unfortunately not all because
there are several situations in which instcombine can improve things by making
new instructions, whereas instsimplify is not allowed to do this). At -O2 this
often results in more than 15% more simplifications by early-cse, and results in
hundreds of lines of bitcode being eliminated from the testsuite. I did see some
small negative effects in the testsuite, for example a few additional instructions
in three programs. One program, 483.xalancbmk, got an additional 35 instructions,
which seems to be due to a function getting an additional instruction and then
being inlined all over the place.
llvm-svn: 123911
|
|
|
|
| |
llvm-svn: 123842
|
|
|
|
|
|
|
|
|
|
|
| |
by indvars through the scev expander.
trunc(add x, y) --> add(trunc x, y). Currently SCEV largely folds the other way
which is probably wrong, but preserved to minimize churn. Instcombine doesn't
do this fold either, demonstrating a missed optz'n opportunity on code doing
add+trunc+add.
llvm-svn: 123838
|
|
|
|
| |
llvm-svn: 123832
|
|
|
|
|
|
|
|
| |
are pointing to the same object, one pointer is accessing the entire
object, and the other is access has a non-zero size. This prevents
TBAA from kicking in and saying NoAlias in such cases.
llvm-svn: 123775
|
|
|
|
|
|
|
|
|
|
| |
1) -> -1.
These were not recommended by my auto-simplifier since they don't fire often enough.
However they do fire from time to time, for example they remove one subtraction from
the final bitcode for 483.xalancbmk.
llvm-svn: 123755
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
common missed
simplification in fully optimized code. It occurs sporadically in the testsuite, and
many times in 403.gcc: the final bitcode has 131 fewer subtractions after this change.
The reason that the multiplies are not eliminated is the same reason that instcombine
did not catch this: they are used by other instructions (instcombine catches this with
a more general transform which in general is only profitable if the operands have only
one use).
llvm-svn: 123754
|
|
|
|
| |
llvm-svn: 123747
|
|
|
|
| |
llvm-svn: 123562
|
|
|
|
|
|
|
|
|
|
|
| |
half a million non-local queries, each of which would otherwise have triggered a
linear scan over a basic block.
Also fix a fixme for memory intrinsics which dereference pointers. With this,
we prove that a pointer is non-null because it was dereferenced by an intrinsic
112 times in llvm-test.
llvm-svn: 123533
|
|
|
|
|
|
|
|
|
| |
simplification present in fully optimized code (I think instcombine fails to
transform some of these when "X-Y" has more than one use). Fires here and
there all over the test-suite, for example it eliminates 8 subtractions in
the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc.
llvm-svn: 123442
|
|
|
|
|
|
|
|
|
|
|
| |
threading of shifts over selects and phis while there. This fires here and
there in the testsuite, to not much effect. For example when compiling spirit
it fires 5 times, during early-cse, resulting in 6 more cse simplifications,
and 3 more terminators being folded by jump threading, but the final bitcode
doesn't change in any interesting way: other optimizations would have caught
the opportunity anyway, only later.
llvm-svn: 123441
|
|
|
|
|
|
|
|
|
|
|
|
| |
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything. I fixed this in the constant folder as well. Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero. This is in accordance with the LangRef, but I must
admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.
llvm-svn: 123417
|
|
|
|
|
|
|
|
|
|
|
| |
Add methods for accessing the (single) entry / exit edge of a region. If no such
edge exists, null is returned. Both accessors return the start block of the
corresponding edge. The edge can finally be formed by utilizing
Region::getEntry() or Region::getExit();
Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>
llvm-svn: 123410
|
|
|
|
|
|
| |
the comment I added): an extern weak global may have a null address.
llvm-svn: 123373
|
|
|
|
|
|
|
|
| |
is "X != 0 -> X" when X is a boolean. This occurs a lot because of the way
llvm-gcc converts gcc's conditional expressions. Add this, and a few other
similar transforms for completeness.
llvm-svn: 123372
|
|
|
|
| |
llvm-svn: 123243
|
|
|
|
|
|
| |
to get a testcase.
llvm-svn: 123225
|
|
|
|
| |
llvm-svn: 123218
|
|
|
|
|
|
|
|
| |
the cause of our gcc bootstrap miscompare."
It didn't.
llvm-svn: 123215
|
|
|
|
|
|
| |
gcc bootstrap miscompare.
llvm-svn: 123207
|
|
|
|
|
|
|
|
| |
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.
llvm-svn: 123206
|
|
|
|
|
|
|
| |
IDs when available rather than using a mixture of IDs and textual name
comparisons.
llvm-svn: 123165
|
|
|
|
| |
llvm-svn: 123139
|
|
|
|
|
|
|
| |
NUW AddRec's much more aggressively. We now get a trip count
for @test2 in nsw.ll
llvm-svn: 123138
|
|
|
|
| |
llvm-svn: 123136
|
|
|
|
|
|
|
|
| |
a + {b,+,stride} into {a+b,+,stride} (because a is LIV),
then the resultant AddRec is NUW/NSW if the client says it
is.
llvm-svn: 123133
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
void f(int* begin, int* end) { std::fill(begin, end, 0); }
which turns into a != exit expression where one pointer is
strided and (thanks to step #1) known to not overflow, and
the other is loop invariant.
The observation here is that, though the IV is strided by
4 in this case, that the IV *has* to become equal to the
end value. It cannot "miss" the end value by stepping over
it, because if it did, the strided IV expression would
eventually wrap around.
Handle this by turning A != B into "A-B != 0" where the A-B
part is known to be NUW.
llvm-svn: 123131
|
|
|
|
|
|
|
| |
with GEP instructions are always NUW, because PHIs cannot wrap
the end of the address space.
llvm-svn: 123105
|
|
|
|
|
|
| |
that have the bit set.
llvm-svn: 123104
|
|
|
|
| |
llvm-svn: 122977
|