| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.
Also add some debugging statements.
llvm-svn: 125180
|
|
|
|
|
|
|
|
| |
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.
llvm-svn: 125065
|
|
|
|
|
|
|
|
| |
switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.
llvm-svn: 125056
|
|
|
|
|
|
|
|
| |
instcombine xform to exercise this.
Nothing forms exact udivs yet though. This is progress on PR8862
llvm-svn: 124992
|
|
|
|
| |
llvm-svn: 124977
|
|
|
|
|
|
| |
now, and this wasn't comparing some of their relevant bits anyhow.
llvm-svn: 124976
|
|
|
|
|
|
|
|
| |
are not sorted into sub+icmp.
This transforms another 1000 switches in gcc.c.
llvm-svn: 124826
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.
We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.
The testcase from README.txt now compiles into
decl %edi
cmpl $3, %edi
sbbl %eax, %eax
andl $1, %eax
ret
llvm-svn: 124724
|
|
|
|
|
|
|
|
| |
that might have changed been affected by a merge elsewhere will have been
removed from the function set, and it isn't needed for performance because we
call grow() ahead of time to prevent reallocations.
llvm-svn: 124717
|
|
|
|
|
|
|
| |
reassociation. No testcase, because I wasn't able to create a testcase
which actually demonstrates a problem.
llvm-svn: 124713
|
|
|
|
| |
llvm-svn: 124712
|
|
|
|
|
|
|
| |
(A+B) == A -> B == 0
A == (A+B) -> B == 0
llvm-svn: 124567
|
|
|
|
|
|
| |
The DEBUG() call at line 606 demands to see raw_ostream's definition. I have no idea why this seems to only break MSVC.
llvm-svn: 124545
|
|
|
|
| |
llvm-svn: 124535
|
|
|
|
| |
llvm-svn: 124534
|
|
|
|
| |
llvm-svn: 124527
|
|
|
|
| |
llvm-svn: 124526
|
|
|
|
| |
llvm-svn: 124522
|
|
|
|
|
|
| |
unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
|
|
|
|
|
|
|
|
|
|
| |
Modified patch by Adam Preuss.
This builds on the existing framework for block tracing, edge profiling and optimal edge profiling.
See -help-hidden for new flags.
For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs.
llvm-svn: 124515
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
benchmarks, and that it can be simplified to X/Y. (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case). This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too. This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too. It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.
llvm-svn: 124487
|
|
|
|
|
|
| |
functionality change.
llvm-svn: 124482
|
|
|
|
| |
llvm-svn: 124480
|
|
|
|
| |
llvm-svn: 124479
|
|
|
|
| |
llvm-svn: 124478
|
|
|
|
|
|
| |
the function equality set.
llvm-svn: 124475
|
|
|
|
| |
llvm-svn: 124469
|
|
|
|
|
|
|
|
| |
branches. PR8575, rdar://5134905, rdar://8911460.
- Allow codegen tail duplication to dup small return blocks after register
allocation is done.
llvm-svn: 124462
|
|
|
|
| |
llvm-svn: 124426
|
|
|
|
| |
llvm-svn: 124406
|
|
|
|
| |
llvm-svn: 124404
|
|
|
|
|
|
| |
that relationships like "i8* null" is equivalent to "i32* null".
llvm-svn: 124368
|
|
|
|
|
|
|
|
| |
operand being factorized (and erased) could occur several times in Ops,
resulting in freed memory being used when the next occurrence in Ops was
analyzed.
llvm-svn: 124287
|
|
|
|
|
|
| |
it. No functionality change!
llvm-svn: 124286
|
|
|
|
|
|
|
| |
merge vector<intptr_t>::push_back() and vector<void*>::push_back() because
Enumerate() doesn't realize that "i64* null" and "i8** null" are equivalent.
llvm-svn: 124285
|
|
|
|
|
|
| |
elements for type equivalence.
llvm-svn: 124284
|
|
|
|
|
|
|
| |
for now. It's controlled by the HasGlobalAliases variable which is not attached
to any flag yet.
llvm-svn: 124182
|
|
|
|
|
|
|
|
|
|
|
| |
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.
Also, update several off GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.
llvm-svn: 124134
|
|
|
|
|
|
| |
code.
llvm-svn: 124100
|
|
|
|
| |
llvm-svn: 124099
|
|
|
|
|
|
|
|
|
|
| |
occurs because instcombine sinks loads and inserts phis. This kicks in
on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in
spec2006 in some app that uses std::deque.
This resolves the last of rdar://7339113.
llvm-svn: 124090
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
common cases. This triggers a surprising number of times in SPEC2K6
because min/max idioms end up doing this. For example, code from the
STL ends up looking like this to SRoA:
%202 = load i64* %__old_size, align 8, !tbaa !3
%203 = load i64* %__old_size, align 8, !tbaa !3
%204 = load i64* %__n, align 8, !tbaa !3
%205 = icmp ult i64 %203, %204
%storemerge.i = select i1 %205, i64* %__n, i64* %__old_size
%206 = load i64* %storemerge.i, align 8, !tbaa !3
We can now promote both the __n and the __old_size allocas.
This addresses another chunk of rdar://7339113, poor codegen on
stringswitch.
llvm-svn: 124088
|
|
|
|
|
|
|
|
|
|
| |
clang's -Wuninitialized-experimental warning.
While these don't look like real bugs, clang's
-Wuninitialized-experimental analysis is stricter
than GCC's, and these fixes have the benefit
of being general nice cleanups.
llvm-svn: 124073
|
|
|
|
|
|
|
|
|
|
|
|
| |
that have PHI or select uses of their element pointers. This can often happen
when instcombine sinks two loads into a successor, inserting a phi or select.
With this patch, we can scalarize the alloca, but the pinned elements are not
yet promoted. This is still a win for large aggregates where only one element
is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen
on stringswitch).
llvm-svn: 124070
|
|
|
|
|
|
| |
on test-suite + SPEC2000 & SPEC2006.
llvm-svn: 124068
|
|
|
|
|
|
| |
No functionality change.
llvm-svn: 124067
|
|
|
|
|
|
|
|
| |
handle the "Transformation preventing inst" printing,
so that -scalarrepl -debug will always print the rejected
instruction. No functionality change.
llvm-svn: 124066
|
|
|
|
|
|
| |
X86 backend has been fixed.
llvm-svn: 124064
|
|
|
|
|
|
| |
how they should be checked.
llvm-svn: 123999
|
|
|
|
|
|
|
| |
A == B, and A > B, does not mean we can fold it to true. We still need to
check for A ? B (A unordered B).
llvm-svn: 123993
|