| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 122828
|
|
|
|
| |
llvm-svn: 122827
|
|
|
|
| |
llvm-svn: 122826
|
|
|
|
|
|
|
|
| |
the "leader table", and
rename methods to make it much more clear what they're doing.
llvm-svn: 122823
|
|
|
|
|
|
| |
in a corner case.
llvm-svn: 122822
|
|
|
|
|
|
|
|
|
|
| |
case where a static caller is itself inlined everywhere else, and
thus may go away if it doesn't get too big due to inlining other
things into it. If there are references to the caller other than
calls, it will not be removed; account for this.
This results in same-day completion of the case in PR8853.
llvm-svn: 122821
|
|
|
|
|
|
|
|
|
| |
value number for them. This avoids adding them to the various value numbering
tables, resulting in a minor (~3%) speedup for GVN on 40.gcc.
llvm-svn: 122819
|
|
|
|
| |
llvm-svn: 122817
|
|
|
|
| |
llvm-svn: 122815
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when safe.
The testcase is basically this nested loop:
void foo(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j+i*100] = 0;
}
which gets turned into a single memset now. clang -O3 doesn't optimize
this yet though due to a phase ordering issue I haven't analyzed yet.
llvm-svn: 122806
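As a rough illustration of why the nested loop above can collapse into one memset, here is a minimal C sketch. The names `foo_loops` and `foo_memset` are made up for this example, and the second function is a hand-written equivalent, not the compiler's actual output.

```c
#include <assert.h>
#include <string.h>

/* The loop nest from the commit message: 100 x 100 single-byte stores. */
void foo_loops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

/* Hand-written equivalent (illustrative only): j + i*100 visits every
   index in 0..9999 exactly once, so one memset covers the same bytes. */
void foo_memset(char *X) {
  assert(X != NULL);
  memset(X, 0, 100 * 100);
}
```

Both functions leave the 10000-byte buffer in the same state; the second form just makes the contiguous range explicit.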
|
|
|
|
|
|
|
|
| |
instruction *after* the store. The store will always be deleted
if the transformation kicks in, so we'd do an N^2 scan of every
loop block. Whoops.
llvm-svn: 122805
|
|
|
|
|
|
| |
CodeGenPrepare (which is the default behavior).
llvm-svn: 122801
|
|
|
|
|
|
|
|
|
|
|
|
| |
FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will
probably drop the simple fixed point and either use RPO iteration or Duncan's
approach in instsimplify of only revisiting instructions that have changed.
The next step is to preserve LoopSimplify. This looks like it won't be too hard,
although the pass manager doesn't actually seem to respect when non-loop passes
claim to preserve LCSSA or LoopSimplify. This will have to be fixed.
llvm-svn: 122791
|
|
|
|
|
|
|
| |
stop setting NSW: signed overflow is possible. Thanks to Dan
for pointing these out.
llvm-svn: 122790
|
|
|
|
| |
llvm-svn: 122788
|
|
|
|
|
|
|
|
| |
PHIs of GEPs. For the moment, have GlobalsModRef handle this conservatively by
simply removing the value from its maps.
llvm-svn: 122787
|
|
|
|
|
|
|
| |
invalidated by stores, so they can be handled as 'simple'
operations.
llvm-svn: 122785
|
|
|
|
|
|
| |
almost-but-not-quite-identical code. No intended functionality change.
llvm-svn: 122760
|
|
|
|
|
|
|
|
|
|
|
| |
that are allowed to have metadata operands are intrinsic calls,
and the only ones that take metadata currently return void.
Just reject all void instructions, which should not be value
numbered anyway. To future proof things, add an assert to the
getHashValue impl for calls to check that metadata operands
aren't present.
llvm-svn: 122759
|
|
|
|
|
|
|
|
|
|
| |
nested values, so they can change and drop to null, which can
change the hash and cause havoc.
It turns out that it isn't a good idea to value number stuff
with metadata operands anyway, so... don't.
llvm-svn: 122758
|
|
|
|
|
|
|
| |
InstructionSimplify on instructions that didn't change since the
last time round the loop.
llvm-svn: 122745
|
|
|
|
|
|
|
|
|
| |
capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of
CodeGenPrepare time (which itself is 10% of time spent in the backend).
This is progress towards PR8889.
llvm-svn: 122741
|
|
|
|
|
|
|
| |
elimination as well. This deletes 60 stores in 176.gcc
that largely come from bitfield code.
llvm-svn: 122736
|
|
|
|
|
|
| |
speeding earlycse up by 6%.
llvm-svn: 122733
|
|
|
|
|
|
|
| |
store->load forwarding. This allows EarlyCSE to zap 600 more
loads from 176.gcc.
llvm-svn: 122732
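For readers unfamiliar with store->load forwarding, the following C sketch shows the idea at source level. The function names are hypothetical, and the "after" version is written by hand to show the effect; it is not taken from EarlyCSE's output.

```c
#include <assert.h>

/* Before: the value just stored through p is loaded back from memory. */
int store_then_load(int *p, int v) {
  assert(p != 0);
  *p = v;
  return *p + 1;
}

/* After forwarding (illustrative): nothing can modify *p between the
   store and the load, so the load is replaced by the stored value v. */
int store_forwarded(int *p, int v) {
  assert(p != 0);
  *p = v;
  return v + 1;
}
```

The store itself stays in place; only the redundant load disappears.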
|
|
|
|
|
|
| |
by their pointer instead of using MemoryValue to wrap it.
llvm-svn: 122731
|
|
|
|
| |
llvm-svn: 122729
|
|
|
|
|
|
|
| |
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.
llvm-svn: 122727
|
|
|
|
| |
llvm-svn: 122725
|
|
|
|
| |
llvm-svn: 122724
|
|
|
|
|
|
| |
allocator. This speeds up early cse by about 20%
llvm-svn: 122723
|
|
|
|
| |
llvm-svn: 122720
|
|
|
|
|
|
|
| |
of instcombine that is currently in the middle of the loop pass pipeline. This
commit only checks in the pass; it will hopefully be enabled by default later.
llvm-svn: 122719
|
|
|
|
| |
llvm-svn: 122718
|
|
|
|
|
|
| |
Teach it to CSE the rest of the non-side-effecting instructions.
llvm-svn: 122716
|
|
|
|
|
|
| |
Add a testcase.
llvm-svn: 122715
|
|
|
|
|
|
| |
so that Dominators.h is *just* domtree. Also prune #includes a bit.
llvm-svn: 122714
|
|
|
|
| |
llvm-svn: 122713
|
|
|
|
|
|
|
|
| |
sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy. Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.
llvm-svn: 122712
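The kind of hazard this check guards against can be sketched in C as follows. This is an assumed example, not the commit's testcase: the loop's stores modify values that its own later loads read, so it is not a plain copy, and rewriting it as a block copy would change the result.

```c
#include <assert.h>
#include <string.h>

/* Each iteration reads the element the previous iteration just wrote,
   so a[0] is smeared across the whole array. */
void smear_loop(int *a, int n) {
  assert(n >= 1);
  for (int i = 0; i + 1 < n; ++i)
    a[i + 1] = a[i];
}

/* What a naive "copy a[0..n-2] into a[1..n-1]" rewrite would compute:
   the original values shifted by one slot (memmove tolerates overlap). */
void shift_copy(int *a, int n) {
  assert(n >= 1);
  memmove(a + 1, a, (size_t)(n - 1) * sizeof(int));
}
```

On `{1, 2, 3, 4}` the loop yields `{1, 1, 1, 1}` while the block copy yields `{1, 1, 2, 3}`, which is why the source of the would-be memcpy must also be checked against the loop's stores.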
|
|
|
|
|
|
|
| |
mess with it. We'd rather peel/unroll it than convert all of its
stores into memsets.
llvm-svn: 122711
|
|
|
|
|
|
| |
another function.
llvm-svn: 122705
|
|
|
|
|
|
|
| |
blocks in a loop, instead of just the header block. This makes it more
aggressive, able to handle Duncan's Ada examples.
llvm-svn: 122704
|
|
|
|
| |
llvm-svn: 122703
|
|
|
|
|
|
|
|
| |
isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was
*just* a tree and didn't have DFS numbers. Checking DFS numbers is faster
and easier than "limiting the search of the tree".
llvm-svn: 122702
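A minimal sketch of why DFS numbers make dominance queries cheap, assuming a toy tree stored as a parent array. The `DomTree` struct and all function names here are hypothetical and are not LLVM's DominatorTree API. Once every node has DFS entry and exit numbers, "a dominates b" is an interval-nesting test with no tree walk.

```c
#include <assert.h>

enum { MAX_NODES = 16 };

/* Hypothetical representation: parent[v] is v's parent, -1 for the root. */
struct DomTree {
  int n;
  int parent[MAX_NODES];
  int dfs_in[MAX_NODES];
  int dfs_out[MAX_NODES];
};

/* Assign entry/exit numbers to v and its subtree in depth-first order. */
static void number_subtree(struct DomTree *t, int v, int *clock) {
  t->dfs_in[v] = (*clock)++;
  for (int c = 0; c < t->n; ++c)
    if (t->parent[c] == v)
      number_subtree(t, c, clock);
  t->dfs_out[v] = (*clock)++;
}

void assign_dfs_numbers(struct DomTree *t, int root) {
  assert(t->n > 0 && t->n <= MAX_NODES && t->parent[root] == -1);
  int clock = 0;
  number_subtree(t, root, &clock);
}

/* Constant time: a dominates b iff b's DFS interval nests inside a's. */
int dominates(const struct DomTree *t, int a, int b) {
  return t->dfs_in[a] <= t->dfs_in[b] && t->dfs_out[b] <= t->dfs_out[a];
}

/* Baseline for comparison: walk parent links upward from b. */
int dominates_by_walk(const struct DomTree *t, int a, int b) {
  for (int v = b; v != -1; v = t->parent[v])
    if (v == a)
      return 1;
  return 0;
}
```

The numbering pass here is deliberately simple (it rescans all nodes to find children); the point is only that the query itself needs two comparisons.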
|
|
|
|
| |
llvm-svn: 122701
|
|
|
|
|
|
|
|
|
|
| |
described in the PR, the pass could break LCSSA form when inserting preheaders.
It probably would be easy enough to fix this, but since currently we always go
into LCSSA form after running this pass, doing so is not urgent.
llvm-svn: 122695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
header for now for memset/memcpy opportunities. It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
loops" into 2 basic block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
  for (j=0; j<MAX_history; ++j) {
    history_new[i][j+1] = history[2*i][j];
  }
Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
llvm-svn: 122685
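To make the "history" loop transformation concrete, here is a hedged C sketch. The element type (`int`), the array shapes, and the constants `MAX_history = 8` and `NUM_STATES = 4` are assumptions chosen for illustration; the benchmark's real declarations are not shown in the commit message.

```c
#include <assert.h>
#include <string.h>

enum { MAX_history = 8, NUM_STATES = 4 }; /* hypothetical sizes */

/* The loop as quoted in the commit message, for one fixed row i. */
void copy_loop(int history_new[NUM_STATES][MAX_history + 1],
               int history[2 * NUM_STATES][MAX_history], int i) {
  for (int j = 0; j < MAX_history; ++j)
    history_new[i][j + 1] = history[2 * i][j];
}

/* Hand-written equivalent: the loop copies MAX_history contiguous
   elements between two distinct arrays, which is exactly a memcpy. */
void copy_memcpy(int history_new[NUM_STATES][MAX_history + 1],
                 int history[2 * NUM_STATES][MAX_history], int i) {
  assert(i >= 0 && i < NUM_STATES);
  memcpy(&history_new[i][1], &history[2 * i][0],
         MAX_history * sizeof(int));
}
```

The rewrite is valid here because source and destination are separate arrays, so no store in the loop can change a value that a later iteration reads.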
|
|
|
|
| |
llvm-svn: 122683
|
|
|
|
| |
llvm-svn: 122682
|
|
|
|
|
|
|
| |
size of a loop header instead of its own code size estimator.
This allows it to handle bitcasts etc more precisely.
llvm-svn: 122681
|