| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 60779
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Merge the 'None' result into 'Normal', making loads
and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
distinguish between the cases when memdep knows the value is
produced from when we just know it may be changed.
3. Move some of the logic for determining whether readonly calls
are CSEs into memdep instead of it being in GVN. This still
leaves verification that the arguments are the same to GVN to
let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use
getModRefInfo(CallSite,CallSite) instead of doing something
very weak. This only really matters for things like DSA, but
someday maybe we'll have some other decent context sensitive
analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
a) readonly call CSE is slightly simpler
b) I eliminated the "getDependencyFrom" chaining for load
elimination and load CSE doesn't have to worry about
volatile (they are always clobbers) anymore.
c) GVN no longer does any 'lastLoad' caching, leaving it to
memdep.
7. The logic in DSE is simplified a bit and sped up. A potentially
unsafe case was eliminated.
llvm-svn: 60607
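The Def/Clobber split described in this commit can be illustrated with a small standalone model. This is a minimal sketch, not memdep's code: `DepKind`, `Inst` and `scanBackwards` are hypothetical names, addresses are compared by symbolic name only (distinct names are assumed never to alias), and every call is assumed to be a potential write.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified result kinds for a local dependence query.
enum class DepKind { Def, Clobber, NonLocal };

// Toy instruction: an opcode, a symbolic address, and a volatile flag.
struct Inst {
  enum Op { Alloca, Load, Store, Call } op;
  std::string ptr;   // symbolic address operand; empty for a call
  bool isVolatile;
};

// Scan backwards from index `from` (exclusive) for the nearest instruction
// that an access to `ptr` depends on.
DepKind scanBackwards(const std::vector<Inst> &block, std::size_t from,
                      const std::string &ptr) {
  for (std::size_t i = from; i-- > 0;) {
    const Inst &I = block[i];
    if (I.isVolatile)
      return DepKind::Clobber;   // volatile accesses are always clobbers
    if (I.op == Inst::Call)
      return DepKind::Clobber;   // assumption: any call may write memory
    if (I.ptr != ptr)
      continue;                  // assumption: distinct names never alias
    return DepKind::Def;         // alloca/load/store of the same address
  }
  return DepKind::NonLocal;      // nothing found in this block
}
```

A client can then treat `Def` as "the value is known here" and `Clobber` as "the value may have changed, give up", which is the distinction item 2 draws.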
|
|
|
|
|
|
| |
See PR3160 for details
llvm-svn: 60604
|
|
|
|
| |
llvm-svn: 60594
|
|
|
|
| |
llvm-svn: 60588
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
llvm-svn: 60408
|
|
|
|
|
|
|
| |
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
llvm-svn: 60393
|
|
|
|
|
|
| |
instead of throughout it.
llvm-svn: 60339
|
|
|
|
|
|
|
| |
that it isn't reallocated all the time. This is a tiny speedup for
GVN: 3.90->3.88s
llvm-svn: 60338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
llvm-svn: 60314
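The difference between "pointers exactly equal" and "aliasing claims must alias" can be sketched with a toy alias query. Everything here is an illustrative assumption: `Ptr`, `alias`, `sameByIdentity` and `sameByAlias` are hypothetical names, and a pointer is modeled as a base object plus a constant offset.

```cpp
#include <cassert>
#include <string>

enum class AliasResult { NoAlias, MayAlias, MustAlias };

// Hypothetical pointer description: an SSA-style name, the object it points
// into, and a constant byte offset within that object.
struct Ptr {
  std::string name;
  std::string base;
  long offset;
};

// Toy alias query. Assumes distinct base objects never overlap.
AliasResult alias(const Ptr &a, const Ptr &b) {
  if (a.base != b.base)
    return AliasResult::NoAlias;
  return a.offset == b.offset ? AliasResult::MustAlias
                              : AliasResult::MayAlias;
}

// Old check: only the very same pointer value counts as the same location.
bool sameByIdentity(const Ptr &a, const Ptr &b) { return a.name == b.name; }

// New check: any pair the alias query proves must-alias counts.
bool sameByAlias(const Ptr &a, const Ptr &b) {
  return alias(a, b) == AliasResult::MustAlias;
}
```

Two differently named pointers that compute the same address fail the identity check but pass the must-alias check, which is how the second form finds more redundant loads.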
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
llvm-svn: 60313
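One way a vector can stand in for a hash map in a cache like this is to keep the entries sorted by key and binary-search them. The sketch below assumes integer block ids and integer results, and the sorting is an assumption of the sketch (the commit message only says "vector"); `SortedCache` and `Entry` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical cache entry: (block id, cached result).
using Entry = std::pair<int, int>;

struct SortedCache {
  std::vector<Entry> entries;   // kept sorted by block id

  static bool lessThanKey(const Entry &e, int block) {
    return e.first < block;
  }

  // Insert a new entry or overwrite an existing one, preserving order.
  void set(int block, int result) {
    auto it = std::lower_bound(entries.begin(), entries.end(), block,
                               lessThanKey);
    if (it != entries.end() && it->first == block)
      it->second = result;
    else
      entries.insert(it, Entry(block, result));
  }

  // Binary-search lookup; returns nullptr when the block has no entry.
  const int *find(int block) const {
    auto it = std::lower_bound(entries.begin(), entries.end(), block,
                               lessThanKey);
    if (it != entries.end() && it->first == block)
      return &it->second;
    return nullptr;
  }
};
```

The entries sit contiguously in memory, so a full scan touches far less memory than walking hash buckets, at the cost of linear-time insertion in the middle.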
|
|
|
|
|
|
| |
This speeds up GVN from 4.0386s to 3.9376s.
llvm-svn: 60310
|
|
|
|
|
|
|
| |
remove some fixme's. This speeds up GVN very slightly on 403.gcc
(4.06->4.03s)
llvm-svn: 60309
|
|
|
|
|
|
|
|
| |
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' dependency instead of "Normal".
This tweaks GVN to do its optimization with the new result.
llvm-svn: 60267
|
|
|
|
|
|
|
| |
a smallvector instead of a DenseMap. This speeds up GVN by 5%
on 403.gcc.
llvm-svn: 60255
|
|
|
|
|
|
|
| |
formulation that is faster and doesn't require nonLazyHelper.
Much less code.
llvm-svn: 60253
|
|
|
|
|
|
|
| |
former does caching, the latter doesn't. This dramatically simplifies
the logic in getDependency and getDependencyFrom.
llvm-svn: 60234
|
|
|
|
|
|
|
|
|
|
|
| |
query. This makes it crystal clear what cases can escape from MemDep that
the clients have to handle. This also gives the clients a nice simplified
interface to it that is easy to poke at.
This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.
llvm-svn: 60231
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the
appropriate pieces and will be useful for internal
implementation improvements later.
I'm not particularly happy with this. After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all. I'll fix this in a subsequent commit.
This has no functionality change.
llvm-svn: 60230
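The idea of a pointer/int pair, as opposed to hand-written bit tricks scattered through client code, can be sketched as a small wrapper that hides the masking behind accessors. `PtrIntPair` is a hypothetical class written for this illustration; it assumes the pointee is at least 4-byte aligned so that two low bits are free.

```cpp
#include <cassert>
#include <cstdint>

// Packs a pointer and a 2-bit tag into one word. Callers must go through
// pointer() and tag(), so they cannot forget to mask off the tag bits.
template <typename T>
class PtrIntPair {
  std::uintptr_t bits_;
  static constexpr std::uintptr_t kMask = 3;

public:
  PtrIntPair(T *p, unsigned tag) : bits_(0) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & kMask) == 0 && "pointer must be at least 4-byte aligned");
    assert(tag <= kMask && "tag must fit in two bits");
    bits_ = raw | tag;
  }

  T *pointer() const { return reinterpret_cast<T *>(bits_ & ~kMask); }
  unsigned tag() const { return static_cast<unsigned>(bits_ & kMask); }
};
```

Because the raw word is private, a client that wants the instruction has to call `pointer()` and one that wants the kind has to call `tag()`, which is the "think a little more about checking the appropriate pieces" effect the message describes.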
|
|
|
|
| |
llvm-svn: 57353
|
|
|
|
|
|
| |
Patch by Samuel Tardieu.
llvm-svn: 57291
|
|
|
|
|
|
|
|
|
|
|
| |
pointer bitcasts and GEP's", and centralize the
logic in Value::getUnderlyingObject. The
difference with stripPointerCasts is that
stripPointerCasts only strips GEPs if all
indices are zero, while getUnderlyingObject
strips GEPs no matter what the indices are.
llvm-svn: 56922
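The distinction between the two stripping routines can be shown on a toy pointer-expression tree. This is a sketch under stated assumptions: `Node`, `stripCasts` and `underlyingObject` are hypothetical stand-ins that model only the behavior described above, not the LLVM implementations.

```cpp
#include <cassert>
#include <vector>

// Toy pointer expression: an object, a bitcast of a pointer, or a GEP.
struct Node {
  enum Kind { Object, Bitcast, GEP } kind;
  const Node *base;           // operand for Bitcast/GEP, nullptr for Object
  std::vector<int> indices;   // GEP indices, empty otherwise
};

static bool allZero(const std::vector<int> &v) {
  for (int i : v)
    if (i != 0)
      return false;
  return true;
}

// Models stripPointerCasts: looks through bitcasts, and through GEPs only
// when every index is zero (so the address is unchanged).
const Node *stripCasts(const Node *n) {
  while (n->kind == Node::Bitcast ||
         (n->kind == Node::GEP && allZero(n->indices)))
    n = n->base;
  return n;
}

// Models getUnderlyingObject: looks through bitcasts and every GEP,
// whatever its indices, to reach the object being addressed.
const Node *underlyingObject(const Node *n) {
  while (n->kind != Node::Object)
    n = n->base;
  return n;
}
```

For a bitcast of a GEP with a nonzero index, `stripCasts` stops at the GEP (a different address), while `underlyingObject` continues to the allocation it indexes into.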
|
|
|
|
| |
llvm-svn: 55779
|
|
|
|
| |
llvm-svn: 55744
|
|
|
|
|
|
|
|
| |
massively complicated CFGs.
This speeds up a particular testcase from 12+ hours to 5 seconds with little perceptible loss of quality.
llvm-svn: 55391
|
|
|
|
|
|
| |
Patch contributed by m-s.
llvm-svn: 55167
|
|
|
|
| |
llvm-svn: 53771
|
|
|
|
|
|
| |
bootstrap passes with this change.
llvm-svn: 53762
|
|
|
|
| |
llvm-svn: 53730
|
|
|
|
| |
llvm-svn: 53705
|
|
|
|
|
|
| |
could cause problems for memdep when it breaks critical edges.
llvm-svn: 53691
|
|
|
|
| |
llvm-svn: 53627
|
|
|
|
| |
llvm-svn: 53616
|
|
|
|
|
|
| |
where possible. This allows local PRE to be more aggressive.
llvm-svn: 53615
|
|
|
|
| |
llvm-svn: 53470
|
|
|
|
|
|
| |
there won't be a value number match. This speeds up GVN on a case where there are very few redundancies by ~25%.
llvm-svn: 53108
|
|
|
|
| |
llvm-svn: 53040
|
|
|
|
|
|
| |
unreachable blocks.
llvm-svn: 53032
|
|
|
|
| |
llvm-svn: 52643
|
|
|
|
|
|
|
|
| |
correct our preserved analyses list, since we
do now change the CFG by splitting critical edges during PRE.
llvm-svn: 52631
|
|
|
|
| |
llvm-svn: 52574
|
|
|
|
| |
llvm-svn: 52531
|
|
|
|
|
|
| |
in a GVN+PRE that is faster than GVN alone was before.
llvm-svn: 52521
|
|
|
|
| |
llvm-svn: 52518
|
|
|
|
|
|
| |
once benchmarking is completed.
llvm-svn: 52506
|
|
|
|
| |
llvm-svn: 52505
|
|
|
|
|
|
| |
This fixes a failure on povray.
llvm-svn: 52499
|
|
|
|
|
|
|
| |
GVN expects that all inputs to an instruction fall somewhere in the value
hierarchy, which isn't true for these.
llvm-svn: 52496
|
|
|
|
| |
llvm-svn: 52472
|
|
|
|
|
|
|
|
| |
increase code size, namely when the instantiated expression
would only need to be created in one predecessor.
llvm-svn: 52471
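The "only one predecessor needs the expression" condition can be read as a simple profitability check. The sketch below is one possible reading, not the pass's actual code: `shouldPRE` is a hypothetical helper, and its input is assumed to say, per predecessor, whether the expression's value is already available there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// availableInPred[i] is true when the expression's value already exists at
// the end of predecessor i. A real pass would derive this from its
// value-numbering tables; here it is simply given.
bool shouldPRE(const std::vector<bool> &availableInPred) {
  std::size_t missing = 0;
  for (bool available : availableInPred)
    if (!available)
      ++missing;
  // Exactly one missing predecessor: one inserted copy pays for one removed
  // instruction, so code size does not grow. Zero missing is a fully
  // redundant case, and a lone predecessor leaves nothing to merge with.
  return missing == 1 && availableInPred.size() > 1;
}
```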
|