Commit messages (newest first):

…about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate.  This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of MemCpyOptimizer.
This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet.  Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.
llvm-svn: 120406
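
An illustrative C sketch (not the commit's testcase) of the kind of case the uniform handling covers: a plain store whose bytes are fully overwritten by a later memset is dead, just as it would be if the overwrite were another store.

    #include <string.h>

    void f(char *p) {
        p[0] = 1;            /* dead: every byte written here is rewritten below */
        memset(p, 0, 16);    /* fully overwrites the earlier store               */
    }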

… llvm-svn: 120398

… llvm-svn: 120391

…is trivially dead, since these have side effects.  This makes the
(misnamed) MemoryUseIntrinsic class dead, so remove it.
llvm-svn: 120382

…remove an actively-wrong comment.
llvm-svn: 120378

…It can be seriously improved, but at least now it isn't intertwined
with the other logic.
llvm-svn: 120377

…contains "ref".
Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.
llvm-svn: 120368

…stores, fix and add a testcase.
llvm-svn: 120363

…
1. Don't bother trying to optimize:
     lifetime.end(ptr)
     store(ptr)
   as it is undefined, and therefore shouldn't exist.
2. Move the 'storing a loaded pointer' xform up, simplifying
   the may-aliased store code.
llvm-svn: 120359

… llvm-svn: 120347

… llvm-svn: 120325

…has no other uses, shrinking the load.
llvm-svn: 120323

…by my recent GVN improvement.  Instead of looking through a single layer of
PHI nodes when attempting to sink GEPs, we need to iteratively
look through arbitrary PHI nests.
llvm-svn: 120202

…whether the pointer can be replaced with the global variable it is a copy of.
Fixes PR8680.
llvm-svn: 120126

… llvm-svn: 120051

…two.
E.g. -5 % 5 is 0 with srem and 1 with urem.
Also addresses Frits van Bommel's comments.
llvm-svn: 120049
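
A standalone C check of the difference being described, assuming a 32-bit int so the bit pattern of -5 matches the i32 srem/urem case:

    #include <stdio.h>

    int main(void) {
        int a = -5, b = 5;
        unsigned ua = (unsigned)a, ub = (unsigned)b;   /* -5 becomes 4294967291 for 32-bit int */

        printf("srem: %d\n", a % b);    /* prints 0: signed remainder              */
        printf("urem: %u\n", ua % ub);  /* prints 1: same bits, unsigned remainder */
        return 0;
    }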

…in two places that are really interested in simplified instructions, not
constants.
llvm-svn: 120044

…(which does constant folding and more) is called a few lines
later.
llvm-svn: 120042

…positive.
This allows transforming the rem in "1 << ((int)x % 8);" into an and.
llvm-svn: 120028
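
A small C illustration (function names invented) of why the rem can become an and only when the left operand is known non-negative:

    #include <assert.h>

    unsigned shift_rem(int x) { return 1u << (x % 8); }  /* original form    */
    unsigned shift_and(int x) { return 1u << (x & 7); }  /* transformed form */

    int main(void) {
        for (int x = 0; x < 1024; ++x)       /* non-negative x: the two agree */
            assert(shift_rem(x) == shift_and(x));
        /* For negative x they differ: -3 % 8 == -3 (a negative, hence invalid,
         * shift amount), while -3 & 7 == 5, which is why positivity is needed. */
        return 0;
    }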

…Stylistic improvement suggested by Frits van Bommel.
llvm-svn: 120026

… llvm-svn: 120025

…fairly systematic way in instcombine.  Some of these cases were already dealt
with, in which case I removed the existing code.  The case of Add has a bunch of
funky logic which covers some of this plus a few variants (it considers shifts to be
a form of multiplication), which I didn't touch.  The simplification performed is:
A*B+A*C -> A*(B+C).  The improvement is to do this in cases that were not already
handled [such as A*B-A*C -> A*(B-C), which was reported on the mailing list], and
also to do it more often by not checking for "only one use" if "B+C" simplifies.
llvm-svn: 120024
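
In source terms the simplification is ordinary factoring of a common multiplicand; a minimal sketch (ignoring C's signed-overflow caveats, which do not arise for LLVM's wrapping integer arithmetic):

    int sum_before(int a, int b, int c)  { return a * b + a * c; }  /* two muls, one add */
    int sum_after(int a, int b, int c)   { return a * (b + c); }    /* one add, one mul  */

    int diff_before(int a, int b, int c) { return a * b - a * c; }  /* newly handled A*B-A*C */
    int diff_after(int a, int b, int c)  { return a * (b - c); }    /* becomes A*(B-C)       */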

…on this instcombine xform.  This fixes a miscompilation of 403.gcc.
llvm-svn: 119988

… llvm-svn: 119984

…then replace the index with zero.
llvm-svn: 119974

…InstructionSimplify.
llvm-svn: 119970

…is never used.  Patch by Cameron Zwarich.
llvm-svn: 119963

… llvm-svn: 119948

…method in MemDep instead of inserting an instruction, doing a query,
then removing it.  Neither operation is effectively cached.
llvm-svn: 119930

… llvm-svn: 119927

…
    void a(int x) { if (((1<<x)&8)==0) b(); }
into "x != 3", which occurs over 100 times in 403.gcc but in no
other program in llvm-test.
llvm-svn: 119922
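
The equivalence holds because 1<<x sets only bit x, so the masked value is nonzero exactly when x is 3; a quick exhaustive check over the defined shift range:

    #include <assert.h>

    int main(void) {
        /* 1u << x sets only bit x, so ((1<<x)&8) is nonzero exactly when x == 3. */
        for (unsigned x = 0; x < 32; ++x)
            assert((((1u << x) & 8u) == 0) == (x != 3));
        return 0;
    }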

…allowing the memcpy to be eliminated.
Unfortunately, the requirements on byval arguments without explicit
alignment are really weak and impossible to predict in the
mid-level optimizer, so this doesn't kick in much with current
frontends.  The fix is to change clang to set alignment on all
byval arguments.
llvm-svn: 119916

… llvm-svn: 119908

… llvm-svn: 119865

…nodes
if all the operands of the PHI are equivalent.  This allows CodeGenPrepare to undo
unprofitable PRE transforms.
llvm-svn: 119853

…preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class.  Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form.  Fixes PR8622.
llvm-svn: 119727

…to leader mapping.  Previously,
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed.  This led to really bad performance on tall, narrow CFGs.
We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree.  Because there are typically few duplicates of a given value, this scan tends to be
quite fast.  Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.
This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.
The one downside to this approach is that we can no longer use GVN to perform simple conditional propagation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack.  If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
llvm-svn: 119714
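
The cheap leader selection leans on a standard dominator-tree property: with DFS entry/exit numbers, "A dominates B" is two integer comparisons. A minimal C sketch of that check (not LLVM's code; the struct and field names are invented):

    #include <stdbool.h>

    /* DFS entry/exit numbers assigned by a preorder walk of the dominator tree. */
    struct DomNode {
        unsigned dfs_in;
        unsigned dfs_out;
    };

    /* A dominates B exactly when B's DFS interval is nested inside A's, so picking
     * the right leader from a candidate list is one linear scan with two
     * comparisons per candidate instead of a walk up the dominator tree. */
    bool dominates(const struct DomNode *a, const struct DomNode *b) {
        return a->dfs_in <= b->dfs_in && b->dfs_out <= a->dfs_out;
    }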

…saying "it would be bad", give an example of what is going on.
llvm-svn: 119695

…refusing to optimize two memcpy's like this:
    copy A <- B
    copy C <- A
if it couldn't prove that noalias(B,C).  We can eliminate
the copy by producing a memmove instead of memcpy.
llvm-svn: 119694
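
A source-level sketch of the transform (function names invented; each individual call still obeys memcpy's own non-overlap rule):

    #include <string.h>

    /* Before: the second copy re-reads the bytes that were just copied from b. */
    void pair_before(char *a, const char *b, char *c, size_t n) {
        memcpy(a, b, n);    /* copy A <- B */
        memcpy(c, a, n);    /* copy C <- A */
    }

    /* After: forward the original source.  Nothing proves that b and c do not
     * overlap, so the forwarded copy must be a memmove; once c is filled from b
     * directly, the intermediate copy into a can be deleted if a has no other
     * uses. */
    void pair_after(char *a, const char *b, char *c, size_t n) {
        memcpy(a, b, n);
        memmove(c, b, n);
    }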

…source and dest are known to not overlap.
llvm-svn: 119692

…check:
there is no need to check whether the source and dest of a memcpy are noalias;
behavior is undefined if they are not.
llvm-svn: 119691

… llvm-svn: 119690

…out of processMemCpy into its own function.
llvm-svn: 119687

…if it is passed as a byval argument.  The byval argument will just be a
read, so it is safe to read from the original global instead.  This allows
us to promote away the %agg.tmp alloca in PR8582.
llvm-svn: 119686
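
Roughly the source-level shape of the situation, as an illustrative sketch rather than the PR8582 testcase (all names invented): the frontend materializes a temporary copy of a constant global to pass a struct by value, and since the callee's byval argument is only ever read, the call can use the global directly.

    struct T { int data[32]; };
    extern const struct T G;       /* constant global, initialized elsewhere     */
    void bar(struct T t);          /* receives its argument by value (byval)     */

    void with_temp(void) {
        struct T tmp = G;          /* %agg.tmp-style local copy of the global    */
        bar(tmp);
    }

    void without_temp(void) {
        bar(G);                    /* safe: the byval argument is only read      */
    }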

…to ignore calls that obviously can't modify the alloca
because they are readonly/readnone.
llvm-svn: 119683

…optimization.  If the alloca that is "memcpy'd from constant" also has
a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:

    define void @test2() {
      %B = alloca %T
      %a = bitcast %T* @G to i8*
      %b = bitcast %T* %B to i8*
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
      call void @bar(i8* %b)
      ret void
    }

Previously we would generate:

    define void @test() {
      %B = alloca %T
      %b = bitcast %T* %B to i8*
      %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
      %tmp3 = load i8* %G.0, align 4
      %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
      %G.15 = bitcast [123 x i8]* %G.1 to i8*
      %1 = bitcast [123 x i8]* %G.1 to i984*
      %srcval = load i984* %1, align 1
      %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
      store i8 %tmp3, i8* %B.0, align 4
      %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
      %B.12 = bitcast [123 x i8]* %B.1 to i8*
      %2 = bitcast [123 x i8]* %B.1 to i984*
      store i984 %srcval, i984* %2, align 1
      call void @bar(i8* %b)
      ret void
    }

llvm-svn: 119682

… llvm-svn: 119570

…functions of ScalarEvolution, in preparation for memoization and
other optimizations.
llvm-svn: 119562

…to avoid an unneeded dependence.
llvm-svn: 119557

… llvm-svn: 119538