Commit message | Author | Age | Files | Lines

llvm-svn: 13312

Turning "if (A < B && B < C)" into "if (A < B & B < C)"
llvm-svn: 13311

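A minimal C illustration of the transformation above (the function names are mine, not from the commit): when both comparisons are free of side effects, the short-circuiting `&&` can be replaced by a bitwise `&` on the two comparison results, turning a branch into straight-line code.

```c
#include <assert.h>

/* Illustration only -- function names are mine, not LLVM's.  When both
 * comparisons are side-effect free, "&&" and "&" compute the same
 * boolean, so the branch that skips the second test can be removed. */
int in_range_branching(int a, int b, int c) {
    return a < b && b < c;      /* may branch around the second test */
}

int in_range_flattened(int a, int b, int c) {
    return (a < b) & (b < c);   /* both tests always execute; no branch */
}
```

The two forms differ only when the second operand has side effects or traps, which is why the combiner must prove the comparisons are safe to evaluate unconditionally.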
missing opportunities for combination.
llvm-svn: 13309

when replacing them, missing the opportunity to do simplifications
llvm-svn: 13308

llvm-svn: 13307

llvm-svn: 13306

is only used by a cast, and the casted type is the same size as the original
allocation, it would eliminate the cast by folding it into the allocation.

Unfortunately, it was placing the new allocation instruction right before
the cast, which could pull (for example) alloca instructions into the body
of a function.  This turns statically allocatable allocas into expensive
dynamically allocated allocas, which is bad bad bad.

This fixes the problem by placing the new allocation instruction at the same
place the old one was, duh. :)
llvm-svn: 13289

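A sketch in C of why the placement matters (my example, not the patched pass): a fixed-size local declared at function entry becomes one slot in the stack frame, while a variable-length array inside the loop body is a dynamic stack allocation that executes on every iteration, which is the kind of expensive alloca the fix avoids creating.

```c
#include <assert.h>
#include <string.h>

/* Illustration only, not the optimizer's code.  "total" corresponds to
 * a static alloca in the entry block: one fixed frame slot.  The VLA
 * "copy" corresponds to a dynamic alloca inside the loop: it grabs
 * fresh stack space on every trip through the loop. */
int sum_lengths(const char *words[], int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        char copy[strlen(words[i]) + 1];   /* fresh stack space each trip */
        strcpy(copy, words[i]);
        total += (int)strlen(copy);
    }
    return total;
}
```

Sinking the allocation into the loop changes its cost model entirely, which is why the fix keeps the replacement instruction at the original allocation's position.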
patch was graciously contributed by Vladimir Prus.
llvm-svn: 13185

llvm-svn: 13172

* Commandline option (for now) controls that flag that is passed in
llvm-svn: 13141

still room for cleanup, but at least the code modification is out of the
analysis now.
llvm-svn: 13135

the function instead of isolating it. This also means the condition is reversed.
llvm-svn: 13112

the Module. The default behavior keeps functionality as before: the chosen
function is the one that remains.
llvm-svn: 13111

llvm-svn: 13108

llvm-svn: 13106

loop.  This eliminates the extra add from the previous case, but it's
not clear that this will be a performance win overall.  Tomorrow's test
results will tell. :)
llvm-svn: 13103

over its USES.  If it's dead it doesn't have any uses!  :)

Thanks to the fabulous and mysterious Bill Wendling for pointing this out.  :)
llvm-svn: 13102

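The point can be made with a toy use-count model (a hypothetical struct, not LLVM's Value class): "dead" means the value has no uses, so a routine that tries to clean up after a dead instruction by walking its uses visits nothing; the useful walk is over its operands, which may themselves become dead once it is erased.

```c
#include <assert.h>

/* Toy model, not LLVM's types.  A value is dead exactly when nothing
 * reads it, so iterating over the uses of a dead value runs zero
 * iterations. */
struct toy_value {
    int num_uses;       /* instructions that read this value */
    int num_operands;   /* values this instruction reads */
};

int is_dead(const struct toy_value *v) {
    return v->num_uses == 0;
}

int uses_visited(const struct toy_value *v) {
    int visited = 0;
    for (int i = 0; i < v->num_uses; i++)
        visited++;                  /* zero iterations for a dead value */
    return visited;
}
```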
types in them.  Instead of creating an induction variable for all types, it
creates a single induction variable and casts to the other sizes.  This generates
this code:

no_exit:                ; preds = %entry, %no_exit
        %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]            ; <uint> [#uses=4]
***     %j.0.0 = cast uint %indvar to short             ; <short> [#uses=1]
        %indvar = cast uint %indvar to int              ; <int> [#uses=1]
        %tmp.7 = getelementptr short* %P, uint %indvar          ; <short*> [#uses=1]
        store short %j.0.0, short* %tmp.7
        %inc.0 = add int %indvar, 1             ; <int> [#uses=2]
        %tmp.2 = setlt int %inc.0, %N           ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2, label %no_exit, label %loopexit

instead of:

no_exit:                ; preds = %entry, %no_exit
        %indvar = phi ushort [ %indvar.next, %no_exit ], [ 0, %entry ]          ; <ushort> [#uses=2]
***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]            ; <uint> [#uses=3]
        %indvar = cast uint %indvar to int              ; <int> [#uses=1]
        %indvar = cast ushort %indvar to short          ; <short> [#uses=1]
        %tmp.7 = getelementptr short* %P, uint %indvar          ; <short*> [#uses=1]
        store short %indvar, short* %tmp.7
        %inc.0 = add int %indvar, 1             ; <int> [#uses=2]
        %tmp.2 = setlt int %inc.0, %N           ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1
***     %indvar.next = add ushort %indvar, 1
        br bool %tmp.2, label %no_exit, label %loopexit

This is an improvement in register pressure, but probably doesn't happen that
often.

The more important fix will be to get rid of the redundant add.
llvm-svn: 13101

ilists :)

Eventually it would be nice if CallGraph maintained an ilist of CallGraphNodes instead
of a vector of pointers to them, but today is not that day.
llvm-svn: 13100

llvm-svn: 13091

llvm-svn: 13089

is done, which avoids invalidating iterators in the SCC traversal routines.
llvm-svn: 13088

llvm-svn: 13081

llvm-svn: 13080

but it's a start, and seems to do its basic job.
llvm-svn: 13068

llvm-svn: 13057

llvm-svn: 13051

llvm-svn: 13048

on demand.
llvm-svn: 13046

structure to being dynamically computed on demand.  This makes updating
loop information MUCH easier.
llvm-svn: 13045

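The compute-on-demand idea can be sketched with a toy CFG (my data layout, nothing like LLVM's): rather than storing the loop's exit-block list and patching it after every transformation, recompute it from loop membership each time it is needed. An exit block here is a block outside the loop that is a successor of a block inside the loop.

```c
#include <assert.h>

enum { NBLOCKS = 4 };

/* Toy sketch: recompute the exit-block count from current loop
 * membership on every call.  Nothing has to be patched when the CFG
 * changes -- the next query just sees the new graph. */
int count_exit_blocks(int in_loop[NBLOCKS], int succ[NBLOCKS][NBLOCKS]) {
    int seen[NBLOCKS] = {0};
    int count = 0;
    for (int b = 0; b < NBLOCKS; b++) {
        if (!in_loop[b])
            continue;                       /* only scan loop members */
        for (int s = 0; s < NBLOCKS; s++)
            if (succ[b][s] && !in_loop[s] && !seen[s]) {
                seen[s] = 1;                /* count each exit block once */
                count++;
            }
    }
    return count;
}
```

The trade is recomputation cost per query against the bookkeeping burden of keeping a stored list consistent, and for loop transforms the latter is what made updating painful.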
llvm-svn: 13040

that the exit block of the loop becomes the new entry block of the function.
This was causing a verifier assertion on 252.eon.
llvm-svn: 13039

using instructions inside of the loop.  This should fix the MishaTest failure
from last night.
llvm-svn: 13038

block.  The primary motivation for doing this is that we can now unroll nested loops.

This makes a pretty big difference in some cases.  For example, in 183.equake,
we are now beating the native compiler with the CBE, and we are a lot closer
with LLC.

I'm now going to play around a bit with the unroll factor and see what effect
it really has.
llvm-svn: 13034

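The nested-loop payoff can be seen in a hand-unrolled C analogue (my example, not from the commit): once a constant-trip-count inner loop is fully unrolled, the outer loop body collapses to a single basic block, so the unroller can then process the outer loop as well.

```c
#include <assert.h>

/* Hand-unrolled analogue.  The original inner loop was
 *   for (int j = 0; j < 3; j++) s += a[i][j];
 * After full unrolling the outer body is straight-line code, making the
 * outer loop itself a single-basic-block unrolling candidate. */
int sum_rows(int a[][3], int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i][0];
        s += a[i][1];
        s += a[i][2];
    }
    return s;
}
```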
While we're at it, add support for updating loop information correctly.
llvm-svn: 13033

limited.  Even in its extremely simple state (it can only *fully* unroll single
basic block loops that execute a constant number of times), it already helps improve
performance a LOT on some benchmarks, particularly with the native code generators.
llvm-svn: 13028

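A hand-written analogue (my example) of the one case this first version handles: a single-basic-block loop with a constant trip count, replicated body by body with the loop control deleted.

```c
#include <assert.h>

/* Before: a single-basic-block loop with a constant trip count. */
int sum4(const int a[4]) {
    int s = 0;
    for (int i = 0; i < 4; i++)    /* constant trip count: 4 */
        s += a[i];
    return s;
}

/* After full unrolling: no induction variable, compare, or branch. */
int sum4_unrolled(const int a[4]) {
    return a[0] + a[1] + a[2] + a[3];
}
```

Removing the loop control entirely is what makes even this limited form a big win for the native code generators: the body becomes branch-free straight-line code.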
of hardcoded
llvm-svn: 13025

exit values.
llvm-svn: 13018

(familiar) function:

int _strlen(const char *str) {
    int len = 0;
    while (*str++) len++;
    return len;
}

And transforming it to use a ulong induction variable, because the type of
the pointer index was left as a constant long.  This is obviously very bad.

The fix is to shrink long constants in getelementptr instructions to intptr_t,
making the indvars pass insert a uint induction variable, which is much more
efficient.

Here's the before code for this function:

int %_strlen(sbyte* %str) {
entry:
        %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
        %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
        br bool %tmp.24, label %loopexit, label %no_exit
no_exit:                ; preds = %entry, %no_exit
***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]            ; <uint> [#uses=2]
***     %indvar = phi ulong [ %indvar.next, %no_exit ], [ 0, %entry ]           ; <ulong> [#uses=2]
        %indvar1 = cast ulong %indvar to uint           ; <uint> [#uses=1]
        %inc.02.sum = add uint %indvar1, 1              ; <uint> [#uses=1]
        %inc.0.0 = getelementptr sbyte* %str, uint %inc.02.sum          ; <sbyte*> [#uses=1]
        %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
        %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
        %indvar.next = add ulong %indvar, 1             ; <ulong> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2, label %loopexit.loopexit, label %no_exit
loopexit.loopexit:              ; preds = %no_exit
        %indvar = cast uint %indvar to int              ; <int> [#uses=1]
        %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
        ret int %inc.1
loopexit:               ; preds = %entry
        ret int 0
}

Here's the after code:

int %_strlen(sbyte* %str) {
entry:
        %inc.02 = getelementptr sbyte* %str, uint 1             ; <sbyte*> [#uses=1]
        %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
        %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
        br bool %tmp.24, label %loopexit, label %no_exit
no_exit:                ; preds = %entry, %no_exit
***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]            ; <uint> [#uses=3]
        %indvar = cast uint %indvar to int              ; <int> [#uses=1]
        %inc.0.0 = getelementptr sbyte* %inc.02, uint %indvar           ; <sbyte*> [#uses=1]
        %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
        %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
        %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2, label %loopexit, label %no_exit
loopexit:               ; preds = %entry, %no_exit
        %len.0.1 = phi int [ 0, %entry ], [ %inc.1, %no_exit ]          ; <int> [#uses=1]
        ret int %len.0.1
}
llvm-svn: 13016

the trip count for the loop, insert one so that we can canonicalize the exit
condition.
llvm-svn: 13015

llvm-svn: 13011

make the verifier more strict.  This fixes building zlib.
llvm-svn: 13002

Debian.)
llvm-svn: 12986

llvm-svn: 12980

Basically we were using SimplifyCFG as a huge sledgehammer for a simple
optimization.  Because simplifycfg does so many things, we can't use it
for this purpose.
llvm-svn: 12977

the back-edge block, we must check the preincremented value.
llvm-svn: 12968

Instead of producing code like this:

Loop:
  X = phi 0, X2
  ...
  X2 = X + 1
  if (X != N-1) goto Loop

We now generate code that looks like this:

Loop:
  X = phi 0, X2
  ...
  X2 = X + 1
  if (X2 != N) goto Loop

This has two big advantages:
  1. The trip count of the loop is now explicit in the code, allowing
     the direct implementation of Loop::getTripCount()
  2. This reduces register pressure in the loop, and allows X and X2 to be
     put into the same register.

As a consequence of the second point, the code we generate for loops went
from:

.LBB2:  # no_exit.1
	...
        mov %EDI, %ESI
        inc %EDI
        cmp %ESI, 2
        mov %ESI, %EDI
        jne .LBB2 # PC rel: no_exit.1

To:

.LBB2:  # no_exit.1
	...
        inc %ESI
        cmp %ESI, 3
        jne .LBB2 # PC rel: no_exit.1

... which has two fewer moves, and uses one less register.
llvm-svn: 12961

llvm-svn: 12940

test/Regression/Transforms/SCCP/calltest.ll
llvm-svn: 12921

llvm-svn: 12919