| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
a direct callee may have indirect callees and so may have changed.
llvm-svn: 13649
|
| |
|
|
| |
llvm-svn: 13643
|
| |
|
|
| |
llvm-svn: 13618
|
| |
|
|
|
|
| |
fix the really bad code we're getting on PPC.
llvm-svn: 13609
|
| |
|
|
|
|
|
| |
a full 64-bit address, it must be adjusted to fit in the branch instruction's
immediate field. (This is only used in the reoptimizer, for now.)
llvm-svn: 13608
|
| |
|
|
|
|
| |
Fix a typo in a debug message.
llvm-svn: 13607
|
| |
|
|
|
|
| |
effects as) CloneFunctionInto().
llvm-svn: 13601
|
| |
|
|
|
|
|
|
| |
CloneTrace, and because it is primarily an operation on ValueMaps. It
is now a global (non-static) function which can be pulled in using
ValueMapper.h.
llvm-svn: 13600
|
| |
|
|
|
|
| |
correct error message.
llvm-svn: 13590
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add better comments, including a better head-of-file comment.
Prune #includes.
Fix a FIXME that Chris put here by using doInitialization().
Use DEBUG() to print out debug msgs.
Give names to basic blocks inserted by this pass.
Expand tabs.
Use InsertProfilingInitCall() from ProfilingUtils to insert the initialize call.
llvm-svn: 13581
|
| |
|
|
|
|
| |
MachineBasicBlocks instead.
llvm-svn: 13568
|
| |
|
|
|
|
|
| |
Get rid of separate numbering for LLVM BasicBlocks; use the automatically
generated MachineBasicBlock numbering.
llvm-svn: 13567
|
| |
|
|
|
|
| |
LLVM BasicBlock operands.
llvm-svn: 13566
|
| |
|
|
| |
llvm-svn: 13565
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the size calculation.
This is not something you want to see:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING!
The problem was that 2*2147483648 == 0.
Now we get:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100
Thanks to some anonymous person playing with the demo page that repeatedly
caused zion to go into swapping land. That's one way to ensure you'll get
a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
llvm-svn: 13564
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and passing a null pointer into a function.
For this testcase:
void %test(int** %X) {
store int* null, int** %X
call void %test(int** null)
ret void
}
we now generate this:
test:
sub %ESP, 12
mov %EAX, DWORD PTR [%ESP + 16]
mov DWORD PTR [%EAX], 0
mov DWORD PTR [%ESP], 0
call test
add %ESP, 12
ret
instead of this:
test:
sub %ESP, 12
mov %EAX, DWORD PTR [%ESP + 16]
mov %ECX, 0
mov DWORD PTR [%EAX], %ECX
mov %EAX, 0
mov DWORD PTR [%ESP], %EAX
call test
add %ESP, 12
ret
llvm-svn: 13558
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the alloca address into common operations like loads/stores.
In a simple testcase like this (which is just designed to excersize the
alloca A, nothing more):
int %test(int %X, bool %C) {
%A = alloca int
store int %X, int* %A
store int* %A, int** %G
br bool %C, label %T, label %F
T:
call int %test(int 1, bool false)
%V = load int* %A
ret int %V
F:
call int %test(int 123, bool true)
%V2 = load int* %A
ret int %V2
}
We now generate:
test:
sub %ESP, 12
mov %EAX, DWORD PTR [%ESP + 16]
mov %CL, BYTE PTR [%ESP + 20]
*** mov DWORD PTR [%ESP + 8], %EAX
mov %EAX, OFFSET G
lea %EDX, DWORD PTR [%ESP + 8]
mov DWORD PTR [%EAX], %EDX
test %CL, %CL
je .LBB2 # PC rel: F
.LBB1: # T
mov DWORD PTR [%ESP], 1
mov DWORD PTR [%ESP + 4], 0
call test
*** mov %EAX, DWORD PTR [%ESP + 8]
add %ESP, 12
ret
.LBB2: # F
mov DWORD PTR [%ESP], 123
mov DWORD PTR [%ESP + 4], 1
call test
*** mov %EAX, DWORD PTR [%ESP + 8]
add %ESP, 12
ret
Instead of:
test:
sub %ESP, 20
mov %EAX, DWORD PTR [%ESP + 24]
mov %CL, BYTE PTR [%ESP + 28]
*** lea %EDX, DWORD PTR [%ESP + 16]
*** mov DWORD PTR [%EDX], %EAX
mov %EAX, OFFSET G
mov DWORD PTR [%EAX], %EDX
test %CL, %CL
*** mov DWORD PTR [%ESP + 12], %EDX
je .LBB2 # PC rel: F
.LBB1: # T
mov DWORD PTR [%ESP], 1
mov %EAX, 0
mov DWORD PTR [%ESP + 4], %EAX
call test
*** mov %EAX, DWORD PTR [%ESP + 12]
*** mov %EAX, DWORD PTR [%EAX]
add %ESP, 20
ret
.LBB2: # F
mov DWORD PTR [%ESP], 123
mov %EAX, 1
mov DWORD PTR [%ESP + 4], %EAX
call test
*** mov %EAX, DWORD PTR [%ESP + 12]
*** mov %EAX, DWORD PTR [%EAX]
add %ESP, 20
ret
llvm-svn: 13557
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sized allocas in the entry block). Instead of generating code like this:
entry:
reg1024 = ESP+1234
... (much later)
*reg1024 = 17
Generate code that looks like this:
entry:
(no code generated)
... (much later)
t = ESP+1234
*t = 17
The advantage being that we DRAMATICALLY reduce the register pressure for these
silly temporaries (they were all being spilled to the stack, resulting in very
silly code). This is actually a manual implementation of rematerialization :)
I have a patch to fold the alloca address computation into loads & stores, which
will make this much better still, but just getting this right took way too much time
and I'm sleepy.
llvm-svn: 13554
|
| |
|
|
|
|
|
|
| |
broke obsequi and a lot of other things. It all boiled down to MBB being
overloaded in an inner scope and me confusing it with the one in the outer
scope. Ugh!
llvm-svn: 13517
|
| |
|
|
| |
llvm-svn: 13515
|
| |
|
|
|
|
|
|
| |
MBBs start out as #-1. When a MBB is added to a MachineFunction, it
gets the next available unique MBB number. If it is removed from a
MachineFunction, it goes back to being #-1.
llvm-svn: 13514
|
| |
|
|
|
|
|
|
|
|
|
| |
mov DWORD PTR [%ESP + 4], 1
instead of:
mov %EAX, 1
mov DWORD PTR [%ESP + 4], %EAX
llvm-svn: 13494
|
| |
|
|
|
|
| |
give the extracted function a more useful name than just foo_code.
llvm-svn: 13493
|
| |
|
|
|
|
| |
instruction in them.
llvm-svn: 13490
|
| |
|
|
|
|
|
|
|
|
| |
PHI node entries from multiple outside-the-region blocks. This also fixes
extraction of the entry block in a function. Yaay.
This has successfully block extracted all (but one) block from the score_move
function in obsequi (out of 33). Hrm, I wonder which block the bug is in. :)
llvm-svn: 13489
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add a stub for the severSplitPHINodes which will allow us to bbextract
bb's with PHI nodes in them soon.
* Remove unused arguments from findInputsOutputs
* Dramatically simplify the code in findInputsOutputs. In particular,
nothing really cares whether or not a PHI node is using something.
* Move moveCodeToFunction to after emitCallAndSwitchStatement as that's the
order they get called.
* Fix a bug where we would code extract a region that included a call to
vastart. Like 'alloca', calls to vastart must stay in the function that
they are defined in.
* Add some comments.
llvm-svn: 13482
|
| |
|
|
|
|
|
|
|
|
|
| |
from the extracted region. If the return has 0 or 1 exit blocks, the new
function returns void. If it has 2 exits, it returns bool, otherwise it
returns a ushort as before.
This allows us to use a conditional branch instruction when there are two
exit blocks, as often happens during block extraction.
llvm-svn: 13481
|
| |
|
|
|
|
|
|
|
|
|
|
| |
1. Get rid of the silly abort block. When doing bb extraction, we get one
abort block for every block extracted, which is kinda annoying.
2. If the switch ends up having a single destination, turn it into an
unconditional branch.
I would like to add support for conditional branches, but to do this we will
want to have the function return a bool instead of a ushort.
llvm-svn: 13478
|
| |
|
|
|
|
| |
phi-elimination from 0.6 to 0.54s on kc++.
llvm-svn: 13454
|
| |
|
|
|
|
|
| |
in the basic block being processed. This fixes PhiElimination on kimwitu++
from taking 105s to taking a much more reasonable 0.6s (in a debug build).
llvm-svn: 13453
|
| |
|
|
|
|
|
|
|
| |
than before. Because this is the case, we can compute the first non-phi
instruction once when de-phi'ing a block. This shaves ~4s off of
phielimination of _Z7yyparsev in kimwitu++ from 109s -> 105s. There are
still much more important gains to come.
llvm-svn: 13452
|
| |
|
|
|
|
|
| |
compiling things like 'add long %X, 1'. The problem is that we were switching
the order of the operands for longs even though we can't fold them yet.
llvm-svn: 13451
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
when we see a read of a register. This is important in cases like:
AL = ...
AH = ...
= AX
The read of AX must make both the AL and AH defs live until the use.
llvm-svn: 13444
|
| |
|
|
| |
llvm-svn: 13440
|
| |
|
|
|
|
| |
extension required)
llvm-svn: 13439
|
| |
|
|
|
|
| |
them. This should *dramatically* improve the performance of CBE compiled code on targets that depend on GCC's loop optimizations (like PPC)
llvm-svn: 13438
|
| |
|
|
| |
llvm-svn: 13437
|
| |
|
|
| |
llvm-svn: 13436
|
| |
|
|
|
|
|
| |
FindUsedTypes manipulation stuff out to be a seperate pass, and make the
main CWriter be a function pass now!
llvm-svn: 13435
|
| |
|
|
| |
llvm-svn: 13433
|
| |
|
|
| |
llvm-svn: 13432
|
| |
|
|
|
|
| |
be a conditional branch or switch.
llvm-svn: 13430
|
| |
|
|
| |
llvm-svn: 13429
|
| |
|
|
| |
llvm-svn: 13425
|
| |
|
|
| |
llvm-svn: 13424
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flesh out the SetCC support... which currently ends in a little bit
of unfinished code (which is probably completely hilarious) for
generating the condition value splitting the basic block up into 4
blocks, like this (clearly a better API is needed for this!):
BB
cond. branch
/ / R1=1 R2=0
\ /
\ /
R=phi(R1,R2)
Other minor edits.
llvm-svn: 13423
|
| |
|
|
| |
llvm-svn: 13422
|
| |
|
|
| |
llvm-svn: 13421
|
| |
|
|
| |
llvm-svn: 13420
|
| |
|
|
| |
llvm-svn: 13419
|