shrd [mem], reg, imm
This fixes the jit-ls failure on 186.crafty.
llvm-svn: 14914
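(For reference, the double-precision shift this instruction performs, sketched as self-contained C++ — an illustration, not code from this commit: shrd shifts the destination right and fills the vacated high bits from the source register.)

    #include <cassert>
    #include <cstdint>

    // Semantics of "shrd dst, src, imm" for 0 < imm < 32: dst moves right
    // by imm bits; the vacated high bits come from the low bits of src.
    uint32_t shrd32(uint32_t dst, uint32_t src, unsigned imm) {
        assert(imm > 0 && imm < 32);
        return (dst >> imm) | (src << (32 - imm));
    }

    int main() {
        assert(shrd32(0x000000FF, 0x12345678, 8) == 0x78000000u);
        return 0;
    }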
llvm-svn: 14913
llvm-svn: 14841
We used to compile this LLVM function:
  int %foo() {
    ret int cast (int** getelementptr (int** null, int 1) to int)
  }
into:
  foo:
    mov %EAX, 0
    lea %EAX, DWORD PTR [%EAX + 4]
    ret
Now we compile it into:
  foo:
    mov %EAX, 4
    ret
This sequence is frequently generated by the MSIL front-end, and soon by the
malloc lowering pass and the Java front-end as well.
-Chris
llvm-svn: 14834
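(The folded pattern is the classic portable way to compute a type's size: take element 1 of a null-based pointer and read off the byte offset. In C terms — an illustration of the idiom, not actual front-end output:)

    #include <cstddef>
    #include <cstdio>

    int main() {
        // getelementptr (int** null, int 1) cast to int is the address of
        // element 1 of an int** starting at null, i.e. sizeof(int*).
        // (Strictly, null-pointer arithmetic is not portable C, but it is
        // exactly what the front-ends emit at the IR level.) The constant
        // folder now reduces it to a literal: 4 on x86-32.
        size_t size_of_ptr = (size_t)((int **)0 + 1);
        printf("%zu\n", size_of_ptr);
        return 0;
    }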
The shared command line options are now in a header that makes sense.
llvm-svn: 14756
llvm-svn: 14747
llvm-svn: 14745
llvm-svn: 14622
llvm-svn: 14564
llvm-svn: 14483
llvm-svn: 14482
If not, this is a bug, and should be fixed.
llvm-svn: 14476
float as a truncation by going through memory. This truncation was being
skipped, which caused 175.vpr to fail after aggressive register promotion.
llvm-svn: 14473
MachineBasicBlock that is not yet attached to a MachineFunction. This change
includes changing the third argument (TargetMachine) of the
MachineInstr::print function to a pointer.
llvm-svn: 14389
llvm-svn: 14299
llvm-svn: 14298
llvm-svn: 14266
Instead of generating:
  mov REG, C
  sub REG, X
generate:
  neg X
  add X, C
which uses one fewer register.
llvm-svn: 14213
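(This works because C - X == (-X) + C, and x86 sub is two-address: computing C - X directly needs C materialized in a register first, while neg/add rewrite X in place with the constant as an immediate. A quick self-check of the identity — an illustration:)

    #include <cassert>
    #include <cstdint>

    int main() {
        const uint32_t C = 42;
        // neg-then-add-immediate computes the same value as mov/sub for
        // every bit pattern (two's complement wraparound included).
        for (uint32_t x = 0; x < 1000; ++x)
            assert(C - x == -x + C);
        return 0;
    }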
the setcc.
llvm-svn: 14212
we do not want to fold the load in cases like this:
X = load
= add A, X
= add B, X
llvm-svn: 14204
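(The guard this implies, stated against the modern LLVM headers — a sketch of the condition, not the original patch:)

    #include "llvm/IR/Instructions.h"

    // Folding a load into its user duplicates the load per user; if X above
    // had been folded into both adds, memory would be read twice. So only
    // fold loads that have exactly one use.
    static bool shouldFoldLoad(const llvm::LoadInst *LI) {
        return LI->hasOneUse();
    }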
llvm-svn: 14201
llvm-svn: 14189
needs to go
llvm-svn: 14185
comparisons. For an 'isunordered' predicate, which looks like this at the
LLVM level:
  %a = call bool %llvm.isnan(double %X)
  %b = call bool %llvm.isnan(double %Y)
  %COM = or bool %a, %b
we used to generate this code:
  fxch %ST(1)
  fucomip %ST(0), %ST(0)
  setp %AL
  fucomip %ST(0), %ST(0)
  setp %AH
  or %AL, %AH
With this patch, we generate this code:
  fucomip %ST(0), %ST(1)
  fstp %ST(0)
  setp %AL
This should make alkis happy. Tested as X86/compare_folding.llx:test1.
llvm-svn: 14148
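(The predicate being pattern-matched is the standard 'unordered' comparison: true iff either operand is NaN, since NaN is the only value that compares unequal to itself. In C++ terms — an illustration:)

    #include <cassert>
    #include <limits>

    static bool is_unordered(double x, double y) {
        // Mirrors "or (isnan %X), (isnan %Y)" at the LLVM level.
        return x != x || y != y;
    }

    int main() {
        double nan = std::numeric_limits<double>::quiet_NaN();
        assert(!is_unordered(1.0, 2.0));
        assert(is_unordered(nan, 2.0));
        return 0;
    }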
llvm-svn: 14146
llvm-svn: 14145
instructions,
we can get rid of the FpUCOM/FpUCOMi pseudo instructions, which makes stuff simpler
and faster.
llvm-svn: 14144
two-arg cases.
llvm-svn: 14143
test/Regression/CodeGen/X86/isnan.llx testcase
llvm-svn: 14141
llvm-svn: 14140
This makes the code much simpler, and the two cases really do belong apart.
Once we do it, it's pretty obvious how flawed the logic was for the A != A
case, so I fixed it (fixing PR369).
This also uses freeStackSlotAfter instead of inserting an fxch and then
popStackAfter'ing in the case where there is a dead result (unlikely, but
possible), producing better code.
llvm-svn: 14139
llvm-svn: 14110
that cast to bool.
llvm-svn: 14096
llvm-svn: 14005
llvm-svn: 13952
llvm-svn: 13696
llvm-svn: 13695
llvm-svn: 13694
MachineBasicBlocks instead.
llvm-svn: 13568
Get rid of separate numbering for LLVM BasicBlocks; use the automatically
generated MachineBasicBlock numbering.
llvm-svn: 13567
LLVM BasicBlock operands.
llvm-svn: 13566
and passing a null pointer into a function.
For this testcase:
  void %test(int** %X) {
    store int* null, int** %X
    call void %test(int** null)
    ret void
  }
we now generate this:
  test:
    sub %ESP, 12
    mov %EAX, DWORD PTR [%ESP + 16]
    mov DWORD PTR [%EAX], 0
    mov DWORD PTR [%ESP], 0
    call test
    add %ESP, 12
    ret
instead of this:
  test:
    sub %ESP, 12
    mov %EAX, DWORD PTR [%ESP + 16]
    mov %ECX, 0
    mov DWORD PTR [%EAX], %ECX
    mov %EAX, 0
    mov DWORD PTR [%ESP], %EAX
    call test
    add %ESP, 12
    ret
llvm-svn: 13558
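(The difference between the two listings is one selection rule: when the stored value is a compile-time constant — here, null — emit a store-immediate instead of materializing the constant in a register first. A toy model of that choice, hypothetical code rather than the actual selector:)

    #include <cstdint>
    #include <cstdio>
    #include <optional>

    // If the value is a known constant, store it as an immediate;
    // otherwise store from the register that holds it.
    static void emitStore(std::optional<int32_t> constant, const char *reg,
                          const char *addr) {
        if (constant)
            printf("mov DWORD PTR [%s], %d\n", addr, *constant);
        else
            printf("mov DWORD PTR [%s], %s\n", addr, reg);
    }

    int main() {
        emitStore(0, nullptr, "%ESP");           // null argument -> immediate
        emitStore(std::nullopt, "%EAX", "%ESP"); // unknown value -> register
        return 0;
    }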
the alloca address into common operations like loads/stores.
In a simple testcase like this (which is just designed to exercise the
alloca A, nothing more):
  int %test(int %X, bool %C) {
    %A = alloca int
    store int %X, int* %A
    store int* %A, int** %G
    br bool %C, label %T, label %F
  T:
    call int %test(int 1, bool false)
    %V = load int* %A
    ret int %V
  F:
    call int %test(int 123, bool true)
    %V2 = load int* %A
    ret int %V2
  }
We now generate:
  test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov %CL, BYTE PTR [%ESP + 20]
    *** mov DWORD PTR [%ESP + 8], %EAX
        mov %EAX, OFFSET G
        lea %EDX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
        je .LBB2 # PC rel: F
  .LBB1: # T
        mov DWORD PTR [%ESP], 1
        mov DWORD PTR [%ESP + 4], 0
        call test
    *** mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret
  .LBB2: # F
        mov DWORD PTR [%ESP], 123
        mov DWORD PTR [%ESP + 4], 1
        call test
    *** mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret
Instead of:
  test:
        sub %ESP, 20
        mov %EAX, DWORD PTR [%ESP + 24]
        mov %CL, BYTE PTR [%ESP + 28]
    *** lea %EDX, DWORD PTR [%ESP + 16]
    *** mov DWORD PTR [%EDX], %EAX
        mov %EAX, OFFSET G
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
    *** mov DWORD PTR [%ESP + 12], %EDX
        je .LBB2 # PC rel: F
  .LBB1: # T
        mov DWORD PTR [%ESP], 1
        mov %EAX, 0
        mov DWORD PTR [%ESP + 4], %EAX
        call test
    *** mov %EAX, DWORD PTR [%ESP + 12]
    *** mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret
  .LBB2: # F
        mov DWORD PTR [%ESP], 123
        mov %EAX, 1
        mov DWORD PTR [%ESP + 4], %EAX
        call test
    *** mov %EAX, DWORD PTR [%ESP + 12]
    *** mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret
llvm-svn: 13557
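(The starred lines show the effect: "lea reg, [ESP+off]" followed by a store through reg collapses into a single store with the frame slot folded into the addressing mode. A toy rendering of the before/after — hypothetical, for illustration only:)

    #include <cstdio>

    static void storeToSlot(int off, const char *val, bool foldAddress) {
        if (foldAddress) {
            // after: the frame slot is folded into the memory operand
            printf("mov DWORD PTR [%%ESP + %d], %s\n", off, val);
        } else {
            // before: materialize the slot address in a register first
            printf("lea %%EDX, DWORD PTR [%%ESP + %d]\n", off);
            printf("mov DWORD PTR [%%EDX], %s\n", val);
        }
    }

    int main() {
        storeToSlot(8, "%EAX", false);
        storeToSlot(8, "%EAX", true);
        return 0;
    }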
sized allocas in the entry block). Instead of generating code like this:
  entry:
    reg1024 = ESP+1234
  ... (much later)
    *reg1024 = 17
generate code that looks like this:
  entry:
    (no code generated)
  ... (much later)
    t = ESP+1234
    *t = 17
The advantage is that we DRAMATICALLY reduce the register pressure for these
silly temporaries (they were all being spilled to the stack, resulting in very
silly code). This is actually a manual implementation of rematerialization :)
I have a patch to fold the alloca address computation into loads & stores,
which will make this much better still, but just getting this right took way
too much time and I'm sleepy.
llvm-svn: 13554
We now generate:
  mov DWORD PTR [%ESP + 4], 1
instead of:
  mov %EAX, 1
  mov DWORD PTR [%ESP + 4], %EAX
llvm-svn: 13494
compiling things like 'add long %X, 1'. The problem is that we were switching
the order of the operands for longs even though we can't fold them yet.
llvm-svn: 13451
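(For context: on a 32-bit target a 'long' add is lowered to an add/adc pair over the two halves, which is why the immediate cannot simply be folded by swapping operands yet. The split, modeled in C++ — an illustration:)

    #include <cassert>
    #include <cstdint>

    int main() {
        uint64_t x = 0x00000001FFFFFFFFull;       // add long %X, 1
        uint32_t lo = (uint32_t)x, hi = (uint32_t)(x >> 32);
        uint32_t sumLo = lo + 1;                  // add  lo, 1
        uint32_t carry = sumLo < lo;              // CF from the low half
        uint32_t sumHi = hi + 0 + carry;          // adc  hi, 0
        assert((((uint64_t)sumHi << 32) | sumLo) == x + 1);
        return 0;
    }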
llvm-svn: 13440
| |
extension required)
llvm-svn: 13439
This allows us to compile:
  store float 10.0, float* %P
into:
  mov DWORD PTR [%EAX], 1092616192
instead of:
  .CPItest_0:              # float 0x4024000000000000
          .long 1092616192 # float 10
  ...
          fld DWORD PTR [.CPItest_0]
          fstp DWORD PTR [%EAX]
llvm-svn: 13409
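(The magic number is just the IEEE-754 single-precision bit pattern of 10.0; checking it — an illustration:)

    #include <cassert>
    #include <cstdint>
    #include <cstring>

    int main() {
        float f = 10.0f;
        uint32_t bits;
        std::memcpy(&bits, &f, sizeof bits); // reinterpret the float's bytes
        assert(bits == 1092616192u);         // 0x41200000
        return 0;
    }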
against zero. In particular, don't emit:
  mov %ESI, 0
  cmp %ECX, %ESI
Instead, emit:
  test %ECX, %ECX
llvm-svn: 13407
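(This works because x & x == x, so test sets ZF and SF exactly as cmp-with-zero does from x - 0, and both leave CF = OF = 0 — with no scratch register needed. A quick check of the flag-relevant values — an illustration:)

    #include <cassert>
    #include <cstdint>

    int main() {
        for (int64_t v = -3; v <= 3; ++v) {
            uint32_t x = (uint32_t)v;
            // ZF: result == 0; SF: sign bit of the result.
            assert(((x & x) == 0) == ((x - 0) == 0));
            assert(((x & x) >> 31) == ((x - 0) >> 31));
        }
        return 0;
    }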