Commit message

since we are dirty, special case __main. This should fix the infinite-loop
horribleness that happens on linux-alpha when configuring llvm-gcc. It
might also help cygwin, who knows??
llvm-svn: 19729

llvm-svn: 19728

llvm-svn: 19727

operations for 64-bit integers.
llvm-svn: 19724

llvm-svn: 19721

fixes most of the remaining llc-beta failures.
llvm-svn: 19716

llvm-svn: 19715

llvm-svn: 19714

llvm-svn: 19712

llvm-svn: 19711

llvm-svn: 19707

The second folds operations into selects, e.g. (select C, (X+Y), (Y+Z))
-> (Y+(select C, X, Z))
This occurs a few times across SPEC, e.g.:
             select  add/sub
  mesa:          83        0
  povray:         5        2
  gcc:            4        2
  parser:         0       22
  perlbmk:       13       30
  twolf:          0        3
llvm-svn: 19706
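
An illustrative C++ sketch (not taken from the commit) of the pattern this fold targets: the two arms of the select share the operand Y, so only the differing operand needs to be selected, leaving a single add.

    // select C, (X+Y), (Y+Z)  -->  Y + (select C, X, Z)
    int before(bool C, int X, int Y, int Z) {
      return C ? (X + Y) : (Y + Z);   // one add per arm
    }
    int after(bool C, int X, int Y, int Z) {
      return Y + (C ? X : Z);         // one add after the select
    }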

llvm-svn: 19704

llvm-svn: 19703

llvm-svn: 19701

independent of each other.
llvm-svn: 19700

llvm-svn: 19699

llvm-svn: 19698

pressure, not decreases register pressure. Fix problem where we accidentally
swapped the operands of SHLD, which caused fourinarow to fail. This fixes
fourinarow.
llvm-svn: 19697

llvm-svn: 19696

well as all of the other stuff in livevar. This fixes the compiler crash
on fourinarow last night.
llvm-svn: 19695

llvm-svn: 19694

llvm-svn: 19693

typically cost 1 cycle) instead of shld/shrd instructions (which are typically
6 or more cycles). This also saves code space.
For example, instead of emitting:
rotr:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %CL, BYTE PTR [%ESP + 8]
        shrd %EAX, %EAX, %CL
        ret
rotli:
        mov %EAX, DWORD PTR [%ESP + 4]
        shrd %EAX, %EAX, 27
        ret
Emit:
rotr32:
        mov %CL, BYTE PTR [%ESP + 8]
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, %CL
        ret
rotli32:
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, 27
        ret
We also emit byte rotate instructions which do not have a sh[lr]d counterpart
at all.
llvm-svn: 19692
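
For reference, a plausible C++ source shape for the functions above (my reconstruction; the actual test cases are not part of this log). A rotate written as two shifts and an OR is now selected as a single ror/rol instead of shrd/shld:

    unsigned rotr32(unsigned x, unsigned char n) {
      return (x >> n) | (x << (32 - n));   // rotate right by n, assuming 0 < n < 32
    }
    unsigned rotli32(unsigned x) {
      return (x << 5) | (x >> 27);         // rotate left by 5 == rotate right by 27
    }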

llvm-svn: 19690

llvm-svn: 19689

llvm-svn: 19687

This allows us to generate this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shld %EDX, %EDX, 2
        shl %EAX, 2
        ret
instead of this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret
Note the magically transmogrifying immediate.
llvm-svn: 19686

instead of doing it manually.
llvm-svn: 19685

Add default impl of commuteInstruction
Add notes about ugly V9 code.
llvm-svn: 19684

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shrd %EAX, %EDX, 2
        sar %EDX, 2
        ret
instead of this:
test1:
        mov %ECX, DWORD PTR [%ESP + 4]
        shr %ECX, 2
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %EAX, %EDX
        shl %EAX, 30
        or %EAX, %ECX
        sar %EDX, 2
        ret
and long << 2 to this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
***     mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret
instead of this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        shr %ECX, 30
        mov %EDX, DWORD PTR [%ESP + 8]
        shl %EDX, 2
        or %EDX, %ECX
        shl %EAX, 2
        ret
The extra copy (marked ***) can be eliminated when I teach the code generator
that shrd32rri8 is really commutative.
llvm-svn: 19681
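
For context, a hypothetical source shape for the test functions above (32-bit x86, where a 64-bit value lives in the EDX:EAX register pair); the exact originals are not included in this log:

    long long shr2(long long x) { return x >> 2; }   // lowered to shrd + sar
    long long shl2(long long x) { return x << 2; }   // lowered to shld + shl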

select operations or to shifts that are by a constant. This automatically
implements (with no special code) all of the special cases for shift by 32,
shift by < 32 and shift by > 32.
llvm-svn: 19679
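
A C++ sketch (mine, not code from the patch) of what the select-based expansion computes for a variable 64-bit shift-left on a 32-bit target; in-range 32-bit shifts plus one select cover the amt == 0, amt < 32, amt == 32, and amt > 32 cases uniformly:

    #include <cstdint>
    // lo/hi are the two halves of the 64-bit value; amt is assumed < 64.
    void shl64(uint32_t lo, uint32_t hi, uint32_t amt,
               uint32_t &loOut, uint32_t &hiOut) {
      uint32_t n = amt & 31;                                    // in-range shift amount
      uint32_t loShifted = lo << n;
      uint32_t hiShifted = (hi << n) | (n ? lo >> (32 - n) : 0);
      bool big = amt >= 32;                                     // the "select" condition
      loOut = big ? 0 : loShifted;           // amt >= 32: low half becomes zero
      hiOut = big ? loShifted : hiShifted;   // amt >= 32: high half = lo << (amt - 32)
    }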

llvm-svn: 19678

range. Either they are undefined (the default), they mask the shift amount
to the size of the register (X86, Alpha, etc.), or they extend the shift (PPC).
This defaults to undefined, which is conservatively correct.
llvm-svn: 19677
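
For example (my illustration, not code from the patch), the three behaviors for a 32-bit left shift by an out-of-range amount such as 40:

    #include <cstdint>
    // undefined : the result may be anything (the conservative default)
    // masked    : the amount is reduced modulo the register width (X86, Alpha, ...)
    // extended  : the full amount is honored, so the value shifts out to zero (PPC-style)
    uint32_t shl_masked(uint32_t x, uint32_t amt)   { return x << (amt & 31); }
    uint32_t shl_extended(uint32_t x, uint32_t amt) { return amt >= 32 ? 0 : x << amt; }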

llvm-svn: 19675

FP_EXTEND from!
llvm-svn: 19674

llvm-svn: 19673

don't need to even think about F32 in the X86 code anymore.
llvm-svn: 19672

of zero and sign extends.
llvm-svn: 19671

llvm-svn: 19670

do it. This results in better code on X86 for floats (because if strict
precision is not required, we can elide some more expensive double -> float
conversions like the old isel did), and allows other targets to emit
CopyFromRegs that are not legal for arguments.
llvm-svn: 19668

llvm-svn: 19667

llvm-svn: 19661

llvm-svn: 19660

llvm-svn: 19659

* Insert some really pedantic assertions that will notice when we emit the
same loads more than once, exposing bugs. This turns a miscompilation in
bzip2 into a compile-fail. yaay.
llvm-svn: 19658

llvm-svn: 19657

llvm-svn: 19656

match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index
register, and then there is no place to put the Z.
llvm-svn: 19652
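
To make the failure mode concrete (my reconstruction, not text from the commit): an x86 address has one base register, one scaled index, and a displacement, so for (X+Y)+(Z << 1) the matcher must put Z << 1 into the scaled-index slot and the sum X+Y into the base; greedily matching X+Y as base + index uses up both register slots and leaves nowhere for Z. A C++ shape of the pattern:

    long f(long X, long Y, long Z) {
      return (X + Y) + (Z << 1);   // wants [ (X+Y) + Z*2 ]-style address arithmetic
    }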

llvm-svn: 19651