| Commit message | Author | Age | Files | Lines |
|
This should unbreak llvm-gcc-i386-linux-selfhost.
llvm-svn: 130927
|
llvm-svn: 130893
|
Most of these tests require a single mov instruction that can come either before
or after a 2-addr instruction. -join-physregs changes the behavior, but the
results are equivalent.
llvm-svn: 130891
|
the default register allocator is changed.
llvm-svn: 130883
|
landing pad as its successor.
SjLj exception handling jumps to the correct landing pad via a switch statement
that's generated right before code-gen. Loosen the constraint in the machine
instruction verifier to allow for this. Note, this isn't the most rigorous check
since we cannot determine where that switch statement came from. But it's
marginally better than turning this check off when SjLj exceptions are used.
<rdar://problem/9187612>
llvm-svn: 130881
|
edge in some cases.
Original message:
Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
llvm-svn: 130877
|
alignment 4 is wrong) and requires hard-float.
llvm-svn: 130875
|
llvm-svn: 130867
|
allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
llvm-svn: 130862
|
llvm-svn: 130859
|
These tests all follow the same pattern:
  mov r2, r0
  movs r0, #0
  $CMP r2, r1
  it eq
  moveq r0, #1
  bx lr
The first 'mov' can be eliminated by rematerializing 'movs r0, #0' below the
test instruction:
  $CMP r0, r1
  mov.w r0, #0
  it eq
  moveq r0, #1
  bx lr
So far, only physreg coalescing can do that. The register allocators won't yet
split live ranges just to eliminate copies. They can learn, but this particular
problem is not likely to show up in real code. It only appears because r0 is
used for both the function argument and return value.
llvm-svn: 130858
|
llvm-svn: 130856
|
llvm-svn: 130855
|
llvm-svn: 130854
|
llvm-svn: 130849
|
instruction that restores the clobbered $gp.
llvm-svn: 130847
|
it is both inefficient and unexpected by dwarfdump. Change to
a DW_FORM_data4.
While in here, change the predicate name to reflect that the position
is not really absolute (it is an offset), just that the linker needs a
relocation.
llvm-svn: 130846
|
but according to my super-optimizer there are only two missed simplifications
of -instsimplify kind when compiling bzip2, and this is one of them. It amuses
me to have bzip2 be perfectly optimized as far as instsimplify goes!
llvm-svn: 130840
|
llvm-svn: 130818
|
The basic allocator is really bad about hinting, so it doesn't eliminate all
copies when physreg joining is disabled.
llvm-svn: 130817
|
llvm-svn: 130816
|
llvm-svn: 130815
|
llvm-svn: 130812
|
llvm-svn: 130800
|
<rdar://problem/8460511>
llvm-svn: 130791
|
max(a,b) >= a -> true. According to my super-optimizer, these are
by far the most common simplifications (of the -instsimplify kind)
that occur in the testsuite and aren't caught by -std-compile-opts.
llvm-svn: 130780
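
The identity quoted above, max(a,b) >= a -> true, can be checked exhaustively over a small integer range. The following is only a sketch in C, not the instsimplify code itself; the helper names `max_int` and `count_max_ge_violations` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: plain integer maximum. */
int max_int(int a, int b) { return a > b ? a : b; }

/* Count pairs (a, b) in the 8-bit signed range for which max(a,b) >= a
   is false. The simplification is valid only if this count is zero. */
int count_max_ge_violations(void) {
    int violations = 0;
    for (int a = INT8_MIN; a <= INT8_MAX; ++a) {
        for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
            if (!(max_int(a, b) >= a))
                ++violations;
        }
    }
    return violations;
}
```

Since the comparison never fails, a compiler that recognizes the max pattern can replace the compare with the constant true.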
|
llvm-svn: 130778
|
llvm-svn: 130769
|
llvm-svn: 130763
|
llvm-svn: 130754
|
string template.
Fixes rdar://8493866
llvm-svn: 130747
|
model constants which can be added to base registers via add-immediate
instructions which don't require an additional register to materialize
the immediate.
llvm-svn: 130743
|
llvm-svn: 130713
|
a vector compare, generate a vector result rather than i1 (and crashing).
llvm-svn: 130706
|
This automagically provides a transform noticed by my super-optimizer
as occurring quite often: "rem x, (select cond, x, 1)" -> 0.
llvm-svn: 130694
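
Why "rem x, (select cond, x, 1)" folds to 0: whichever arm the select picks, the divisor is either x itself or 1, and both x % x and x % 1 are 0. The only other case is x == 0 with cond true, which is a remainder by zero and therefore undefined. A rough C sketch of that reasoning follows, using the signed `%` operator as a stand-in for the IR rem; the function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of "rem x, (select cond, x, 1)".
   The caller must not pass x == 0 together with a true cond. */
int rem_select(int x, int cond) { return x % (cond ? x : 1); }

/* Count defined inputs for which the expression is not 0. */
int count_nonzero_rem_select(void) {
    int nonzero = 0;
    for (int x = INT8_MIN; x <= INT8_MAX; ++x) {
        for (int cond = 0; cond <= 1; ++cond) {
            if (cond && x == 0)
                continue; /* remainder by zero is undefined, skip it */
            if (rem_select(x, cond) != 0)
                ++nonzero;
        }
    }
    return nonzero;
}
```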
|
llvm-svn: 130693
|
llvm-svn: 130691
|
take some time.
llvm-svn: 130690
|
llvm-svn: 130658
|
-fno-dwarf2-cfi-asm. Implement the same behavior.
llvm-svn: 130637
|
less aggressive about disabling cfi on linux :-(
llvm-svn: 130626
|
Currently the output should be almost identical to the one produced by CodeGen
to make the transition easier.
The only two differences I know of are:
* Some files get an extra advance loc of size 0. This will be fixed when
  relaxations are enabled.
* The optimization of declaring an EH symbol as an external variable is not
  implemented. This is a subset of adding the nounwind attribute, so if we really
  want this at -O0 we should probably do it at the IL level.
llvm-svn: 130623
|
urem or constant B.
This obviously helps a lot if the division would be turned into a libcall
(think i64 udiv on i386), but div is also one of the few remaining instructions
on modern CPUs that become more expensive when the bitwidth gets bigger.
This also helps register pressure on i386 when dividing chars: divb needs
two 8-bit parts of a 16-bit register as input, whereas divl uses two registers.
  int foo(unsigned char a) { return a/10; }
  int bar(unsigned char a, unsigned char b) { return a/b; }
compiles into (x86_64)
  _foo:
    imull $205, %edi, %eax
    shrl $11, %eax
    ret
  _bar:
    movzbl %dil, %eax
    divb %sil, %al
    movzbl %al, %eax
    ret
llvm-svn: 130615
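
The _foo output above replaces the 8-bit divide by 10 with a multiply by 205 and a right shift by 11. This works because 205/2048 is slightly above 1/10 and the rounding error stays below one unit for every 8-bit input. A small C sketch that checks this over all 256 values follows; the function names are illustrative and are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Divide an 8-bit value by 10 the way the _foo assembly does:
   multiply by 205, then shift right by 11. */
unsigned div10_by_multiply(uint8_t a) {
    return ((unsigned)a * 205u) >> 11;
}

/* Count 8-bit inputs where multiply-and-shift disagrees with a / 10. */
int count_div10_mismatches(void) {
    int mismatches = 0;
    for (unsigned a = 0; a <= UINT8_MAX; ++a) {
        if (div10_by_multiply((uint8_t)a) != a / 10u)
            ++mismatches;
    }
    return mismatches;
}
```

The constant pair is only valid for the narrow type: the same 205 and 11 would not give exact results for arbitrary 32-bit inputs, which is why narrowing the division first matters.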
|
This folds away silly stuff like (a&255)/1000 -> 0.
llvm-svn: 130614
|
llvm-svn: 130613
|
llvm-svn: 130599
|
llvm-svn: 130567
|
FastEmit_i can fail for non-Thumb2 ARM. Makes ARMSimplifyAddress work correctly, and reduces the number of fast-isel bailouts on non-Thumb ARM.
llvm-svn: 130560
|
ARM/Thumb2 patterns.
llvm-svn: 130552
|
llvm-svn: 130540
|