| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
llvm-svn: 66875
|
|
|
|
| |
llvm-svn: 66870
|
|
|
|
| |
llvm-svn: 66867
|
|
|
|
| |
llvm-svn: 66866
|
|
|
|
| |
llvm-svn: 66843
|
|
|
|
| |
llvm-svn: 66780
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
llvm-svn: 66779
|
|
|
|
|
|
| |
one bits information for values that are live out of basic blocks. The goal is to eliminate unnecessary sext, zext, truncate of values that are live-in to blocks. This does not handle PHI nodes yet.
llvm-svn: 66777
|
|
|
|
| |
llvm-svn: 66733
|
|
|
|
|
|
| |
No (intended) functionality change.
llvm-svn: 66720
|
|
|
|
|
|
| |
optimization too late and left the live intervals to be out of sync with instructions. This fixes 8b10b.
llvm-svn: 66715
|
|
|
|
|
|
| |
linkage, so remove it.
llvm-svn: 66690
|
|
|
|
| |
llvm-svn: 66653
|
|
|
|
|
|
|
|
|
| |
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
llvm-svn: 66641
|
|
|
|
| |
llvm-svn: 66611
|
|
|
|
|
|
|
| |
1. Use the same value# to represent unknown values being merged into sub-registers.
2. When coalescer commute an instruction and the destination is a physical register, update its sub-registers by merging in the extended ranges.
llvm-svn: 66610
|
|
|
|
| |
llvm-svn: 66607
|
|
|
|
|
|
| |
need to alloc/dealloc.
llvm-svn: 66591
|
|
|
|
| |
llvm-svn: 66589
|
|
|
|
| |
llvm-svn: 66586
|
|
|
|
|
|
| |
- Remove unused method.
llvm-svn: 66585
|
|
|
|
|
|
|
|
|
|
|
| |
the untimed version of getOrCreateSourceID. getOrCreateSourceID calls
GetOrCreateSourceID, of course.
- Move some methods into the "private" section. Constify at least one method.
- General clean-ups.
llvm-svn: 66582
|
|
|
|
|
|
| |
writing individually.
llvm-svn: 66577
|
|
|
|
|
|
| |
/ Darwin.
llvm-svn: 66574
|
|
|
|
|
|
| |
emit exception and debug Dwarf info.
llvm-svn: 66571
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
scheduled in multiple regions, liveness data used by the
anti-dependence breaker is carried from one region to the next, however
the information reflects the state of the instructions before scheduling.
After scheduling, there may be new live range overlaps. Handle this by
pessimizing the liveness data carried between regions to the point where
it will be conservatively correct now matter how the earlier region is
scheduled. This fixes a miscompilation in 176.gcc with the post-RA
scheduler enabled.
llvm-svn: 66558
|
|
|
|
|
|
| |
format strings with the standard ${:foo} syntax.
llvm-svn: 66527
|
|
|
|
| |
llvm-svn: 66434
|
|
|
|
|
|
| |
the same instruction as kill. This fixes PR3706.
llvm-svn: 66428
|
|
|
|
|
|
|
| |
existed was for llvm-gcc 3.4 (which used the __main hack) which
is really really long dead.
llvm-svn: 66417
|
|
|
|
|
|
|
|
| |
whether a global is dead or not. This should fix PR3749 - linker adds
spurious use to appending globals. I can't reasonably add a testcase
for this, because the bc writer/reader strip dead constant users.
llvm-svn: 66404
|
|
|
|
|
|
| |
on the number of times a std::string is created and copied.
llvm-svn: 66396
|
|
|
|
| |
llvm-svn: 66363
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For 2009-03-07-FPConstSelect.ll we now produce:
_f:
xorl %eax, %eax
testl %edi, %edi
movl $4, %ecx
cmovne %rax, %rcx
leaq LCPI1_0(%rip), %rax
movss (%rcx,%rax), %xmm0
ret
previously we produced:
_f:
subl $4, %esp
cmpl $0, 8(%esp)
movss LCPI1_0, %xmm0
je LBB1_2 ## entry
LBB1_1: ## entry
movss LCPI1_1, %xmm0
LBB1_2: ## entry
movss %xmm0, (%esp)
flds (%esp)
addl $4, %esp
ret
on PPC the code also improves to:
_f:
cntlzw r2, r3
srwi r2, r2, 5
li r3, lo16(LCPI1_0)
slwi r2, r2, 2
addis r3, r3, ha16(LCPI1_0)
lfsx f1, r3, r2
blr
from:
_f:
li r2, lo16(LCPI1_1)
cmplwi cr0, r3, 0
addis r2, r2, ha16(LCPI1_1)
beq cr0, LBB1_2 ; entry
LBB1_1: ; entry
li r2, lo16(LCPI1_0)
addis r2, r2, ha16(LCPI1_0)
LBB1_2: ; entry
lfs f1, 0(r2)
blr
This also improves the existing pic-cpool case from:
foo:
subl $12, %esp
call .Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
popl %eax
addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
cmpl $0, 16(%esp)
movsd .LCPI1_0@GOTOFF(%eax), %xmm0
je .LBB1_2 # entry
.LBB1_1: # entry
movsd .LCPI1_1@GOTOFF(%eax), %xmm0
.LBB1_2: # entry
movsd %xmm0, (%esp)
fldl (%esp)
addl $12, %esp
ret
to:
foo:
call .Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
popl %eax
addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
xorl %ecx, %ecx
cmpl $0, 4(%esp)
movl $8, %edx
cmovne %ecx, %edx
fldl .LCPI1_0@GOTOFF(%eax,%edx)
ret
This triggers a few dozen times in spec FP 2000.
llvm-svn: 66358
|
|
|
|
| |
llvm-svn: 66357
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
llvm-svn: 66339
|
|
|
|
|
|
|
|
|
|
|
| |
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.
This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.
llvm-svn: 66240
|
|
|
|
|
|
| |
Delete this default branch, because we're going to generate our own.
llvm-svn: 66234
|
|
|
|
| |
llvm-svn: 66158
|
|
|
|
|
|
| |
This fixes some subtle miscompilations.
llvm-svn: 66147
|
|
|
|
|
|
| |
start. Sorry, no small test case possible.
llvm-svn: 66129
|
|
|
|
|
|
|
| |
It is an error to call APInt::zext with a size that is equal to the value's
current size, so use zextOrTrunc instead.
llvm-svn: 66039
|
|
|
|
|
|
| |
Update a testcase to check this.
llvm-svn: 66029
|
|
|
|
| |
llvm-svn: 66021
|
|
|
|
|
|
| |
what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs.
llvm-svn: 65996
|
|
|
|
|
|
|
|
|
|
|
|
| |
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.
Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
llvm-svn: 65985
|
|
|
|
|
|
|
|
| |
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
llvm-svn: 65899
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal. Try harder to
create a legal build_vector type. Note: this will be totally
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.
New:
_foo:
xorps %xmm0, %xmm0
xorps %xmm1, %xmm1
subps %xmm1, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
Old:
_foo:
xorps %xmm0, %xmm0
movss %xmm0, %xmm1
xorps %xmm2, %xmm2
unpcklps %xmm1, %xmm2
pshufd $80, %xmm1, %xmm1
unpcklps %xmm1, %xmm2
pslldq $16, %xmm2
pshufd $57, %xmm2, %xmm1
subps %xmm0, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
llvm-svn: 65791
|
|
|
|
|
|
|
|
|
|
|
| |
Look for situations like this:
%reg1024<def> = MOV r1
%reg1025<def> = MOV r0
%reg1026<def> = ADD %reg1024, %reg1025
r0 = MOV %reg1026
Commute the ADD to hopefully eliminate an otherwise unavoidable copy.
llvm-svn: 65752
|
|
|
|
|
|
| |
method in a BuildVectorSDNode "pseudo-class".
llvm-svn: 65747
|