| Commit message | Author | Age | Files | Lines |

i8 return values.
llvm-svn: 67502

Handle allocation of odd registers in FGR32.
llvm-svn: 67422

llvm-svn: 67416

llvm-svn: 67412

llvm-svn: 67373

Removed unnecessary code. No functionality change.
llvm-svn: 67371

for those architectures that support the instruction.
llvm-svn: 67363

llvm-svn: 67287

in selectiondag patterns. This is required for the upcoming shuffle_vector rewrite,
and as it turns out, cleans up a hack in the Alpha instruction info.
llvm-svn: 67286

llvm-svn: 67280

not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.
Conservatively fall back to loading the value into a register
and calling through it.
We still do the optimization on X86-32.
llvm-svn: 67142

Revert inadvertent mis-fix of fneg.
llvm-svn: 67084

llvm-svn: 67072

llvm-svn: 67071

- Fix fabs, fneg for f32 and f64.
- Use BuildVectorSDNode.isConstantSplat, now that the functionality exists.
- Continue to improve i64 constant lowering. Lower certain special constants
  to the constant pool when they correspond to SPU's shufb instruction's
  special mask values. This avoids the overhead of performing a shuffle on a
  zero-filled vector just to get the special constant when the memory load
  suffices.
llvm-svn: 67067

Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the
llvm-gcc bootstrap a bit further along.
llvm-svn: 67048

array allocated on the stack, which would lead
the compiled program to run over its stack. Thanks to Gil Dogon.
llvm-svn: 67034

it has a smaller encoding than absolute addressing.
llvm-svn: 67002

operand is a signed 32-bit immediate. Unlike the 8-bit
signed immediate case, it isn't actually smaller to fold a
32-bit signed immediate instead of a load. In fact, it's
larger in the case of 32-bit unsigned immediates, because
they can be materialized with movl instead of movq.
llvm-svn: 67001

ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.
llvm-svn: 66988

by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
llvm-svn: 66941

add a FIXME note on how to remove code duplication.
llvm-svn: 66932

llvm-svn: 66922

codegen to the same thing as integer truncates to i8 (the top bits are
just undefined). This implements rdar://6667338.
llvm-svn: 66902

instructions. Prevent that if we don't want implicit uses of SSE.
llvm-svn: 66877

unnecessary paddings between constant pool entries, larger-than-necessary
alignments (e.g. 8-byte alignment for .literal4 sections), and potentially
other issues.
1. The ConstantPoolSDNode alignment field is the log2 value of the alignment
   requirement. This is not consistent with other SDNode variants.
2. The MachineConstantPool alignment field is also a log2 value.
3. However, some places create ConstantPoolSDNodes with plain alignment
   values rather than log2 values. This creates entries with artificially
   large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when the entries are created.
   However, the asm printer groups them by section, so the offsets are no
   longer valid; yet the asm printer still uses them to determine the size
   of padding between entries.
5. The asm printer uses an expensive multimap data structure to track
   constant pool entries by section.
6. The asm printer iterates over a SmallPtrSet when emitting constant pool
   entries. This is non-deterministic.
Solutions:
1. The ConstantPoolSDNode alignment field is changed to keep a non-log2 value.
2. The MachineConstantPool alignment field is also changed to keep a non-log2
   value.
3. Functions that create ConstantPool nodes pass in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field; it is replaced
   with an alignment field. Offsets are no longer computed when constant pool
   entries are created; they are computed on the fly in the asm printer and
   the JIT.
5. The asm printer uses a cheaper data structure to group constant pool
   entries.
6. The asm printer computes entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
llvm-svn: 66875

for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this). On the example, we now
generate:
_test:
        movl 4(%esp), %eax
        cmpl $42, (%eax)
        setl %al
        movzbl %al, %eax
        leal 4(%eax,%eax,8), %eax
        ret
instead of:
_test:
        movl 4(%esp), %eax
        cmpl $41, (%eax)
        movl $4, %ecx
        movl $13, %eax
        cmovg %ecx, %eax
        ret
llvm-svn: 66869

example to:
_test:
        movl 4(%esp), %eax
        cmpl $41, (%eax)
        setg %al
        movzbl %al, %eax
        orl $4294967294, %eax
        ret
instead of:
        movl 4(%esp), %eax
        cmpl $41, (%eax)
        movl $4294967294, %ecx
        movl $4294967295, %eax
        cmova %ecx, %eax
        ret
which is smaller in code size and faster. rdar://6668608
llvm-svn: 66868

operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.
llvm-svn: 66865

assembly. 2. Fixed JIT encoding by making the address pc-relative.
llvm-svn: 66803

related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
        movsd LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta %al
        movzbl %al, %eax
        movl 4(%esp), %ecx
        movsbl (%ecx,%eax,4), %eax
        ret
instead of:
_test:
        movl 4(%esp), %eax
        leal 4(%eax), %ecx
        movsd LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe %eax, %ecx
        movsbl (%ecx), %eax
        ret
This passes multisource and dejagnu.
llvm-svn: 66779

llvm-svn: 66778

double load and store instead.
llvm-svn: 66776

llvm-svn: 66763

symbols in one section will always be put into one bank.
llvm-svn: 66761

assembly text output uses an indirect call ("call *") instead of a direct call.
llvm-svn: 66735

llvm-svn: 66725

floating point instructions that are explicitly specified by the user.
llvm-svn: 66719

linkage, so remove it.
llvm-svn: 66690

clear some bits.
llvm-svn: 66684

llvm-svn: 66660

linkage: this linkage type only applies to declarations,
but ODR is only relevant to globals with definitions.
llvm-svn: 66650

pshuflw/hw.
llvm-svn: 66645

llvm-svn: 66642

llvm-svn: 66540

llvm-svn: 66515

llvm-svn: 66508

thumb mode and arch subversion. Eventually thumb triplets will go away
and be replaced with function notes.
llvm-svn: 66435

optimizer can create values of funky scalar types.
llvm-svn: 66429

llvm-svn: 66382