| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
If the basic block containing the BCCi64 (or BCCZi64) instruction ends with
an unconditional branch, that branch needs to be deleted before appending
the expansion of the BCCi64 to the end of the block.
llvm-svn: 122521
|
| |
|
|
|
|
|
|
|
|
|
| |
Type legalization splits up i64 values into pairs of i32 values, which leads
to poor quality code when inserting or extracting i64 vector elements.
If the vector element is loaded or stored, it can be treated as an f64 value
and loaded or stored directly from a VPR register. Use the pre-legalization
DAG combiner to cast those vector elements to f64 types so that the type
legalizer won't mess them up. Radar 8755338.
llvm-svn: 122319
|
| |
|
|
|
|
| |
is enabled.
llvm-svn: 122163
|
| |
|
|
|
|
| |
The result vector elements are always integers. Radar 8782191.
llvm-svn: 122112
|
| |
|
|
| |
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.
Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does already. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)
<rdar://problem/8782198>
llvm-svn: 122104
|
| |
|
|
|
|
|
| |
BUILD_VECTOR operands where the element type is not legal. I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.
llvm-svn: 122102
|
| |
|
|
| |
llvm-svn: 122101
|
| |
|
|
|
|
| |
Radar 8776599
llvm-svn: 122018
|
| |
|
|
|
|
|
| |
2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :)
Test added.
llvm-svn: 121951
|
| |
|
|
| |
llvm-svn: 121919
|
| |
|
|
|
|
|
| |
Clang is now providing intrinsics for these and so we need to support them
in the backend. Radar 8068427.
llvm-svn: 121902
|
| |
|
|
| |
llvm-svn: 121746
|
| |
|
|
| |
llvm-svn: 121743
|
| |
|
|
|
|
| |
Test has fixme, to move to .s -> .o test when AsmParser works better.
llvm-svn: 121732
|
| |
|
|
|
|
| |
#shamt. rdar://8752056
llvm-svn: 121606
|
| |
|
|
| |
llvm-svn: 121583
|
| |
|
|
|
|
|
| |
Alignments smaller than the total size of the memory being loaded or stored,
unless the alignment is 8 bytes, are not allowed. Add tests for this, too.
llvm-svn: 121506
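A minimal sketch of the alignment rule stated above, read literally. The function name and the byte-based parameters are hypothetical and are not taken from the LLVM sources; how an unspecified (zero) alignment is treated is not covered by the message and is left out here.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): returns true when an explicit
// alignment, in bytes, is acceptable for an access of sizeBytes bytes
// under the rule quoted in the commit message.
bool isAlignmentAllowed(uint32_t alignBytes, uint32_t sizeBytes) {
  // An 8-byte alignment is the stated exception and is always accepted.
  if (alignBytes == 8)
    return true;
  // Otherwise the alignment must not be smaller than the total size.
  return alignBytes >= sizeBytes;
}
```

Under this reading, a 16-byte access with 4-byte alignment is rejected, while the same access with 8-byte or 16-byte alignment is accepted.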
|
| |
|
|
|
|
|
|
| |
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460
llvm-svn: 121391
|
| |
|
|
|
|
|
|
|
| |
Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM
as well as ELF/OBJ (including fixup)
Also added support for ELF::R_ARM_TLS_IE32
llvm-svn: 121312
|
| |
|
|
| |
vpush instructions to save / restore VFP / NEON registers like this:
vpush {d8,d10,d11}
vpop {d8,d10,d11}
vpush and vpop do not allow gaps in the register list.
rdar://8728956
llvm-svn: 121197
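One way to respect the "no gaps" constraint is to split the callee-saved D-register list into maximal contiguous runs and emit one vpush / vpop per run. The sketch below only illustrates that idea; the function name is hypothetical and the message does not say that the actual patch works this way.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not LLVM code): splits strictly ascending D-register
// numbers into maximal runs of consecutive registers. Each run could then be
// saved or restored with its own vpush / vpop.
std::vector<std::vector<unsigned>>
splitIntoContiguousRuns(const std::vector<unsigned> &regs) {
  std::vector<std::vector<unsigned>> runs;
  for (unsigned r : regs) {
    if (!runs.empty())
      assert(r > runs.back().back() && "registers must be strictly ascending");
    if (runs.empty() || r != runs.back().back() + 1)
      runs.push_back(std::vector<unsigned>{r}); // gap found: start a new run
    else
      runs.back().push_back(r);                 // extends the current run
  }
  return runs;
}
```

For the list from the message, {d8, d10, d11}, this yields two runs, {d8} and {d10, d11}.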
|
| |
|
|
|
|
| |
message instead of creating DBG_VALUE for undefined value in reg0.
llvm-svn: 121059
|
| |
|
|
| |
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause
an additional pipeline stall. So it's frequently better to simply codegen
vmul + vadd.
2. A vmla followed by a vmul, vadd, or vsub causes the second fp instruction to
stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW
vmla + vmla is very bad. But this isn't ideal either:
vmul
vadd
vmla
Instead, we want to expand the second vmla:
vmla
vmul
vadd
Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
faster.
Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimal solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.
A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
vmla / vmls will trigger one of the special hazards.
Work in progress, only A+B are enabled.
llvm-svn: 120960
|
| |
|
|
| |
Lifted adjustFixupValue() from Darwin for sharing with ELF.
Test added
TODO:
refactor ELFObjectWriter::RecordRelocation more.
Possibly share more code with Darwin?
Lots more relocations...
llvm-svn: 120534
|
| |
|
|
| |
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
llvm-svn: 120501
|
| |
|
|
|
|
| |
The encoding for alignment in VLD4-dup instructions is still a work in progress.
llvm-svn: 120356
|
| |
|
|
|
|
|
| |
assignment instructions from being moved below / above calls.
rdar://8690640
llvm-svn: 120339
|
| |
|
|
| |
llvm-svn: 120336
|
| |
|
|
| |
llvm-svn: 120312
|
| |
|
|
| |
llvm-svn: 120236
|
| |
|
|
| |
llvm-svn: 120194
|
| |
|
|
|
|
|
| |
We need to check if the individual vector elements are sign/zero-extended
values. For now this only handles constant values. Radar 8687140.
llvm-svn: 120034
|
| |
|
|
| |
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r4
rdar://8465407
llvm-svn: 119977
|
| |
|
|
|
|
|
|
| |
illegal types (vector should be split first).
Added test case.
llvm-svn: 119749
|
| |
|
|
|
|
|
|
|
| |
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.
Adjust all testcases accordingly.
llvm-svn: 119725
|
| |
|
|
|
|
|
| |
appear to differ on Linux. Try to make them pass on Linux.
Would be good for a Linux person to review this.
llvm-svn: 119572
|
| |
|
|
|
|
| |
This completes the fixes for Radar 8673120.
llvm-svn: 119566
|
| |
|
|
|
|
|
| |
It is generally not sufficient to check if the starting offset is in range
of the maximum offset that can be efficiently used for the target.
llvm-svn: 119565
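The point above can be illustrated by a check that covers the last byte of an access as well as its first. This is a sketch under assumptions: the function name, the unsigned byte offsets, and the treatment of maxOffset as an inclusive limit are all hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM code): returns true only if every byte of an
// object of sizeBytes bytes placed at startOffset lies within maxOffset,
// where maxOffset is assumed to be the largest usable offset (inclusive).
bool rangeFitsInMaxOffset(uint64_t startOffset, uint64_t sizeBytes,
                          uint64_t maxOffset) {
  assert(sizeBytes > 0 && "empty objects are not considered here");
  // Checking only this condition is the insufficient test.
  if (startOffset > maxOffset)
    return false;
  // The last byte sits at startOffset + sizeBytes - 1; compare without
  // risking unsigned overflow.
  return sizeBytes - 1 <= maxOffset - startOffset;
}
```

With maxOffset = 4095, a 4-byte object at offset 4094 passes a start-only check but fails this one, because its last byte would sit at offset 4097.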
|
| |
|
|
|
|
|
| |
This makes it more clear that the symbol is an internal, compiler-generated
name and gives a little more description about its contents.
llvm-svn: 119564
|
| |
|
|
|
|
|
| |
It was mistakenly looking at the pointer type when checking for the size of
global variables. This is a partial fix for Radar 8673120.
llvm-svn: 119563
|
| |
|
|
| |
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.
Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.
Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sunk the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.
2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.
rdar://8663787, rdar://8241368
llvm-svn: 119548
|
| |
|
|
|
|
|
|
|
|
| |
The live range of a register defined by an early clobber starts at the use slot,
not the def slot.
Except when it is an early clobber tied to a use operand. Then it starts at the
def slot like a standard def.
llvm-svn: 119305
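The rule above can be written as a small decision function. The types and names below are hypothetical stand-ins for illustration only; they are not the LLVM live interval data structures.

```cpp
// Hypothetical model (not LLVM code) of the two slots mentioned above.
enum class Slot { Use, Def };

// Hypothetical description of a register def operand.
struct DefOperand {
  bool isEarlyClobber; // the def is marked early clobber
  bool isTiedToUse;    // the def is tied to a use operand
};

// Returns the slot at which the live range of the defined register starts.
Slot liveRangeStartSlot(const DefOperand &op) {
  // An early clobber that is not tied to a use starts at the use slot.
  if (op.isEarlyClobber && !op.isTiedToUse)
    return Slot::Use;
  // Standard defs, and early clobbers tied to a use, start at the def slot.
  return Slot::Def;
}
```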
|
| |
|
|
|
|
| |
This reverts r119183 which broke the buildbots.
llvm-svn: 119270
|
| |
|
|
|
|
|
| |
pass in the first place and was masked by earlier failures not warning
and aborting the block.
llvm-svn: 119184
|
| |
|
|
|
|
|
| |
The live range of a register defined by an early clobber starts at the use slot,
not the def slot.
llvm-svn: 119183
|
| |
|
|
| |
live ranges for the spill register are also defined at the use slot instead of
the normal def slot.
This fixes PR8612 for the inline spiller. A use was being allocated to the same
register as a spilled early clobber def.
This problem exists in all the spillers. A fix for the standard spiller is
forthcoming.
llvm-svn: 119182
|
| |
|
|
| |
llvm-svn: 118968
|
| |
|
|
| |
llvm-svn: 118951
|
| |
|
|
|
|
| |
forget-me-stick for now.
llvm-svn: 118950
|
| |
|
|
| |
llvm-svn: 118935
|
| |
|
|
|
|
| |
VFP vmla / vmls (they cause stalls). Disabling them in isel is probably not the right solution; I'll look into a proper solution next.
llvm-svn: 118922
|