| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
The VMOVS widening needs to look at the implicit COPY operands. Trying
to dig out the COPY instruction from an iterator in copyPhysReg() is the
wrong approach.
The expandPostRAPseudo() hook gets to look at COPY instructions before
they are converted to copyPhysReg() calls.
llvm-svn: 141619
|
|
|
|
| |
llvm-svn: 140943
|
|
|
|
| |
llvm-svn: 140939
|
|
|
|
|
|
| |
and block addresses.
llvm-svn: 140936
|
|
|
|
|
|
| |
for an MBB.
llvm-svn: 140824
|
|
|
|
|
|
|
| |
This enables NEON domain tracking across basic blocks, but should
otherwise do the same thing.
llvm-svn: 140772
|
|
|
|
| |
llvm-svn: 140653
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
llvm-svn: 140228
|
|
|
|
| |
llvm-svn: 140227
|
|
|
|
| |
llvm-svn: 139431
|
|
|
|
|
|
| |
have a predicate operand, unlike conditional branches.
llvm-svn: 139415
|
|
|
|
|
|
|
|
|
|
|
| |
It appears that our use of the imp-use and imp-def flags with
sub-registers is not yet robust enough to support this.
The failing test case is complicated, I am working on a reduction.
<rdar://problem/10044201>
llvm-svn: 138861
|
|
|
|
|
|
|
|
| |
There is no non-writeback store multiple instruction in Thumb1, so
don't define one. As a result load multiple is the only instantiation of
the multiclass, so refactor that away entirely.
llvm-svn: 138338
|
|
|
|
| |
llvm-svn: 138177
|
|
|
|
|
|
|
|
| |
This pleases the register scavenger and brings
test/CodeGen/ARM/2011-08-12-vmovqqqq-pseudo.ll a little closer to
working with -verify-machineinstrs.
llvm-svn: 138164
|
|
|
|
|
|
|
| |
Therefore, rather then generate a pseudo instruction, which is later expanded,
generate the necessary instructions in place.
llvm-svn: 138163
|
|
|
|
|
|
| |
register subclasses. Hopefully this fixes some buildbots.
llvm-svn: 137223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Cortex-A8, we use the NEON v2f32 instructions for f32 arithmetic. For
better latency, we also send D-register copies down the NEON pipeline by
translating them to vorr instructions.
This patch promotes even S-register copies to D-register copies when
possible so they can also go down the NEON pipeline. Example:
vldr.32 s0, LCPI0_0
loop:
vorr d1, d0, d0
loop2:
...
vadd.f32 d1, d1, d16
The vorr instruction looked like this after regalloc:
%S2<def> = COPY %S0, %D1<imp-def>
Copies involving odd S-registers, and copies that don't define the full
D-register are left alone.
llvm-svn: 137182
|
|
|
|
|
|
| |
They improve the verbose assembly.
llvm-svn: 137069
|
|
|
|
|
|
| |
allowing us to distinguish the encodings that use shifted registers from those that use shifted immediates. This is necessary to allow the fixed-length decoder to distinguish things like BICS vs LDRH.
llvm-svn: 135693
|
|
|
|
|
|
| |
ARM MC code from target.
llvm-svn: 135636
|
|
|
|
|
|
| |
to simplify the path towards an auto-generated disassembler.
llvm-svn: 135290
|
|
|
|
|
|
| |
registeration and creation code into XXXMCDesc libraries.
llvm-svn: 135184
|
|
|
|
|
|
| |
an opcode. Switch ARM over to using that rather than its own special MCInstrDesc bits.
llvm-svn: 135106
|
|
|
|
| |
llvm-svn: 134858
|
|
|
|
| |
llvm-svn: 134244
|
|
|
|
|
|
|
| |
The tSpill and tRestore instructions are just copies of the tSTRspi and
tLDRspi instructions, respectively. Just use those directly instead.
llvm-svn: 134092
|
|
|
|
| |
llvm-svn: 134030
|
|
|
|
| |
llvm-svn: 134024
|
|
|
|
|
|
|
|
| |
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.
llvm-svn: 134021
|
|
|
|
|
|
| |
There are probably more instances of this floating around.
llvm-svn: 130474
|
|
|
|
|
|
|
|
|
| |
is, it assumes addresses are 64-bit aligned (which should be the more common
case). If the alignment is found not to be aligned, then getOperandLatency()
would adjust the operand latency computation by one to compensate for it.
rdar://9294833
llvm-svn: 129742
|
|
|
|
|
|
| |
a case involving EOR, so I only added a test for ORR.
llvm-svn: 129610
|
|
|
|
|
|
| |
problem as all of the other instructions we fold with CMPs.
llvm-svn: 129602
|
|
|
|
|
|
| |
fixes <rdar://problem/9287901>.
llvm-svn: 129599
|
|
|
|
|
|
| |
Luis Felipe Strano Moraes!
llvm-svn: 129558
|
|
|
|
| |
llvm-svn: 129429
|
|
|
|
|
|
| |
folded comparisons, just like ADD and SUB.
llvm-svn: 129038
|
|
|
|
|
|
| |
actually exist.
llvm-svn: 128461
|
|
|
|
|
|
|
|
|
|
|
|
| |
entries being compared may not be ARMConstantPoolValue. Without checking
whether they are ARMConstantPoolValue first, and if the stars and moons
are aligned properly, the equality test may return true (when the first few
words of two Constants' values happen to be identical) and very bad things can
happen.
rdar://9125354
llvm-svn: 128203
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
int tries = INT_MAX;
while (tries > 0) {
tries--;
}
The check should be:
subs r4, #1
cmp r4, #0
bgt LBB0_1
The subs can set the overflow V bit when r4 is INT_MAX+1 (which loop
canonicalization apparently does in this case). cmp #0 would have cleared
it while not changing the N and Z bits. Since BGT is dependent on the V
bit, i.e. (N == V) && !Z, it is not safe to eliminate the cmp #0.
rdar://9172742
llvm-svn: 128179
|
|
|
|
|
|
|
| |
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.
llvm-svn: 127101
|
|
|
|
|
|
|
|
|
|
| |
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.
It's now enabled by default for Darwin.
llvm-svn: 123991
|
|
|
|
|
|
|
|
|
|
|
| |
flags. They are still not enable in this revision.
Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.
Generalized unit tests to work with sched-cycles.
llvm-svn: 123969
|
|
|
|
|
|
|
|
|
| |
value, the "add pc" must be CSE'ed at the same time. We could follow the same
approach as T2 by adding pseudo instructions that combine the ldr + "add pc".
But the better approach is to use movw + movt (which I will enable soon), so
I'll leave this as a TODO.
llvm-svn: 123949
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.
Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.
ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
to re-materialize the instruction, allow machine LICM to hoist the set of
instructions out of the loop and make it possible to CSE them. It's a bit
hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.
With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.
llvm-svn: 123905
|
|
|
|
|
|
|
|
|
|
|
|
| |
movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
add r0, pc, r0
It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.
llvm-svn: 123619
|
|
|
|
|
|
|
|
| |
These functions not longer assert when passed 0, but simply return false instead.
No functional change intended.
llvm-svn: 123155
|
|
|
|
| |
llvm-svn: 123048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
llvm-svn: 122541
|