| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
landing pad as its successor.
SjLj exception handling jumps to the correct landing pad via a switch statement
that's generated right before code-gen. Loosen the constraint in the machine
instruction verifier to allow for this. Note, this isn't the most rigorous check
since we cannot determine where that switch statement came from. But it's
marginally better than turning this check off when SjLj exceptions are used.
<rdar://problem/9187612>
llvm-svn: 130881
|
| |
|
|
|
|
| |
alignment 4 is wrong) and requires hard-float.
llvm-svn: 130875
|
| |
|
|
| |
llvm-svn: 130854
|
| |
|
|
| |
llvm-svn: 130800
|
| |
|
|
| |
llvm-svn: 130778
|
| |
|
|
| |
llvm-svn: 130769
|
| |
|
|
| |
llvm-svn: 130763
|
| |
|
|
|
|
|
|
| |
model constants which can be added to base registers via add-immediate
instructions which don't require an additional register to materialize
the immediate.
llvm-svn: 130743
|
| |
|
|
| |
llvm-svn: 130567
|
| |
|
|
|
|
| |
FastEmit_i can fail for non-Thumb2 ARM. Makes ARMSimplifyAddress work correctly, and reduces the number of fast-isel bailouts on non-Thumb ARM.
llvm-svn: 130560
|
| |
|
|
|
|
| |
ARM/Thumb2 patterns.
llvm-svn: 130552
|
| |
|
|
| |
llvm-svn: 130540
|
| |
|
|
|
|
|
| |
Fix a rather obscure crash caused by ARM fast-isel generating code which redefines a register.
rdar://problem/9338332 .
llvm-svn: 130539
|
| |
|
|
|
|
| |
for bools, but is a start.
llvm-svn: 130534
|
| |
|
|
| |
llvm-svn: 130462
|
| |
|
|
| |
llvm-svn: 130455
|
| |
|
|
|
|
|
|
| |
redefines a register.
rdar://problem/9338332 .
llvm-svn: 130454
|
| |
|
|
|
|
|
| |
This fixes clang generated blocks' variables' debug info.
Radar 9279956.
llvm-svn: 130373
|
| |
|
|
|
|
|
|
|
|
| |
successors) and use inverse depth first search to traverse the BBs. However
that doesn't work when the CFG has infinite loops. Simply doing a linear
traversal of all BBs works just fine.
rdar://9344645
llvm-svn: 130324
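A minimal sketch of why the inverse depth first search misses blocks, assuming a hypothetical four-block CFG in which the `loop` block never reaches an exit. The block names and helper functions are illustrative only and are not taken from the patch.

```python
def inverse_dfs(preds, exits):
    """Visit every block that can reach an exit, walking predecessor edges."""
    seen = set()
    stack = list(exits)
    while stack:
        bb = stack.pop()
        if bb in seen:
            continue
        seen.add(bb)
        stack.extend(preds.get(bb, []))
    return seen


def linear_traversal(blocks):
    """Visit every block in layout order, regardless of CFG shape."""
    return set(blocks)


# Hypothetical CFG: entry -> a -> exit, and entry -> loop -> loop (infinite loop).
succs = {"entry": ["a", "loop"], "a": ["exit"], "loop": ["loop"], "exit": []}

# Derive predecessor lists from the successor lists.
preds = {bb: [] for bb in succs}
for bb, targets in succs.items():
    for target in targets:
        preds[target].append(bb)

# Exit blocks are the blocks without successors.
exits = [bb for bb, targets in succs.items() if not targets]

# Blocks the inverse DFS never reaches but the linear walk does.
missed = linear_traversal(succs) - inverse_dfs(preds, exits)
```

Here `missed` contains only the infinite-loop block, because no path of predecessor edges leads from an exit block back to it.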
|
| |
|
|
|
|
|
| |
We cannot rely on the <imp-def> operands added by LiveIntervals in all cases as
demonstrated by the test case.
llvm-svn: 130313
|
| |
|
|
|
|
|
|
|
|
|
|
| |
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessening register pressure.
Call operands also have added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.
rdar://9329627
llvm-svn: 130245
|
| |
|
|
|
|
| |
allocation so it's being removed.
llvm-svn: 130243
|
| |
|
|
|
|
|
|
|
|
|
| |
symbols. For example, don't emit:
.comm _i,4,2 ## @i
## @i
instead emit:
.comm _i,4,2 ## @i
llvm-svn: 130192
|
| |
|
|
| |
llvm-svn: 130165
|
| |
|
|
|
|
| |
lit needs a linter ...
llvm-svn: 130126
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.
Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.
Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).
llvm-svn: 130048
|
| |
|
|
| |
llvm-svn: 129952
|
| |
|
|
| |
llvm-svn: 129947
|
| |
|
|
| |
llvm-svn: 129934
|
| |
|
|
| |
llvm-svn: 129884
|
| |
|
|
|
|
|
|
|
| |
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).
Fixes rdar://9292577
llvm-svn: 129842
|
| |
|
|
|
|
|
|
|
| |
- There is a minor semantic change here (evidenced by the test change) for
Darwin triples that have no version component. I debated changing the default
behavior of isOSVersionLT, but decided it made more sense for triples to be
explicit.
llvm-svn: 129802
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause
additional pipeline stall. So it's frequently better to simply codegen
vmul + vadd.
2. A vmla followed by a vmul, vadd, or vsub causes the second fp instruction to
stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW
vmla + vmla is very bad. But this isn't ideal either:
vmul
vadd
vmla
Instead, we want to expand the second vmla:
vmla
vmul
vadd
Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
faster.
Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimal solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.
A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
vmla / vmls will trigger one of the special hazards.
Enable these fp vmlx codegen changes for Cortex-A9.
llvm-svn: 129775
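A toy model of the hazard in point 2, to show why scheduling the instructions apart helps. It assumes one instruction issues per cycle and that a vmul, vadd, or vsub cannot issue until 4 cycles after the slot that follows the most recent vmla / vmls. The function name, the constant, and the issue model are assumptions for illustration; this is not the ARM hazard recognizer added by the patch.

```python
FP_AFTER_VMLA_STALL = 4  # cycles, taken from point 2 of the message
STALLING_OPS = {"vmul", "vadd", "vsub"}
MLX_OPS = {"vmla", "vmls"}


def count_stall_cycles(ops):
    """Return the total stall cycles for a straight-line list of opcodes."""
    cycle = 0
    last_mlx_issue = None
    stalls = 0
    for op in ops:
        if op in STALLING_OPS and last_mlx_issue is not None:
            # Earliest cycle at which this fp op may issue without stalling.
            earliest = last_mlx_issue + 1 + FP_AFTER_VMLA_STALL
            if cycle < earliest:
                stalls += earliest - cycle
                cycle = earliest
        if op in MLX_OPS:
            last_mlx_issue = cycle
        cycle += 1
    return stalls


back_to_back = count_stall_cycles(["vmla", "vmul"])
two_apart = count_stall_cycles(["vmla", "add", "ldr", "vmul"])
far_apart = count_stall_cycles(["vmla", "add", "ldr", "sub", "str", "vmul"])
```

Under these assumptions the stall shrinks as unrelated instructions are placed between the vmla and the dependent fp instruction, which is the effect the scheduler is after.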
|
| |
|
|
| |
llvm-svn: 129774
|
| |
|
|
|
|
|
| |
(and add false dependency) when it isn't dependent on last CPSR defining
instruction. rdar://8928208
llvm-svn: 129773
|
| |
|
|
|
|
|
|
|
|
|
| |
Add an avoidWriteAfterWrite() target hook to identify register classes that
suffer from write-after-write hazards. For those register classes, try to avoid
writing the same register in two consecutive instructions.
This is currently disabled by default. We should not spill to avoid hazards!
The command line flag -avoid-waw-hazard can be used to enable waw avoidance.
llvm-svn: 129772
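A minimal sketch of the preference described above, assuming a hypothetical allocator that picks from an ordered list of free registers. `pick_register` and its parameters are made up for illustration and do not mirror the register allocator's interface. The hazard is treated only as a preference, so the same register is still used when it is the only free one, rather than spilling.

```python
def pick_register(free_regs, last_written, avoid_waw=False):
    """Pick a free register, preferring one the previous instruction did not write."""
    if not free_regs:
        return None  # caller would have to spill for other reasons
    if avoid_waw:
        for reg in free_regs:
            if reg != last_written:
                return reg
    # Avoidance disabled, or no alternative: take the first free register.
    return free_regs[0]


default_choice = pick_register(["d0", "d1"], last_written="d0")
waw_choice = pick_register(["d0", "d1"], last_written="d0", avoid_waw=True)
forced_choice = pick_register(["d0"], last_written="d0", avoid_waw=True)
```

The `avoid_waw` parameter defaults to off, matching the message's note that avoidance is disabled unless requested.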
|
| |
|
|
|
|
|
| |
Ideally, we would match an S-register to its containing D-register, but that
requires arithmetic (divide by 2).
llvm-svn: 129756
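The arithmetic mentioned above can be sketched as follows, assuming the usual VFP layout in which S0 through S31 overlay D0 through D15, two S-registers per D-register. The function name is hypothetical.

```python
def containing_d_register(s_index):
    """Map an S-register number to the number of the D-register that contains it."""
    if not 0 <= s_index <= 31:
        raise ValueError("S-register index must be in 0..31")
    # S(2n) and S(2n+1) are the low and high halves of Dn.
    return s_index // 2


low_half = containing_d_register(0)
high_half = containing_d_register(1)
last = containing_d_register(31)
```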
|
| |
|
|
| |
llvm-svn: 129739
|
| |
|
|
| |
llvm-svn: 129738
|
| |
|
|
|
|
|
|
|
| |
registers for fast allocation a different way. This has us updating
used registers only when we're using that exact register.
Fixes rdar://9207598
llvm-svn: 129711
|
| |
|
|
|
|
| |
the node to a libcall. rdar://9280991
llvm-svn: 129633
|
| |
|
|
|
|
| |
a case involving EOR, so I only added a test for ORR.
llvm-svn: 129610
|
| |
|
|
|
|
| |
problem as all of the other instructions we fold with CMPs.
llvm-svn: 129602
|
| |
|
|
|
|
| |
fixes <rdar://problem/9287901>.
llvm-svn: 129599
|
| |
|
|
|
|
| |
forget to right shift the source by 32 first. rdar://9287902
llvm-svn: 129556
|
| |
|
|
| |
llvm-svn: 129468
|
| |
|
|
|
|
|
| |
the max itself, so it is not easy to write a test case for this, but I added a
test case that would fail if the code in AsmPrinter were removed.
llvm-svn: 129432
|
| |
|
|
|
|
|
| |
alignment for its type, use the minimum of the specified alignment and the ABI
alignment. This fixes <rdar://problem/9275290>.
llvm-svn: 129428
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handling node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.
Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the node latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.
llvm-svn: 129421
|
| |
|
|
| |
llvm-svn: 129417
|