- The ARM backend can eliminate cmp instructions by reusing flags from a
  nearby sub instruction with similar arguments. Don't do that if the sub
  is predicated - the flags are not written unconditionally.
  <rdar://problem/12263428>
  llvm-svn: 163535
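The reason the optimization must bail out: a predicated SUBS updates CPSR only when its predicate holds, so its flags are not a reliable substitute for the cmp. A minimal toy sketch of that behaviour (invented names and types, not the LLVM code):

```cpp
#include <optional>

// Toy stand-in for the CPSR contents; nullopt means "left unchanged".
using FlagsUpdate = std::optional<int>;

// An unpredicated SUBS always defines the flags for a - b...
FlagsUpdate subsFlags(int a, int b) { return a - b; /* stands in for NZCV */ }

// ...but a predicated SUBS leaves the previous flags in place whenever the
// predicate is false, which is why a later cmp of the same operands cannot
// simply be deleted in its favour.
FlagsUpdate predicatedSubsFlags(bool predHolds, int a, int b) {
  return predHolds ? subsFlags(a, b) : std::nullopt;
}
```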
- Now that it is possible to dynamically tie MachineInstr operands,
  predicated instructions are possible in SSA form:
      %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, opt:%noreg
      %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, pred:12, pred:%CPSR
  Becomes a predicated SUBri with a tied imp-use:
      SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>
  This means that any instruction that is safe to move can be folded into
  a MOVCC, and the *CC pseudo-instructions are no longer needed.
  The test case changes reflect that Thumb2SizeReduce recognizes the
  predicated instructions. It didn't understand the pseudos.
  llvm-svn: 163274
- Previous patch accidentally decided it couldn't convert a VFP to a
  NEON instruction after it had already destroyed the old one. Not a
  good move.
  llvm-svn: 163230
- NEON domain conversion was too heavy-handed with its widened
  registers, which could have stripped existing instructions of their
  dependency, leaving them vulnerable to scheduling errors.
  llvm-svn: 163070
- llvm-svn: 162898
- llvm-svn: 162844
- llvm-svn: 162825
- I have tested the fix, but have not been successful in generating
  a robust unit test. This can only be exposed through particular
  register assignments.
  llvm-svn: 162821
- llvm-svn: 162820
- ARM."
  This wasn't the right way to enforce ordering of atomics.
  We are already setting the isVolatile bit on memory operands of atomic
  operations, which is good enough to enforce the correct ordering.
  llvm-svn: 162732
- It is not safe to use normal LDR instructions because they may be
  reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
  that prevents reordering.
  Atomic loads are also prevented from participating in rematerialization
  and load folding.
  llvm-svn: 162713
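The mayStore trick works because a conservative scheduler never reorders two instructions when both may touch memory and at least one of them may write it. A simplified model of that rule (assumed semantics for illustration, not the actual LLVM scheduler code):

```cpp
// Minimal model of the memory-ordering test a scheduler applies to two
// instructions A and B that appear in program order A then B.
struct InstrFlags {
  bool mayLoad;
  bool mayStore;
};

// Two instructions may swap only if at most one touches memory, or both are
// pure loads. Marking an atomic load as mayStore therefore pins it against
// every other load or store, which is what the ATOMIC_LDR pseudos rely on.
bool mayReorder(const InstrFlags &a, const InstrFlags &b) {
  bool aMem = a.mayLoad || a.mayStore;
  bool bMem = b.mayLoad || b.mayStore;
  if (!aMem || !bMem)
    return true;
  return !a.mayStore && !b.mayStore;  // load/load is the only safe swap
}
```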
- <rdar://problem/12183003>
  llvm-svn: 162703
- *** Bad machine code: Explicit definition marked as use ***
      - function:    test_cos
      - basic block: BB#0 L.entry (0x7ff2a2024fd0)
      - instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
      - operand 0:   %D11
  llvm-svn: 162247
- PEI can't handle the pseudo-instructions. This can be removed when the
  pseudo-instructions are replaced by normal predicated instructions.
  Fixes PR13628.
  llvm-svn: 162130
- llvm-svn: 162094
- It is not my plan to duplicate the entire ARM instruction set with
  predicated versions. We need a way of representing predicated
  instructions in SSA form without requiring a separate opcode.
  Then the pseudo-instructions can go away.
  llvm-svn: 162061
- Use the target-independent select analysis hooks.
  llvm-svn: 162060
- The ARM select instructions are just predicated moves. If the select is
  the only use of an operand, the instruction defining the operand can be
  predicated instead, saving one instruction and decreasing register
  pressure.
  This implementation can turn AND/ORR/EOR instructions into their
  corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
  predicate any instruction, but we don't yet support predicated
  instructions in SSA form.
  llvm-svn: 161994
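The transformation leans on a simple identity: selecting between `a OP b` and a fallback value computes the same thing as conditionally overwriting the fallback with `a OP b`, which is what a predicated ARM instruction does. A standalone sketch with an AND (hypothetical helper names, not LLVM code):

```cpp
#include <cassert>
#include <cstdint>

// What the original pair computes: an unconditional AND feeding a MOVCC.
uint32_t andThenSelect(bool cond, uint32_t a, uint32_t b, uint32_t fallback) {
  uint32_t tmp = a & b;          // AND
  return cond ? tmp : fallback;  // MOVCC
}

// What the folded form computes: a predicated AND writing over a register
// that already holds the fallback value -- one instruction fewer.
uint32_t predicatedAnd(bool cond, uint32_t a, uint32_t b, uint32_t fallback) {
  uint32_t r = fallback;
  if (cond)
    r = a & b;  // ANDCC-style: only executes when the condition holds
  return r;
}

int main() {
  for (bool cond : {false, true})
    for (uint32_t a : {0u, 0xffu, 0xdeadbeefu})
      for (uint32_t b : {0u, 0x0fu, 0x12345678u})
        assert(andThenSelect(cond, a, b, 7) == predicatedAnd(cond, a, b, 7));
  return 0;
}
```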
- stuff
  (this corresponds to spilling/reloading regs in the DTriple / DQuad reg classes).
  No testcase, found by inspection.
  llvm-svn: 161300
- classes, which were missed for no reason. This fixes PR13377.
  llvm-svn: 161299
- llvm-svn: 160621
- llvm-svn: 160093
- It is safe if CPSR is killed or re-defined.
  When we are done with the basic block, check whether CPSR is live-out.
  Do not optimize away cmp if CPSR is live-out.
  llvm-svn: 160090
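A toy rendering of that live-out check: CPSR is live-out of the block if some successor lists it live-in, and in that case the cmp that defines it has to stay. The block type below is a stand-in, not MachineBasicBlock:

```cpp
#include <set>
#include <vector>

// Stand-in CFG node: live-in physical registers plus successor edges.
struct BlockModel {
  std::set<int> liveIns;
  std::vector<const BlockModel *> successors;
};

constexpr int kCPSR = 0;  // illustrative register number

// CPSR is live-out if any successor expects it as a live-in; when that is
// true, optimizing away the cmp that defines CPSR would be unsound.
bool cpsrLiveOut(const BlockModel &bb) {
  for (const BlockModel *succ : bb.successors)
    if (succ->liveIns.count(kCPSR))
      return true;
  return false;
}
```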
- My last checkin was apparently not the branch I intended. It was missing
  one change (added by chandlerc), and contained a spurious change.
  llvm-svn: 159548
- Reapplies r159406 with minor cleanup. The regressions appear to have been
  spurious.
  llvm-svn: 159541
- Use getUniqueVRegDef.
  Replace a loop with existing interfaces: modifiesRegister and readsRegister.
  Factor out code into inline functions and simplify the code.
  llvm-svn: 159470
- instructions with two register operands.
  llvm-svn: 159465
- This reverts commit r159406. I noticed a performance regression so I'll
  back out for now.
  llvm-svn: 159411
- The TargetInstrInfo::getNumMicroOps API does not change, but soon it
  will be used by MachineScheduler. Now each subtarget can specify the
  number of micro-ops per itinerary class. For ARM, this is currently
  always dynamic (-1), because it is used for load/store multiple, which
  depends on the number of register operands.
  Zero is now a valid number of micro-ops. This can be used for
  nop pseudo-instructions or instructions that the hardware can squash
  during dispatch.
  llvm-svn: 159406
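A rough sketch of how such a per-itinerary count could be consumed, with -1 meaning "computed dynamically by the target" and 0 meaning "squashed at dispatch". This is illustrative only; the real hook is TargetInstrInfo::getNumMicroOps:

```cpp
// Illustrative itinerary entry: -1 = dynamic, 0 = squashed, >0 = fixed.
struct ItineraryModel {
  int numMicroOps;
};

// numRegOperands stands in for whatever dynamic property the target needs;
// for ARM load/store multiple it would be the length of the register list.
unsigned resolveMicroOps(const ItineraryModel &itin, unsigned numRegOperands) {
  if (itin.numMicroOps < 0)
    return numRegOperands;            // dynamic, target-specific
  return unsigned(itin.numMicroOps);  // fixed count, possibly zero
}
```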
- possible. Sorry. rdar://11745134
  llvm-svn: 159236
- More condition codes are included when deciding whether to remove cmp after
  a sub instruction. Specifically, we extend from GE|LT|GT|LE to
  GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
  should be able to replace it with "sub a, b; movls".
  rdar: 11725965
  llvm-svn: 159166
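When the cmp's operands are the reverse of the sub's, each condition code must be replaced by its operand-swapped counterpart: GE and LE, GT and LT, HS and LS, HI and LO swap, while EQ and NE are symmetric; that is why movhs above becomes movls. The mapping, written as a standalone helper with a local enum rather than the ARMCC values:

```cpp
// Condition that holds for "cmp b, a" exactly when `cc` holds for "cmp a, b".
// The enumerators mirror the ARM names but are local to this sketch.
enum Cond { EQ, NE, HS, LO, HI, LS, GE, LT, GT, LE };

Cond swapOperandsCondition(Cond cc) {
  switch (cc) {
  case EQ: return EQ;  // a == b       <=>  b == a
  case NE: return NE;
  case HS: return LS;  // a >= b (u)   <=>  b <= a (u)
  case LS: return HS;
  case HI: return LO;  // a >  b (u)   <=>  b <  a (u)
  case LO: return HI;
  case GE: return LE;  // a >= b (s)   <=>  b <= a (s)
  case LE: return GE;
  case GT: return LT;
  case LT: return GT;
  }
  return cc;  // unreachable
}
```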
- This is a minor drive-by fix with no robust way to unit test.
  As an example see neon-div.ll:
      SU(16):   %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill>
        val SU(1): Latency=2 Reg=%Q8
      ...should be latency=1
  llvm-svn: 158960
- Minor drive-by fix to clean up latency computation. Calling
  getOperandLatency with a deliberately incorrect operand index does not
  give you the latency you want.
  llvm-svn: 158959
- llvm-svn: 158164
- Match expectations of the new latency API. Clean up and make the logic
  consistent.
  llvm-svn: 158163
- llvm-svn: 158162
- llvm-svn: 158161
- Minimum latency determines per-cycle scheduling groups.
  Expected latency determines critical path and cost.
  llvm-svn: 158021
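The two numbers answer different scheduler questions: minimum latency decides whether a dependent instruction may issue in the same cycle group, while expected latency is what accumulates along the critical path. A toy illustration of how they could be consumed (simplified, not the MachineScheduler code):

```cpp
#include <algorithm>
#include <vector>

struct DepEdge {
  unsigned minLatency;       // 0 lets the consumer issue in the same cycle
  unsigned expectedLatency;  // average cost used for critical-path height
  unsigned predHeight;       // height already computed for the predecessor
};

// Per-cycle grouping: only the minimum latency matters.
bool canIssueInSameCycle(const DepEdge &e) { return e.minLatency == 0; }

// Critical path: expected latency accumulates over predecessors.
unsigned criticalPathHeight(const std::vector<DepEdge> &preds) {
  unsigned height = 0;
  for (const DepEdge &e : preds)
    height = std::max(height, e.predHeight + e.expectedLatency);
  return height;
}
```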
- uint16_t. Simplify loop iterating over one of those tables. No functional
  change intended.
  llvm-svn: 157367
- Found by GCC's maybe-uninitialized warning.
  llvm-svn: 156780
- llvm-svn: 156620
- This patch will optimize the following cases:
      sub r1, r3 | sub r1, imm
      cmp r3, r1 or cmp r1, r3 | cmp r1, imm
      bge L1
  TO
      subs r1, r3
      bge  L1 or ble L1
  If the branch instruction can use the flags from "sub", then we can replace
  "sub" with "subs" and eliminate the "cmp" instruction.
  rdar: 10734411
  llvm-svn: 156599
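The rewrite is sound because subs performs exactly the subtraction the cmp would have, so it sets the same NZCV flags; only when the cmp had its operands in the opposite order does the branch condition need to flip (bge becoming ble above). A standalone check of that claim against a toy flags model (illustrative, not compiler code):

```cpp
#include <cassert>
#include <cstdint>

// Toy NZCV flags of the subtraction a - b, as "subs"/"cmp" would set them.
struct Flags { bool n, z, c, v; };

static Flags flagsOfSub(uint32_t a, uint32_t b) {
  uint32_t r = a - b;
  return { bool((r >> 31) & 1), r == 0, a >= b,
           bool((((a ^ b) & (a ^ r)) >> 31) & 1) };
}

static bool ge(Flags f) { return f.n == f.v; }         // signed >=
static bool le(Flags f) { return f.z || f.n != f.v; }  // signed <=

int main() {
  const int32_t vals[] = {INT32_MIN, -7, -1, 0, 1, 42, INT32_MAX};
  for (int32_t r1 : vals)
    for (int32_t r3 : vals) {
      Flags subs = flagsOfSub(uint32_t(r1), uint32_t(r3));  // subs r1, r3
      // "cmp r1, r3 ; bge" can use the subs flags directly.
      assert(ge(subs) == (r1 >= r3));
      // "cmp r3, r1 ; bge" needs the swapped condition: "ble" on subs flags.
      assert(le(subs) == (r3 >= r1));
    }
  return 0;
}
```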
- This commit broke an external Linux bot and gave a compile-time warning.
  llvm-svn: 156556
- This patch will optimize the following cases:
      sub r1, r3 | sub r1, imm
      cmp r3, r1 or cmp r1, r3 | cmp r1, imm
      bge L1
  TO
      subs r1, r3
      bge  L1 or ble L1
  If the branch instruction can use the flags from "sub", then we can replace
  "sub" with "subs" and eliminate the "cmp" instruction.
  rdar: 10734411
  llvm-svn: 156550
- A MOVCCr instruction can be commuted by inverting the condition. This
  can help reduce register pressure and remove unnecessary copies in some
  cases.
  <rdar://problem/11182914>
  llvm-svn: 154033
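The commutation relies on the identity select(cond, a, b) == select(!cond, b, a): swapping the two register operands while inverting the condition leaves the result unchanged. A tiny standalone check (the operand order here is illustrative, not the MI encoding):

```cpp
#include <cassert>

// Toy stand-in for the select a conditional move implements.
static int condMove(bool cond, int a, int b) { return cond ? a : b; }

int main() {
  for (bool cond : {false, true})
    for (int a : {-1, 0, 7})
      for (int b : {-3, 0, 9})
        // Swapping operands and inverting the condition is a no-op on the
        // value, which is what makes the instruction commutable.
        assert(condMove(cond, a, b) == condMove(!cond, b, a));
  return 0;
}
```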
- ARM recently gained DPair, DTriple, and DQuad register classes.
  Update copyPhysReg() to handle copies in these register classes.
  No test case; it is difficult to make the register allocator emit the
  odd copies reliably. The missing DPair copy caused a failure on
  partialsums in the nightly test suite.
  <rdar://problem/11147997>
  llvm-svn: 153686
- The arm_neon intrinsics can create virtual registers from the DPair
  register class, which allows both even-odd and odd-even D-register pairs.
  This fixes PR12389.
  llvm-svn: 153603
- produces a 32-bit immediate which is consumed by the use. It tries to
  fold the immediate by breaking it into two parts and folding them into the
  immediate fields of two uses, e.g.
      movw    r2, #40885
      movt    r3, #46540
      add     r0, r0, r3
  =>
      add.w   r0, r0, #3019898880
      add.w   r0, r0, #30146560
  However, this transformation is incorrect if the user produces a flag, e.g.
      movw    r2, #40885
      movt    r3, #46540
      adds    r0, r0, r3
  =>
      add.w   r0, r0, #3019898880
      adds.w  r0, r0, #30146560
  Note the adds.w may not set the carry flag even if the original sequence
  would.
  rdar://11116189
  llvm-svn: 153484
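The carry mismatch can be reproduced with the constants from this example: #3019898880 (0xB4000000) and #30146560 (0x01CC0000) sum to 0xB5CC0000, and for some values of r0 the single flag-setting add carries out while the second add of the split sequence does not. A standalone check using those numbers (the helper is illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Carry-out of a 32-bit addition a + b.
static bool carryOut(uint32_t a, uint32_t b) {
  return (uint64_t(a) + uint64_t(b)) >> 32;
}

int main() {
  const uint32_t whole = 3019898880u + 30146560u;  // 0xB5CC0000
  const uint32_t r0 = 0x50000000u;

  // Original:  adds r0, r0, #0xB5CC0000  -> carry is set for this r0.
  bool carryOriginal = carryOut(r0, whole);

  // Split:     add.w r0, r0, #3019898880 ; adds.w r0, r0, #30146560
  // Only the second add writes the flags, and for this r0 it does not carry.
  uint32_t afterFirst = r0 + 3019898880u;  // wraps modulo 2^32
  bool carrySplit = carryOut(afterFirst, 30146560u);

  assert(carryOriginal && !carrySplit);  // the two sequences disagree on C
  return 0;
}
```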
- llvm-svn: 153422
- Register pair VLD1/VLD2 all-lanes instructions. Kill off more of the
  pseudos as a result.
  llvm-svn: 152150