|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| ... |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In this update:
- I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
- I kept setting .fpu=neon-vfpv4 code attribute because that is what the
assembler understands.
Patch by Ana Pazos <apazos@codeaurora.org>
llvm-svn: 152036 | 
| | 
| 
| 
| | llvm-svn: 151083 | 
| | 
| 
| 
| 
| 
| | MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878 | 
| | 
| 
| 
| 
| 
| | Patch by Ana Pazos!
llvm-svn: 148658 | 
| | 
| 
| 
| | llvm-svn: 146981 | 
| | 
| 
| 
| | llvm-svn: 142338 | 
| | 
| 
| 
| | llvm-svn: 141370 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | forgotten to commit.
Build on previous patches to successfully distinguish between an M-series and A/R-series MSR and MRS instruction. These take different mask names and have a *slightly* different opcode format.
Add decoder and disassembler tests.
Improvement on the previous patch - successfully distinguish between valid v6m and v7m masks (one is a subset of the other). The patch had to be edited slightly to apply to ToT.
llvm-svn: 140696 | 
| | 
| 
| 
| 
| 
| 
| | instructions are more aligned than the CPU requires, and adds some additional
directives, to follow in future patches. Patch by David Meyer!
llvm-svn: 139125 | 
| | 
| 
| 
| 
| 
| | registration and creation code into XXXMCDesc libraries.
llvm-svn: 135184 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | - Each target asm parser now creates its own MCSubtargetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
  to generate asm matcher subtarget feature queries. e.g.
  "ModeThumb,FeatureThumb2" is translated to
  "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".
llvm-svn: 134678 | 
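The feature query quoted in the entry above can be written out by hand. This is only a sketch of the shape of the generated check: the bit positions and the helper name `isThumb2` are assumptions made for illustration, since the real values and code are emitted by tablegen.

```cpp
#include <cstdint>

// Hypothetical feature bits; the real values are generated by tablegen.
enum : uint64_t {
  ModeThumb = 1ULL << 0,
  FeatureThumb2 = 1ULL << 1,
};

// Hand-written equivalent of the query described for the predicate string
// "ModeThumb,FeatureThumb2": every listed feature must be set in Bits.
inline bool isThumb2(uint64_t Bits) {
  return (Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0;
}
```

Each comma-separated feature in the predicate string becomes one conjunct, so a subtarget with only ModeThumb set does not match.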
| | 
| 
| 
| | llvm-svn: 134626 | 
| | 
| 
| 
| | llvm-svn: 134608 | 
| | 
| 
| 
| | llvm-svn: 134606 | 
| | 
| 
| 
| 
| 
| | them down to MC layer. Also fix tests.
llvm-svn: 134590 | 
| | 
| 
| 
| 
| 
| | ARM subtarget info available to MC.
llvm-svn: 134569 | 
| | 
| 
| 
| | llvm-svn: 134281 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* architecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."
Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.
rdar://9572992
llvm-svn: 134261 | 
| | 
| 
| 
| | llvm-svn: 134259 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | itineraries.
- Refactor TargetSubtarget to be based on MCSubtargetInfo.
- Change tablegen generated subtarget info to initialize MCSubtargetInfo
  and hide more details from targets.
llvm-svn: 134257 | 
| | 
| 
| 
| | llvm-svn: 134129 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | be encoded as the first feature. It then uses the CPU name to look up
features / scheduling itinerary even though clients know full well the CPU name
being used to query these properties.
The fix is to just have the clients explicitly pass the CPU name!
llvm-svn: 134127 | 
| | 
| 
| 
| | llvm-svn: 133738 | 
| | 
| 
| 
| | llvm-svn: 131739 | 
| | 
| 
| 
| | llvm-svn: 131708 | 
| | 
| 
| 
| 
| 
| 
| | (and add false dependency) when it isn't dependent on last CPSR defining
instruction. rdar://8928208
llvm-svn: 129773 | 
| | 
| 
| 
| | llvm-svn: 128709 | 
| | 
| 
| 
| 
| 
| | rdar://9027648.
llvm-svn: 126191 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | 1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.
It's now enabled by default for Darwin.
llvm-svn: 123991 | 
| | 
| 
| 
| 
| 
| | materialize GA indirect symbols.
llvm-svn: 123809 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0
It's not yet enabled by default as some tests are failing. I suspect bugs in
downstream tools.
llvm-svn: 123619 | 
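The movw / movt / add sequence in the entry above builds a 32-bit pc-relative offset from two 16-bit halves. A minimal sketch of that arithmetic follows; the struct and function names are hypothetical, and the adjustment of 4 is read off the `(LPC0_0+4)` expression in the snippet, which matches the PC read-ahead in Thumb mode (ARM mode would use 8).

```cpp
#include <cstdint>

// The two immediates carried by movw (:lower16:) and movt (:upper16:).
struct MovwMovt {
  uint16_t Lower16;
  uint16_t Upper16;
};

// Split a 32-bit value into the halves the two instructions encode.
inline MovwMovt splitImm32(uint32_t Value) {
  MovwMovt Parts;
  Parts.Lower16 = static_cast<uint16_t>(Value & 0xFFFFu);
  Parts.Upper16 = static_cast<uint16_t>(Value >> 16);
  return Parts;
}

// Value left in the register after movw followed by movt.
inline uint32_t materialize(MovwMovt Parts) {
  return (static_cast<uint32_t>(Parts.Upper16) << 16) | Parts.Lower16;
}

// Offset such that (Label + PCAdjust) + offset == Target, modulo 2^32.
// "add r0, pc, r0" at Label then yields Target.
inline uint32_t pcRelOffset(uint32_t Target, uint32_t Label,
                            uint32_t PCAdjust) {
  return Target - (Label + PCAdjust);
}
```

Because the arithmetic wraps modulo 2^32, the same split works when the target lies below the label.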
| | 
| 
| 
| | llvm-svn: 123276 | 
| | 
| 
| 
| | llvm-svn: 122794 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about its SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
llvm-svn: 122541 | 
| | 
| 
| 
| | llvm-svn: 122539 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can
   cause an additional pipeline stall. So it's frequently better to simply codegen
   vmul + vadd.
2. A vmla followed by a vmul, vadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.
Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimal solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.
A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.
Work in progress, only A+B are enabled.
llvm-svn: 120960 | 
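Point 2 and step D of the entry above can be illustrated with a toy recognizer. This is a sketch under stated assumptions, not the hazard recognizer added by the patch: the class and method names are hypothetical, it does not use LLVM's ScheduleHazardRecognizer interface, it treats the 4-cycle stall as a fixed window after the vmla issues, and it ignores the vmla-after-vmla special case from point 3.

```cpp
// Toy model only: hypothetical names, not LLVM's interface.
enum class FPOp { VMLA, VMUL, VADD, VSUB, Other };

class ToyVMLAHazardRecognizer {
  // Stall window taken from point 2 of the commit message.
  static constexpr int StallCycles = 4;
  int Cycle = 0;
  int LastVMLACycle = 0;
  bool SeenVMLA = false;

public:
  // True if issuing Op in the current cycle falls inside the stall window
  // opened by the most recent vmla.
  bool hasHazard(FPOp Op) const {
    bool Sensitive =
        Op == FPOp::VMUL || Op == FPOp::VADD || Op == FPOp::VSUB;
    return Sensitive && SeenVMLA && (Cycle - LastVMLACycle) <= StallCycles;
  }

  // Record that Op issues in the current cycle.
  void emit(FPOp Op) {
    if (Op == FPOp::VMLA) {
      LastVMLACycle = Cycle;
      SeenVMLA = true;
    }
  }

  // Move the model forward by one machine cycle.
  void advanceCycle() { ++Cycle; }
};
```

A scheduler using such a model would ask `hasHazard` before picking the next fp instruction and prefer unrelated work while the window is open, which is the "schedule them apart" idea in point 2.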
| | 
| 
| 
| 
| 
| 
| | as derived from the target triple.  This is important for enabling
features that are implied based on the architecture version.
llvm-svn: 118643 | 
| | 
| 
| 
| 
| 
| | extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing.
llvm-svn: 118160 | 
| | 
| 
| 
| 
| 
| 
| 
| | "-mattr=+vfp3" is specified. However, this will not work for hardware that
only supports 16 registers.  Add a new flag to support "-mattr=+vfp3,+d16".
Patch by Jan Voung!
llvm-svn: 116310 | 
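The effect of the new flag can be sketched as a small feature-string parser. The struct and function names are hypothetical and this is not the subtarget code from the patch; it only assumes the usual "+name" / "-name" convention of -mattr and the fact that VFPv3 has 32 double-precision registers while the D16 variant has 16.

```cpp
#include <sstream>
#include <string>

// Hypothetical subset of the subtarget features discussed above.
struct ToyFeatures {
  bool HasVFP3 = false;
  bool HasD16 = false;
};

// Parse a comma-separated attribute list such as "+vfp3,+d16".
// A leading '+' enables a feature; anything else (e.g. '-') disables it.
inline ToyFeatures parseAttrs(const std::string &Attrs) {
  ToyFeatures F;
  std::istringstream In(Attrs);
  std::string Item;
  while (std::getline(In, Item, ',')) {
    if (Item.size() < 2)
      continue; // Skip empty or malformed entries.
    bool Enable = Item[0] == '+';
    std::string Name = Item.substr(1);
    if (Name == "vfp3")
      F.HasVFP3 = Enable;
    else if (Name == "d16")
      F.HasD16 = Enable;
  }
  return F;
}

// Number of D registers to allocate from, assuming some VFP unit is present.
inline unsigned numDoubleRegs(const ToyFeatures &F) {
  return (F.HasVFP3 && !F.HasD16) ? 32u : 16u;
}
```

With "+vfp3" alone the sketch reports 32 registers; adding "+d16" limits it to 16, which is the case the entry describes.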
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | provide more precise cost modeling for if-conversion.  Now if only we had a way to estimate the
misprediction probability.
Adjust CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.
llvm-svn: 114995 | 
| | 
| 
| 
| 
| 
| | accesses for ARM targets that would otherwise allow it.  Radar 8465431.
llvm-svn: 114941 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should not treat them as non-micro-coded
simple instructions.
llvm-svn: 113570 | 
| | 
| 
| 
| | llvm-svn: 110810 | 
| | 
| 
| 
| 
| 
| | support it. e.g. cortex-m* processors.
llvm-svn: 110798 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.
llvm-svn: 110785 | 
| | 
| 
| 
| | llvm-svn: 110587 | 
| | 
| 
| 
| 
| 
| 
| | instructions to subtarget features and update tests to reflect.
PR5717.
llvm-svn: 103136 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Jordy <snhjordy@gmail.com>.
Followup patches will add some tests and adjust to use Subtarget features
for the instructions.
llvm-svn: 103119 | 
| | 
| 
| 
| | llvm-svn: 101334 | 
| | 
| 
| 
| 
| 
| | Re-commit. This time complete with testsuite updates.
llvm-svn: 99570 |