path: root/llvm/lib/Target/ARM/ARMSubtarget.h
Commit message | Author | Age | Files | Lines
* Clean up ARM fused multiply + add/sub support some more: rename some isel predicates. | Evan Cheng | 2012-04-11 | 1 | -3/+1
  Also remove NEON2 since it's not really useful and it is confusing. If NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it really mean? rdar://10139676
  llvm-svn: 154480
* Fix a number of problems with ARM fused multiply add/subtract instructions. | Evan Cheng | 2012-04-11 | 1 | -1/+1
  1. The new instruction itinerary entries are not properly described.
  2. The asm parser can't handle vfms and vfnms.
  3. There were no assembler, disassembler test cases.
  4. HasNEON2 has the wrong assembler predicate.
  rdar://10139676
  llvm-svn: 154456
* updated patch for the ARM fused multiply add/sub | Sebastian Pop | 2012-03-05 | 1 | -2/+2
  In this update:
  - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
  - I kept setting .fpu=neon-vfpv4 code attribute because that is what the assembler understands.
  Patch by Ana Pazos <apazos@codeaurora.org>
  llvm-svn: 152036
* Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. | Evan Cheng | 2012-02-28 | 1 | -0/+5
  llvm-svn: 151645
* Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. | Daniel Dunbar | 2012-02-28 | 1 | -5/+0
  llvm-svn: 151630
* Some ARM implementations, e.g. A-series, do return stack prediction. | Evan Cheng | 2012-02-28 | 1 | -0/+5
  That is, the processor keeps a return address stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction.
  Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt the RAS and cause 100% return misprediction, so LLVM should use an unconditional branch instead, i.e.
    mov lr, pc
    b _foo
  The "mov lr, pc" is issued in order to get a proper backtrace. rdar://8979299
  llvm-svn: 151623
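The call-selection rule from this commit, combined with the direct-call restriction added by the re-commit (r151645), might be sketched as below. This is a standalone illustration under stated assumptions, not the code that landed in the ARM backend: `CallInfo`, `emitCall`, and the `hasReturnAddressStack` flag are hypothetical names, and the fallback path is simplified to a plain `bl` or `blx`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a call site; not an LLVM type.
struct CallInfo {
  std::string target;  // symbol for a direct call, register for an indirect one
  bool isNoReturn;     // callee is marked noreturn
  bool isDirect;       // call target is a known symbol
};

// Returns the (simplified) instruction sequence to emit for a call.
std::vector<std::string> emitCall(const CallInfo &call,
                                  bool hasReturnAddressStack) {
  assert(!call.target.empty() && "call needs a target");
  // Special sequence: only for direct calls to noreturn functions on cores
  // that predict returns with a return address stack.
  if (hasReturnAddressStack && call.isNoReturn && call.isDirect) {
    return {"mov lr, pc",         // keep lr usable for backtraces
            "b " + call.target};  // plain branch: nothing is pushed on the RAS
  }
  // Everything else uses a normal call instruction.
  return {(call.isDirect ? "bl " : "blx ") + call.target};
}
```

Calling `emitCall({"_foo", true, true}, true)` yields the two-instruction sequence from the commit message; any other combination falls back to an ordinary call.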
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. | Jia Liu | 2012-02-18 | 1 | -1/+1
  llvm-svn: 150878
* Add fused multiply+add instructions from VFPv4. | Anton Korobeynikov | 2012-01-22 | 1 | -2/+6
  Patch by Ana Pazos!
  llvm-svn: 148658
* ARM target code clean up. Check for iOS, not Darwin where it makes sense. | Evan Cheng | 2011-12-20 | 1 | -0/+1
  llvm-svn: 146981
* Hide cpu name checking in ARMSubtarget. | Evan Cheng | 2011-11-09 | 1 | -0/+1
  llvm-svn: 144154
* Remove NaClMode | David Meyer | 2011-10-18 | 1 | -3/+0
  llvm-svn: 142338
* Reenable tail calls for iOS 5.0 and later. | Bob Wilson | 2011-10-07 | 1 | -0/+6
  llvm-svn: 141370
* Check in a patch that has already been code reviewed by Owen that I'd forgotten to commit. | James Molloy | 2011-09-28 | 1 | -0/+6
  Build on previous patches to successfully distinguish between an M-series and A/R-series MSR and MRS instruction. These take different mask names and have a *slightly* different opcode format.
  Add decoder and disassembler tests.
  Improvement on the previous patch - successfully distinguish between valid v6m and v7m masks (one is a subset of the other). The patch had to be edited slightly to apply to ToT.
  llvm-svn: 140696
* Add a new MC bit for NaCl (Native Client) mode. | Nick Lewycky | 2011-09-05 | 1 | -0/+6
  NaCl requires that certain instructions are more aligned than the CPU requires, and adds some additional directives, to follow in future patches. Patch by David Meyer!
  llvm-svn: 139125
* Rewrite comment in English. | Evan Cheng | 2011-07-07 | 1 | -1/+1
  llvm-svn: 134627
* Rename attribute 'thumb' to a more descriptive 'thumb-mode'. | Evan Cheng | 2011-07-07 | 1 | -5/+5
  llvm-svn: 134626
* Compute feature bits at time of MCSubtargetInfo initialization. | Evan Cheng | 2011-07-07 | 1 | -1/+2
  llvm-svn: 134606
* Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. | Evan Cheng | 2011-07-07 | 1 | -23/+24
  Also fix tests.
  llvm-svn: 134590
* Factor ARM triple parsing out of ARMSubtarget. | Evan Cheng | 2011-07-07 | 1 | -11/+7
  Another step towards making ARM subtarget info available to MC.
  llvm-svn: 134569
* Rename XXXGenSubtarget.inc to XXXGenSubtargetInfo.inc for consistency. | Evan Cheng | 2011-07-01 | 1 | -1/+1
  llvm-svn: 134281
* ARMv7M vs. ARMv7E-M support. | Jim Grosbach | 2011-07-01 | 1 | -2/+7
  The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."
  Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation.
  rdar://9572992
  llvm-svn: 134261
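The v7-M / v7E-M split described in this commit could be modeled roughly as follows. This is a minimal sketch, not the subtarget hook that was actually added: `MProfileArch`, `archForCPU`, and `hasThumb2DSP` are hypothetical names, and the lookup covers only the two CPUs the commit message names.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag for the two M-profile variants named in the commit.
enum class MProfileArch { V7M, V7EM };

// Hypothetical lookup covering only the CPUs the commit message mentions.
MProfileArch archForCPU(const std::string &cpu) {
  assert((cpu == "cortex-m3" || cpu == "cortex-m4") &&
         "sketch only knows cortex-m3 and cortex-m4");
  return cpu == "cortex-m4" ? MProfileArch::V7EM : MProfileArch::V7M;
}

// The optional Thumb2 DSP instructions are present only on v7E-M.
bool hasThumb2DSP(MProfileArch arch) { return arch == MProfileArch::V7EM; }
```

Instruction selection would then guard the DSP patterns on something like `hasThumb2DSP(archForCPU(cpu))`.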
* Rename TargetSubtarget to TargetSubtargetInfo for consistency. | Evan Cheng | 2011-07-01 | 1 | -3/+3
  llvm-svn: 134259
* Added MCSubtargetInfo to capture subtarget features and scheduling itineraries. | Evan Cheng | 2011-07-01 | 1 | -1/+4
  - Refactor TargetSubtarget to be based on MCSubtargetInfo.
  - Change tablegen generated subtarget info to initialize MCSubtargetInfo and hide more details from targets.
  llvm-svn: 134257
* Fix the ridiculous SubtargetFeatures API where it implicitly expects the CPU name to be encoded as the first feature. | Evan Cheng | 2011-06-30 | 1 | -3/+3
  It then uses the CPU name to look up features / scheduling itinerary even though clients know full well the CPU name being used to query these properties.
  The fix is to just have the clients explicitly pass the CPU name!
  llvm-svn: 134127
* Sink SubtargetFeature and TargetInstrItineraries (renamed MCInstrItineraries) into MC. | Evan Cheng | 2011-06-29 | 1 | -2/+1
  llvm-svn: 134049
* Revert accidental commit. | Evan Cheng | 2011-05-20 | 1 | -6/+0
  llvm-svn: 131739
* Revert r131664 and fix it in instcombine instead. rdar://9467055 | Evan Cheng | 2011-05-20 | 1 | -0/+6
  llvm-svn: 131708
* Remove -use-divmod-libcall. Let targets opt in when they are available. | Evan Cheng | 2011-04-20 | 1 | -0/+2
  llvm-svn: 129884
* ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows() predicates. | Daniel Dunbar | 2011-04-19 | 1 | -1/+1
  llvm-svn: 129816
* Avoid some 's' 16-bit instructions, which partially update CPSR (and add a false dependency), when they aren't dependent on the last CPSR-defining instruction. | Bob Wilson | 2011-04-19 | 1 | -0/+6
  rdar://8928208
  llvm-svn: 129773
* Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: | Evan Cheng | 2011-03-31 | 1 | -0/+5
    vadd d3, d0, d1
    vmul d3, d3, d2
  =>
    vmul d3, d0, d2
    vmla d3, d1, d2
  llvm-svn: 128665
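The rewrite in this commit relies on the distributive identity, which a small scalar model can make concrete. This is only a sketch: the function names are hypothetical, and it uses unsigned integers, where both forms agree modulo 2^32. For floating-point operands the two forms can round differently, so the result is not guaranteed to be bit-identical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: an add feeding a multiply (vadd + vmul).
uint32_t addThenMul(uint32_t a, uint32_t b, uint32_t c) {
  uint32_t sum = a + b;  // vadd d3, d0, d1
  return sum * c;        // vmul d3, d3, d2
}

// Distributed form: a multiply feeding a multiply-accumulate (vmul + vmla).
uint32_t mulThenMla(uint32_t a, uint32_t b, uint32_t c) {
  uint32_t acc = a * c;  // vmul d3, d0, d2
  acc += b * c;          // vmla d3, d1, d2
  return acc;
}

// Sanity check that both forms agree for the given inputs.
void checkEquivalent(uint32_t a, uint32_t b, uint32_t c) {
  assert(addThenMul(a, b, c) == mulThenMla(a, b, c));
}
```

The distributed form has the same instruction count; the benefit named in the commit is that the `vmul` result can be forwarded into the `vmla` accumulator.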
* Clean up ARM subtarget code by using Triple ADT. | Evan Cheng | 2011-01-11 | 1 | -3/+6
  llvm-svn: 123276
* Various bits of framework needed for precise machine-level selection DAG scheduling during isel. | Andrew Trick | 2010-12-24 | 1 | -0/+2
  Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard.
  Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets.
  Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds.
  Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue.
  ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards.
  ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc.
  llvm-svn: 122541
* whitespace | Andrew Trick | 2010-12-24 | 1 | -1/+1
  llvm-svn: 122539
* Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. | Evan Cheng | 2010-12-05 | 1 | -4/+4
  1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd.
  2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
  3. A vmla followed by vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either:
       vmul
       vadd
       vmla
     Instead, we want to expand the second vmla:
       vmla
       vmul
       vadd
     Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
  Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.
  A. Add missing isel predicates which cause vmla to be codegen'ed.
  B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla.
  C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
  D. Add ARM hazard recognizer to model the vmla / vmls hazards.
  E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
  Work in progress, only A+B are enabled.
  llvm-svn: 120960
* Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. | Evan Cheng | 2010-11-03 | 1 | -0/+5
  Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing.
  llvm-svn: 118160
* PR8359: The ARM backend may end up allocating registers D16 to D31 when "-mattr=+vfp3" is specified. | Bob Wilson | 2010-10-12 | 1 | -0/+5
  However, this will not work for hardware that only supports 16 registers. Add a new flag to support "-mattr=+vfp3,+d16". Patch by Jan Voung!
  llvm-svn: 116310
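How a feature string such as "+vfp3,+d16" could limit the D registers available for allocation is sketched below. This is an illustration only, not the LLVM feature parser or the change made for PR8359: `parseFeatures` and `numAllocatableDRegs` are hypothetical names, and returning 0 when vfp3 is absent is an assumption made to keep the sketch small.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Parses a comma-separated feature string such as "+vfp3,+d16".
// A leading '+' enables a feature and a leading '-' disables it.
std::set<std::string> parseFeatures(const std::string &attrs) {
  std::set<std::string> enabled;
  std::stringstream stream(attrs);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (item.size() < 2) continue;  // skip empty or malformed entries
    assert((item[0] == '+' || item[0] == '-') && "feature needs +/- prefix");
    if (item[0] == '+') enabled.insert(item.substr(1));
    else enabled.erase(item.substr(1));
  }
  return enabled;
}

// Number of D registers the allocator may use under this sketch's assumptions.
unsigned numAllocatableDRegs(const std::set<std::string> &features) {
  if (!features.count("vfp3")) return 0;   // assumption: only VFPv3 is modeled
  return features.count("d16") ? 16 : 32;  // d16 restricts to D0..D15
}
```

With "+vfp3" alone the sketch allows D0 to D31; adding "+d16" keeps the allocator away from D16 to D31.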
* Jim asked us to move DataLayout on ARM back to the most specialized classes. | Rafael Espindola | 2010-10-03 | 1 | -23/+0
  Do so and also change X86 for consistency. Investigating if this can be improved a bit.
  llvm-svn: 115469
* Increase ARM APCS preferred alignment for i64 and f64 from 32 bits to 64 bits. | Bob Wilson | 2010-09-29 | 1 | -2/+2
  LDM/STM instructions can run one cycle faster on some ARM processors if the memory address is 64-bit aligned. Radar 8489376.
  llvm-svn: 115047
* Add a subtarget hook for reporting the misprediction penalty. | Owen Anderson | 2010-09-28 | 1 | -0/+2
  Use this to provide more precise cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability.
  Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.
  llvm-svn: 114995
* Add a command line option "-arm-strict-align" to disallow unaligned memory accesses for ARM targets that would otherwise allow it. | Bob Wilson | 2010-09-28 | 1 | -0/+7
  Radar 8465431.
  llvm-svn: 114941
* Hard to imagine there are still people using inferior compilers. | Daniel Dunbar | 2010-09-27 | 1 | -1/+1
  llvm-svn: 114862
* Odd additional stub framework for the ARM MC ELF emission. | Rafael Espindola | 2010-09-27 | 1 | -0/+23
  llc now recognizes the "intent" to support MC/obj emission for ARM, but given that they are all stubs, it asserts on --filetype=obj --march=arm
  Patch by Jason Kim.
  llvm-svn: 114856
* Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. | Evan Cheng | 2010-09-10 | 1 | -0/+10
  For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should not treat them as non-micro-coded simple instructions.
  llvm-svn: 113570
* cortex m4 has floating point support, but only single precision. | Jim Grosbach | 2010-08-11 | 1 | -0/+5
  llvm-svn: 110810
* Report error if codegen tries to instantiate an ARM target when the cpu does not support it, e.g. cortex-m* processors. | Evan Cheng | 2010-08-11 | 1 | -0/+5
  llvm-svn: 110798
* Add ARM Archv6M and let it imply FeatureDB (having dmb, etc.) | Evan Cheng | 2010-08-11 | 1 | -1/+1
  llvm-svn: 110795
* Add subtarget feature -mattr=+db which determines whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. | Evan Cheng | 2010-08-11 | 1 | -0/+5
  - Change instruction names to something more sensible (matching name of actual instructions).
  - Added tests for memory barrier codegen.
  llvm-svn: 110785
* Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. | Evan Cheng | 2010-08-09 | 1 | -0/+5
  llvm-svn: 110584
* Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles). | Evan Cheng | 2010-07-13 | 1 | -0/+4
  llvm-svn: 108256