summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARM.h
Commit message (Collapse)AuthorAgeFilesLines
* This refactoring of ARM machine block size computation creates two utilitySjoerd Meijer2016-07-221-0/+5
| | | | | | | | | functions so that the size computation is available not only in ConstantIslands but in other passes as well. Differential Revision: https://reviews.llvm.org/D22640 llvm-svn: 276399
* ARM: Initialize LoadStore passes in TargetMachineMatthias Braun2016-07-161-0/+4
| | | | | | | | | | | | | | Initializing them in LLVMInitializeARMTarget() makes them visible early enough for "llc -run-pass usage". This required the pass to be renamed from "arm-load-store-opt" to "arm-ldst-opt", because there already exists an arm-load-store-opt cl::opt switch which would now clash with the passname getting added as a switch in opt. On the bright side the pass name now matches the DEBUG_TYPE name. Renamed "arm-prera-load-store-opt" to "arm-repra-ldst-opt" as well for consistency. llvm-svn: 275661
* ARM/ELF: Better codegen for global variable addresses.Peter Collingbourne2015-10-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary. This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android. As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT. We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway. Differential Revision: http://reviews.llvm.org/D13650 llvm-svn: 251322
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* [ARM] Pass a callback to FunctionPass constructors to enable skipping executionAkira Hatanaka2015-06-081-1/+4
| | | | | | | | | | | | | | | | on a per-function basis. Previously some of the passes were conditionally added to ARM's pass pipeline based on the target machine's subtarget. This patch makes changes to add those passes unconditionally and execute them conditonally based on the predicate functor passed to the pass constructors. This enables running different sets of passes for different functions in the module. rdar://problem/20542263 Differential Revision: http://reviews.llvm.org/D8717 llvm-svn: 239325
* [ARM] Remove unused declaration. NFC.Ahmed Bougacha2015-02-161-1/+0
| | | | | | | GlobalMerge was moved to lib/CodeGen a while ago, and is no longer called "ARMGlobalMerge". llvm-svn: 229448
* [PM] Remove a bunch of stale TTI creation method declarations. I nukedChandler Carruth2015-02-011-3/+0
| | | | | | | their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-5/+0
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-2/+2
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* Temporarily Revert "Nuke the old JIT." as it's not quite ready toEric Christopher2014-08-071-0/+5
| | | | | | | | | | | be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154
* Nuke the old JIT.Rafael Espindola2014-08-071-5/+0
| | | | | | | | | I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111
* Atomics: promote ARM's IR-based atomics pass to CodeGen.Tim Northover2014-04-171-2/+0
| | | | | | | | | | | | Still only 32-bit ARM using it at this stage, but the promotion allows direct testing via opt and is a reasonably self-contained patch on the way to switching ARM64. At this point, other targets should be able to make use of it without too much difficulty if they want. (See ARM64 commit coming soon for an example). llvm-svn: 206485
* ARM: expand atomic ldrex/strex loops in IRTim Northover2014-04-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the *new* value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). llvm-svn: 205525
* Remove duplicated DMB instructionsRenato Golin2014-04-021-0/+1
| | | | | | | | | ARM specific optimiztion, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. llvm-svn: 205409
* Prune includes in ARM target.Craig Topper2014-03-221-4/+3
| | | | llvm-svn: 204548
* Adding an A15 specific optimization pass for interactions between S/D/Q ↵Silviu Baranga2013-03-151-0/+1
| | | | | | registers. The pass handles all the required transformations pre-regalloc. llvm-svn: 177169
* ARM: Fix a few copy-paste errors.Jim Grosbach2013-01-071-1/+1
| | | | | | s/X86/ARM/ llvm-svn: 171789
* Switch TargetTransformInfo from an immutable analysis pass that requiresChandler Carruth2013-01-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
* [arm-fast-isel] Add support for ELF PIC.Jush Lu2012-09-271-0/+1
| | | | | | | This is a preliminary step towards ELF support; currently ARMFastISel hasn't been used for ELF object files yet. llvm-svn: 164759
* Reorder includes to match coding standards. Fix an issue or two exposed by that.Craig Topper2012-03-171-2/+0
| | | | llvm-svn: 152978
* comment fix ARM.hJia Liu2012-02-191-1/+1
| | | | llvm-svn: 150904
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, ↵Jia Liu2012-02-181-1/+1
| | | | | | MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
* Delete NEONMoveFix, now unused.Jakob Stoklund Olesen2011-09-291-1/+0
| | | | llvm-svn: 140773
* Move to ISelLowering.Bill Wendling2011-09-291-1/+0
| | | | llvm-svn: 140754
* This is the start of the new SjLj EH preparation pass, which will replace theBill Wendling2011-09-271-0/+1
| | | | | | | | | | | | | | | | | | | current IR-level pass. The old SjLj EH pass has some problems, especially with the new EH model. Most significantly, it violates some of the new restrictions the new model has. For instance, the 'dispatch' table wants to jump to the landing pad, but we cannot allow that because only an invoke's unwind edge can jump to a landing pad. This requires us to mangle the code something awful. In addition, we need to keep the now dead landingpad instructions around instead of CSE'ing them because the DWARF emitter uses that information (they are dead because no control flow edge will execute them - the control flow edge from an invoke's unwind is superceded by the edge coming from the dispatch). Basically, this pass belongs not at the IR level where SSA is king, but at the code-gen level, where we have more flexibility. llvm-svn: 140646
* Code clean up.Evan Cheng2011-07-251-6/+0
| | | | llvm-svn: 135954
* Sink ARM mc routines into MCTargetDesc.Evan Cheng2011-07-231-13/+1
| | | | llvm-svn: 135825
* Next round of MC refactoring. This patch factor MC table instantiations, MCEvan Cheng2011-07-141-2/+1
| | | | | | registeration and creation code into XXXMCDesc libraries. llvm-svn: 135184
* - Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfoEvan Cheng2011-07-111-6/+8
| | | | | | | | | | | | and MCSubtargetInfo. - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes). - Teach X86Subtarget to update MCSubtargetInfo features bits since the MCSubtargetInfo layer can be shared with other modules. - These fixes .code 16 / .code 32 support since mode switch is updated in MCSubtargetInfo so MC code emitter can do the right thing. llvm-svn: 134884
* Add missing header.Jim Grosbach2011-06-221-0/+1
| | | | llvm-svn: 133640
* Move ARMMachObjectWriter to its own file.Jim Grosbach2011-06-221-0/+7
| | | | | | Just tidy up a bit. No functional change. llvm-svn: 133638
* Making use of VFP / NEON floating point multiply-accumulate / subtraction isEvan Cheng2010-12-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960
* Move the ARMAsmPrinter class defintiion into a header file.Jim Grosbach2010-12-011-3/+3
| | | | llvm-svn: 120551
* rename LowerToMCInst -> LowerARMMachineInstrToMCInst.Chris Lattner2010-11-141-1/+2
| | | | llvm-svn: 119071
* even more simplifications. ARM MCInstLowering is now justChris Lattner2010-11-141-0/+5
| | | | | | | a single function instead of a class. It doesn't need the complexity that X86 does. llvm-svn: 119070
* I added a new file ARMAsmBackend which stubs out in similar ways toJason W Kim2010-09-301-0/+3
| | | | | | | | | | the eqv X86 class. For now, I split the ELFARMAsmBackend from the DarwinARMAsmBackend (also mimicking X86) Tested against -r115126 llvm-svn: 115129
* Add skeleton infrastructure for the ARMMCCodeEmitter class. Patch by Jason Kim!Jim Grosbach2010-09-171-0/+5
| | | | llvm-svn: 114195
* Factor out basic enums and hleper functions from ARM.h for cleaner sharingJim Grosbach2010-09-151-101/+1
| | | | | | between the compiler back end and the MC libraries. llvm-svn: 114007
* Convert some VTBL and VTBX instructions to use pseudo instructions prior toBob Wilson2010-09-131-1/+0
| | | | | | | register allocation. Remove the NEONPreAllocPass, which is no longer needed. Yeah!! llvm-svn: 113818
* Add comments for what the condition code symbols mean.Bill Wendling2010-08-241-16/+16
| | | | llvm-svn: 111889
* Cleaned up the for-disassembly-only entries in the arm instruction table so thatJohnny Chen2010-08-121-0/+27
| | | | | | | the memory barrier variants (other than 'SY' full system domain read and write) are treated as one instruction with option operand. llvm-svn: 110951
* Hook in GlobalMerge passAnton Korobeynikov2010-07-241-0/+1
| | | | llvm-svn: 109359
* Remove early IT block formation. It's not used.Evan Cheng2010-07-021-1/+1
| | | | llvm-svn: 107513
* Remove the hidden "neon-reg-sequence" option. The reg sequences are workingBob Wilson2010-06-161-4/+0
| | | | | | now, so there's no need to disable them. llvm-svn: 106155
* Thumb2 IT blocks are fairly expensive. When there are multiple selects usingEvan Cheng2010-06-091-1/+1
| | | | | | | | | | | | | | | the same condition, it's important to make sure they are scheduled together to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms IT blocks early (by re-scheduling instructions and split basic blocks) to attempt to fix this. This is not turned on by default since I am not sure this is the right fix. Another issue is llvm selects are modeled as two-address conditional moves. This can be very bad when the copies before the conditional moves are not coalesced away. Teach IT formation pass to move the copies above the IT block (when legal) to avoid breaking the IT block. llvm-svn: 105669
* Model CONCAT_VECTORS of two 64-bit values as a REG_SEQUENCE.Evan Cheng2010-05-051-2/+6
| | | | llvm-svn: 103104
* Remove late ARM codegen optimization pass committed by accident.Anton Korobeynikov2010-04-071-1/+0
| | | | | | It is not ready for public yet. llvm-svn: 100673
* Some initial version of global mergerAnton Korobeynikov2010-04-071-0/+1
| | | | llvm-svn: 100641
* tidy some targets.Chris Lattner2010-02-021-2/+0
| | | | llvm-svn: 95146
OpenPOWER on IntegriCloud