summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* ARM-BE: test files for vector argument passingChristian Pirker2014-05-141-1/+2
| | | | | | Reviewed at http://reviews.llvm.org/D3766 llvm-svn: 208793
* ARM: Implement big endian bit-conversion for NEON typeChristian Pirker2014-05-121-2/+8
| | | | llvm-svn: 208538
* Pass the value type to TLI::getRegisterByNameHal Finkel2014-05-111-1/+2
| | | | | | | | | | | | | We must validate the value type in TLI::getRegisterByName, because if we don't and the wrong type was used with the IR intrinsic, then we'll assert (because we won't be able to find a valid register class with which to construct the requested copy operation). For PPC64, additionally, the type information is necessary to decide between the 64-bit register and the 32-bit subregister. No functionality change. llvm-svn: 208508
* Add custom lowering for add/sub with overflow intrinsics to ARMLouis Gerbarg2014-05-091-0/+95
| | | | | | | | | | | | | This patch adds support to ARM for custom lowering of the llvm.{u|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful for handling idiomatic saturating math functions as generated by InstCombineCompare. Test cases included. rdar://14853450 llvm-svn: 208435
* ARM: HFAs must be passed in consecutive registersOliver Stannard2014-05-091-24/+117
| | | | | | | | | | When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must be passed in a block of consecutive floating-point registers, or on the stack. This means that unused floating-point registers cannot be back-filled with part of an HFA, however this can currently happen. This patch, along with the corresponding clang patch (http://reviews.llvm.org/D3083) prevents this. llvm-svn: 208413
* ARM: support PIC on Windows on ARMSaleem Abdulrasool2014-05-091-2/+26
| | | | | | | | Handle lowering of global addresses for PIC mode compilation on Windows. Always use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and is a pure Thumb environment. llvm-svn: 208385
* ARM big endian function argument passingChristian Pirker2014-05-081-11/+30
| | | | llvm-svn: 208316
* Implememting named register intrinsicsRenato Golin2014-05-061-0/+11
| | | | | | | | | | | This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104
* Convert SelectionDAG::getMergeValues to use ArrayRef.Craig Topper2014-04-271-2/+2
| | | | llvm-svn: 207374
* Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer ↵Craig Topper2014-04-261-3/+2
| | | | | | and size. llvm-svn: 207329
* Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.Craig Topper2014-04-261-37/+27
| | | | llvm-svn: 207327
* DAGCombiner: Turn divs of vector splats into vectorized multiplications.Benjamin Kramer2014-04-261-0/+5
| | | | | | | | | | | | Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315
* [C++] Use 'nullptr'. Target edition.Craig Topper2014-04-251-18/+19
| | | | llvm-svn: 207197
* Add 'musttail' marker to call instructionsReid Kleckner2014-04-241-0/+3
| | | | | | | | | | | | This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with convervative IR verification rules that ensure that tail call optimization is possible. Reviewers: nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D3240 llvm-svn: 207143
* ARM: disable emission of __XYZvfp in soft-float environment.Tim Northover2014-04-221-1/+1
| | | | | | | | | | The point of these calls is to allow Thumb-1 code to make use of the VFP unit to perform its operations. This is not desirable with -msoft-float, since most of the reasons you'd want that apply equally to the runtime library. rdar://problem/13766161 llvm-svn: 206874
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | definition below all of the header #include lines, lib/Target/... edition. llvm-svn: 206842
* Atomics: promote ARM's IR-based atomics pass to CodeGen.Tim Northover2014-04-171-0/+81
| | | | | | | | | | | | Still only 32-bit ARM using it at this stage, but the promotion allows direct testing via opt and is a reasonably self-contained patch on the way to switching ARM64. At this point, other targets should be able to make use of it without too much difficulty if they want. (See ARM64 commit coming soon for an example). llvm-svn: 206485
* Convert SelectionDAG::getVTList to use ArrayRefCraig Topper2014-04-161-2/+2
| | | | llvm-svn: 206357
* Make consistent use of MCPhysReg instead of uint16_t throughout the tree.Craig Topper2014-04-041-2/+2
| | | | llvm-svn: 205610
* Tidy up. Trailing whitespace.Jim Grosbach2014-04-031-2/+2
| | | | llvm-svn: 205583
* ARM: tell LLVM about zext properties of ldrexb/ldrexhTim Northover2014-04-031-0/+14
| | | | | | | | | | | | | Implementing this via ComputeMaskedBits has two advantages: + It actually works. DAGISel doesn't deal with the chains properly in the previous pattern-based solution, so they never trigger. + The information can be used in other DAG combines, as well as the trivial "get rid of truncs". For example if the trunc is in a different basic block. rdar://problem/16227836 llvm-svn: 205540
* ARM: expand atomic ldrex/strex loops in IRTim Northover2014-04-031-747/+3
| | | | | | | | | | | | | | | | | | | | | | | | | The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the *new* value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). llvm-svn: 205525
* [ARM] When generating a vpaddl node the input lane type is not always the ↵Silviu Baranga2014-04-031-2/+5
| | | | | | | | | | type of the add operation since extract_vector_elt can perform an extend operation. Get the input lane type from the vector on which we're performing the vpaddl operation on and extend or truncate it to the output type of the original add node. llvm-svn: 205523
* ARM: update subtarget information for Windows on ARMSaleem Abdulrasool2014-04-021-1/+2
| | | | | | | Update the subtarget information for Windows on ARM. This enables using the MC layer to target Windows on ARM. llvm-svn: 205459
* ARM: Add support for segmented stacksOliver Stannard2014-04-021-0/+2
| | | | | | Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430
* ARM: add intrinsics for the v8 ldaex/stlexTim Northover2014-03-261-0/+4
| | | | | | | | | We've already got versions without the barriers, so this just adds IR-level support for generating the new v8 ones. rdar://problem/16227836 llvm-svn: 204813
* ARM: no need to update SplatBits as it is not usedArnaud A. de Grandmaison2014-03-231-3/+0
| | | | llvm-svn: 204575
* Prune includes in ARM target.Craig Topper2014-03-221-2/+0
| | | | llvm-svn: 204548
* ARM: honour -f{no-,}optimize-sibling-callsSaleem Abdulrasool2014-03-111-2/+4
| | | | | | | | | | | Use the options in the ARMISelLowering to control whether tail calls are optimised or not. Previously, this option was entirely ignored on the ARM target and only honoured on x86. This option is mostly useful in profiling scenarios. The default remains that tail call optimisations will be applied. llvm-svn: 203577
* ARM: remove ancient -arm-tail-calls optionSaleem Abdulrasool2014-03-111-8/+2
| | | | | | | | This option is from 2010, designed to work around a linker issue on Darwin for ARM. According to grosbach this is no longer an issue and this option can safely be removed. llvm-svn: 203576
* ARM: simplify EmitAtomicBinary64Tim Northover2014-03-111-23/+19
| | | | | | | | | ATOMIC_STORE operations always get here as a lowered ATOMIC_SWAP, so there's no need for any code to handle them specially. There should be no functionality change so no tests. llvm-svn: 203567
* IR: add a second ordering operand to cmpxhg for failureTim Northover2014-03-111-4/+4
| | | | | | | | | | | | | | | The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559
* ARM: Correctly align arguments after a byval struct is passed on the stackOliver Stannard2014-03-051-41/+88
| | | | llvm-svn: 202985
* [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.Benjamin Kramer2014-03-021-15/+8
| | | | | | Remove the old functions. llvm-svn: 202636
* Use 16 byte stack alignment for NaCl on ARMMark Seaborn2014-02-161-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NaCl's ARM ABI uses 16 byte stack alignment, so set that in ARMSubtarget.cpp. Using 16 byte alignment exposes an issue in code generation in which a varargs function leaves a 4 byte gap between the values of r1-r3 saved to the stack and the following arguments that were passed on the stack. (Previously, this code only needed to support 4 byte and 8 byte alignment.) With this issue, llc generated: varargs_func: sub sp, sp, #16 push {lr} sub sp, sp, #12 add r0, sp, #16 // Should be 20 stm r0, {r1, r2, r3} ldr r0, .LCPI0_0 // Address of va_list add r1, sp, #16 str r1, [r0] bl external_func Fix the bug by checking for "Align > 4". Also simplify the code by using OffsetToAlignment(), and update comments. Differential Revision: http://llvm-reviews.chandlerc.com/D2677 llvm-svn: 201497
* ARM: use natural LLVM IR for vshll instructionsTim Northover2014-02-101-19/+0
| | | | | | | | Similarly to the vshrn instructions, these are simple zext/sext + trunc operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093
* ARM: use LLVM IR to represent the vshrn operationTim Northover2014-02-101-5/+0
| | | | | | | | | | vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085
* Add address space argument to allowsUnalignedMemoryAccess.Matt Arsenault2014-02-051-3/+4
| | | | | | | On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887
* [TLI] Add a new hook to TargetLowering to query the target if a load of a ↵Juergen Ributzka2014-01-281-0/+12
| | | | | | | | | | | | | | | | | constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271
* Fix known typosAlp Toker2014-01-241-2/+2
| | | | | | | Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
* Switch the NEON register class from QPR to DPair.Jakob Stoklund Olesen2014-01-141-1/+1
| | | | | | | | | | | | | The already allocatable DPair superclass contains odd-even D register pair in addition to the even-odd pairs in the QPR register class. There is no reason to constrain the set of D register pairs that can be used for NEON values. Any NEON instructions that require a Q register will automatically constrain the register class to QPR. The allocation order for DPair begins with the QPR registers, so register allocation is unlikely to change much. llvm-svn: 199186
* ARM MachO: sort out isTargetDarwin/isTargetIOS/... checks.Tim Northover2014-01-061-10/+10
| | | | | | | | | | | | | | | | | | The ARM backend has been using most of the MachO related subtarget checks almost interchangeably, and since the only target it's had to run on has been IOS (which is all three of MachO, Darwin and IOS) it's worked out OK so far. But we'd like to support embedded targets under the "*-*-none-macho" triple, which means everything starts falling apart and inconsistent behaviours emerge. This patch should pick a reasonably sensible set of behaviours for the new triple (and any others that come along, with luck). Some choices were debatable (notably FP == r7 or r11), but we can revisit those later when deficiencies become apparent. llvm-svn: 198617
* Remove unnecessary #includes.Bill Wendling2014-01-061-1/+0
| | | | llvm-svn: 198585
* Refactor function that checks that __builtin_returnaddress's argument is ↵Bill Wendling2014-01-061-4/+1
| | | | | | | | | constant. This moves the check up into the parent class so that all targets can use it without having to copy (and keep in sync) the same error message. llvm-svn: 198579
* Emit an error message if the value passed to __builtin_returnaddress isn't a ↵Bill Wendling2014-01-051-0/+7
| | | | | | | | | | constant __builtin_returnaddress requires that the value passed into is be a constant. However, at -O0 even a constant expression may not be converted to a constant. Emit an error message intead of crashing. llvm-svn: 198531
* [aarch32] fix bug 18268: Incorrect condition of vselWeiming Zhao2013-12-181-1/+1
| | | | | | | | Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for such selection, it needs to inverse cc and swap op1 and op2. To inverse cc, both L/G and E bits should be flipped. llvm-svn: 197615
* Correct word hyphenationsAlp Toker2013-12-051-1/+1
| | | | | | | This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471
* ARM: decide whether to use movw/movt based on "minsize" attribute.Tim Northover2013-12-021-2/+1
| | | | llvm-svn: 196102
* ARM: add pseudo-instructions for lit-pool global materialisationTim Northover2013-12-021-58/+13
| | | | | | | | | | | | These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090
* Darwin-ARM: use movw/movt for static relocationsTim Northover2013-11-261-3/+1
| | | | llvm-svn: 195759
OpenPOWER on IntegriCloud