path: root/llvm/lib/Target/ARM
...
* ARM: Make sure the instruction alias for PLI uses the right subtarget features. (Tilmann Scheller, 2013-07-18, 1 file, -1/+3)
  PLI requires both the Thumb2 and the ARMv7 features. Related to <rdar://problem/14403733>. llvm-svn: 186620
* Change 'n' to 'N' to keep consistent with other instructions. (Joey Gouly, 2013-07-18, 1 file, -4/+4)
  llvm-svn: 186576
* [ARMv8] Add NEON instructions VCVT{A, N, P, M}. (Joey Gouly, 2013-07-18, 2 files, -0/+64)
  llvm-svn: 186574
* Add Thumb tests for the ARMv8 FP instructions that I recently added. (Joey Gouly, 2013-07-18, 1 file, -2/+2)
  Also, fix the namespace for two instructions that I missed previously. llvm-svn: 186572
* Remove the extra leading 0 from VMAXNMND. (Joey Gouly, 2013-07-18, 1 file, -1/+1)
  The N3VDIntnp pattern takes bits<5> and I gave it 6 bits. Thanks to Jiangning Liu for spotting it! llvm-svn: 186568
* [ARMv8] Add support for the NEON instructions vmaxnm/vminnm. (Joey Gouly, 2013-07-17, 5 files, -0/+119)
  This adds a new class for non-predicable NEON instructions and a new DecoderNamespace for v8 NEON instructions. llvm-svn: 186504
* Fix ARMFastISel::ARMEmitIntExt shift emission (JF Bastien, 2013-07-17, 1 file, -30/+52)
  My patch 'r183551 - ARM FastISel integer sext/zext improvements' was incorrect when emitting ARM register-immediate ASR, LSL, LSR instructions: they are pseudo-instructions in ARMInstrInfo.td and I should have used MOVsi instead. This is not an issue when code is generated through a .s file, but it is an issue when code is generated straight to a .o (-filetype=obj). llvm-svn: 186489
* Related to r181161 - Indirect branches may not be the last branch in a basic block. (Lang Hames, 2013-07-16, 1 file, -0/+7)
  Blocks that have an indirect branch terminator, even if it's not the last terminator, should still be treated as unanalyzable. <rdar://problem/14437274> Reducing a useful regression test case is proving difficult - I hope to have one soon. llvm-svn: 186461
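  The same idea can be expressed as a small C++ sketch (the helper name and the exact check are illustrative assumptions, not the code the branch analysis actually uses):

    #include "llvm/CodeGen/MachineBasicBlock.h"
    #include "llvm/CodeGen/MachineInstr.h"

    // Hedged sketch of the idea only: a block counts as unanalyzable if *any*
    // of its terminators is an indirect branch, not just the final one.
    static bool hasIndirectBranchTerminator(llvm::MachineBasicBlock &MBB) {
      for (llvm::MachineBasicBlock::iterator I = MBB.getFirstTerminator(),
                                             E = MBB.end();
           I != E; ++I)
        if (I->isIndirectBranch())
          return true;
      return false;
    }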
* ARM: Add support for the Thumb2 PLI alternate literal form. (Tilmann Scheller, 2013-07-16, 1 file, -0/+3)
  This adds an instruction alias to make the assembler recognize the alternate literal form: pli [PC, #+/-<imm>]. See A8.8.129 in the ARM ARM (DDI 0406C.b). Fixes <rdar://problem/14403733>. llvm-svn: 186459
* ARM: allow printing of ARM atomic DAG nodes. (Tim Northover, 2013-07-16, 1 file, -0/+13)
  We'd forgotten to provide string representations for the special ARMISD atomic nodes; this adds them in. No effect on CodeGen, it just makes the output of "-view-whatever-dags" slightly more readable. llvm-svn: 186406
* ARM: implement ldrex, strex and clrex intrinsics (Tim Northover, 2013-07-16, 4 files, -15/+130)
  Intrinsics already existed for the 64-bit variants, so these support operations of size at most 32 bits. llvm-svn: 186392
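  For context, a minimal sketch of what the new 32-bit exclusive-access intrinsics enable at the source level, assuming a Clang new enough to provide the __builtin_arm_ldrex/strex/clrex builtins and an ARMv7 target (none of the names below come from the commit):

    #include <cstdint>

    // A 32-bit compare-and-swap built from exclusive load/store; the builtins
    // lower to the ldrex/strex/clrex intrinsics described above.
    bool cas32(volatile uint32_t *Addr, uint32_t Expected, uint32_t Desired) {
      while (true) {
        uint32_t Old = __builtin_arm_ldrex(Addr);      // LDREX: load-exclusive
        if (Old != Expected) {
          __builtin_arm_clrex();                       // CLREX: drop the reservation
          return false;
        }
        if (__builtin_arm_strex(Desired, Addr) == 0)   // STREX returns 0 on success
          return true;
        // The exclusive reservation was lost; retry.
      }
    }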
* ARM EABI divmod support (Renato Golin, 2013-07-16, 3 files, -2/+87)
  This patch enables calls to __aeabi_idivmod when in EABI mode, by using the remainder value returned in a register (R1), enabled by the ARM triple "none-eabi". Note that Darwin and GNUEABI triples will continue lowering in the GNU style, that is, using the stack for the remainder. Support for SREM/UREM in the 64-bit lowering still needs to be added. llvm-svn: 186390
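  As a hedged source-level illustration (plain C++, not code from the patch): on an EABI target both results below can come out of a single __aeabi_idivmod call, with the quotient in R0 and the remainder in R1, while Darwin and GNUEABI keep the stack-based GNU-style lowering described above.

    struct QuotRem { int Quot; int Rem; };

    QuotRem divmod(int Num, int Den) {
      QuotRem R;
      R.Quot = Num / Den;  // candidate for __aeabi_idivmod on EABI targets
      R.Rem  = Num % Den;  // ideally folded into the same runtime call
      return R;
    }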
* Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). (Craig Topper, 2013-07-15, 1 file, -1/+1)
  llvm-svn: 186301
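  A small illustration of the idiom being switched to (the table name is made up, not from the commit):

    #include "llvm/ADT/STLExtras.h"

    static const unsigned GPRArgRegs[] = {0, 1, 2, 3};

    // Before: sizeof(GPRArgRegs) / sizeof(GPRArgRegs[0])
    // After: the helper computes the same element count without repeating the name.
    static const unsigned NumGPRArgRegs = llvm::array_lengthof(GPRArgRegs);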
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. (Craig Topper, 2013-07-14, 3 files, -13/+13)
  llvm-svn: 186274
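  A hedged sketch of the pattern (hypothetical helper, not the code the commit touches): callees take SmallVectorImpl<T>&, so the caller's inline size is not baked into the interface.

    #include "llvm/ADT/SmallVector.h"
    using namespace llvm;

    // Accepts a SmallVector of any inline size.
    static void collectRegs(SmallVectorImpl<unsigned> &Regs) {
      Regs.push_back(0);
      Regs.push_back(1);
    }

    void caller() {
      SmallVector<unsigned, 8> Regs;  // inline size chosen by the caller only
      collectRegs(Regs);              // binds to SmallVectorImpl<unsigned>&
    }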
* Fix ARM paired GPR COPY lowering (JF Bastien, 2013-07-12, 1 file, -0/+3)
  ARM paired GPR COPY was being lowered to two MOVr without CC. This patch puts the CC back. My test is a reduction of the case where I encountered the issue: 64-bit atomics use paired GPRs. The issue only occurs with SelectionDAG; FastISel doesn't encounter it, so I didn't bother calling it. llvm-svn: 186226
* Remove extraneous braces. (Eric Christopher, 2013-07-12, 1 file, -6/+3)
  llvm-svn: 186212
* ARM cost model: Add cost for gather/scatter (Arnold Schwaighofer, 2013-07-12, 1 file, -0/+9)
  Fixes a 35% degradation compared to unvectorized code in MiBench/automotive-susan and an equally serious regression on a private image processing benchmark. radar://14351991 llvm-svn: 186188
* TargetTransformInfo: address calculation parameter for gather/scatter (Arnold Schwaighofer, 2013-07-12, 1 file, -2/+2)
  Address calculation for gather/scatter in vectorized code can incur a significant cost, making vectorization unprofitable. Add infrastructure to model this cost. Tests and cost models for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187
* Simplify code. (Craig Topper, 2013-07-10, 1 file, -6/+2)
  llvm-svn: 186013
* Fix typo (Stephen Lin, 2013-07-10, 1 file, -1/+1)
  llvm-svn: 185995
* Explicitly define ARMISelLowering::isFMAFasterThanFMulAndFAdd. No functionality change. (Stephen Lin, 2013-07-10, 1 file, -0/+11)
  Currently ARM is the only backend that supports FMA instructions (for at least some subtargets) but does not implement this virtual, so FMAs are never generated except from explicit fma intrinsic calls. Apparently this is due to the fact that it supports both fused (one rounding step) and unfused (two rounding steps) multiply + add instructions. This patch clarifies that this is the case without changing behavior by implementing the virtual function to simply return false, as the default TargetLoweringBase version does. It is possible that some CPUs perform the fused version faster than the unfused version and vice versa, so the function implementation should be revisited if hard data is found. llvm-svn: 185994
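  A sketch of what the explicit override boils down to, with the signature approximated from this era of the tree (see ARMISelLowering.{h,cpp} for the real declaration):

    #include "ARMISelLowering.h"  // within lib/Target/ARM
    using namespace llvm;

    // Make the FMA policy explicit instead of inheriting the base default.
    bool ARMTargetLowering::isFMAFasterThanFMulAndFAdd(EVT VT) const {
      // ARM has both fused (one rounding step) and unfused (two rounding steps)
      // multiply-add forms; without per-CPU timing data, conservatively report
      // that the fused form is not known to be faster.
      return false;
    }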
* ARM: Fix incorrect pack pattern for Thumb2 (Jim Grosbach, 2013-07-09, 1 file, -1/+6)
  Propagate the fix from r185712 to Thumb2 codegen as well. The original commit message applies here as well: a "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and a logical shift are only equivalent in this context if the shift amount is 16. We would be shifting ones into the bottom 16 bits instead of zeros if "y" is negative. rdar://14338767 llvm-svn: 185982
* Add MC assembly/disassembly support for VRINT{A, N, P, M} to V8FP. (Joey Gouly, 2013-07-09, 3 files, -3/+59)
  llvm-svn: 185929
* Add MC assembly/disassembly support for VRINT{Z, X, R} to V8FP. (Joey Gouly, 2013-07-09, 1 file, -0/+21)
  llvm-svn: 185926
* Add MC assembly/disassembly support for VCVT{A, N, P, M} to V8FP. (Joey Gouly, 2013-07-09, 3 files, -6/+86)
  llvm-svn: 185922
* Add a comment to this change, requested by Eric Christopher. (Joey Gouly, 2013-07-08, 1 file, -0/+4)
  llvm-svn: 185853
* ARM: Improve codegen for generic vselect. (Jim Grosbach, 2013-07-08, 1 file, -0/+18)
  Fall back to by-element insert rather than building it up on the stack. rdar://14351991 llvm-svn: 185846
* Add MC support for the v8fp instructions: vmaxnm and vminnm. (Joey Gouly, 2013-07-06, 3 files, -8/+27)
  llvm-svn: 185767
* ARM: Add a pack pattern for matching arithmetic shift right (Arnold Schwaighofer, 2013-07-05, 1 file, -0/+3)
  llvm-svn: 185714
* ARM: Fix incorrect pack pattern (Arnold Schwaighofer, 2013-07-05, 1 file, -2/+4)
  A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and a logical shift are only equivalent in this context if the shift amount is 16. We would be shifting ones into the bottom 16 bits instead of zeros if "y" is negative. radar://14338767 llvm-svn: 185712
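  A standalone C++ check of that claim (it assumes the usual arithmetic behaviour of >> on negative signed values; nothing here is compiler code): the low 16 bits of an arithmetic and a logical right shift of a negative "y" agree at a shift of 16 but diverge beyond it, because the arithmetic shift drags sign bits into the bottom half.

    #include <cstdint>
    #include <cstdio>

    int main() {
      int32_t y = -0x12340000;                        // a negative input
      for (unsigned sh : {16u, 20u}) {
        unsigned asr = (uint32_t)(y >> sh) & 0xFFFF;  // arithmetic shift right
        unsigned lsr = ((uint32_t)y >> sh) & 0xFFFF;  // logical shift right
        std::printf("shift=%u asr=%#06x lsr=%#06x\n", sh, asr, lsr);
      }
      return 0;  // prints equal halves for shift=16, different ones for shift=20
    }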
* PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm. (Joey Gouly, 2013-07-05, 1 file, -0/+6)
  In the SelectionDAG, immediate operands to inline asm are constructed as two separate operands. The first is a constant of value InlineAsm::Kind_Imm and the second is a constant with the value of the immediate. In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we should skip over the next operand too. llvm-svn: 185688
* Remove an unneeded call to 'UpdateThumbVFPPredicate', spotted by Amaury. (Joey Gouly, 2013-07-04, 1 file, -1/+0)
  llvm-svn: 185651
* Add support for MC assembling and disassembling of vsel{ge, gt, eq, vs} instructions. (Joey Gouly, 2013-07-04, 4 files, -2/+94)
  This adds a new decoder table/namespace 'VFPV8', as these instructions have their top 4 bits as 0b1111, while other Thumb instructions have 0b1110. llvm-svn: 185642
* Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. (Jakob Stoklund Olesen, 2013-07-04, 1 file, -2/+0)
  These exception-related opcodes are not used any longer. llvm-svn: 185625
* Add a V8FP instruction 'vcvt{b,t}' to convert between half and double precision. (Joey Gouly, 2013-07-04, 2 files, -1/+57)
  llvm-svn: 185620
* Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. (Craig Topper, 2013-07-04, 1 file, -4/+4)
  llvm-svn: 185606
* Revert r185595-185596, which broke buildbots. (Jakob Stoklund Olesen, 2013-07-04, 1 file, -0/+2)
  Revert "Simplify landing pad lowering." Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes." llvm-svn: 185600
* Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. (Jakob Stoklund Olesen, 2013-07-03, 1 file, -2/+0)
  These exception-related opcodes are not used any longer. llvm-svn: 185596
* Have ARMBaseRegisterInfo::getCallPreservedMask return the 'correct' mask for the GHC calling convention. (Stephen Lin, 2013-07-03, 2 files, -10/+13)
  This is purely academic because GHC calls are always tail calls, so the register mask will never be used; however, this change makes the code clearer and brings the ARM implementation of the GHC calling convention in line with the X86 implementation. Also, it might save someone else some time trying to figure out what is happening... llvm-svn: 185592
* [ARM] Improve the instruction selection of vector loads. (Quentin Colombet, 2013-07-03, 1 file, -0/+94)
  In the ARM back-end, build_vector nodes are lowered to a target-specific build_vector that uses a floating point type. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between the integer unit and the floating point unit that may result in inefficient code. In other words, this conversion may introduce artificial dependencies when the code leading to the build_vector cannot be completed with a floating point type. In particular, this happens when loads are not aligned. Before this patch, in that case, the compiler generated general-purpose loads and created the floating point vector from them, instead of directly using the vector unit. The patch uses a vector-friendly sequence of code when the inserted bitcasts to floating point survive DAGCombine. This is done by a target-specific DAGCombine that changes the target-specific build_vector into a sequence of insert_vector_elt, getting rid of the bitcasts. <rdar://problem/14170854> llvm-svn: 185587
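  A rough source-level analogue of the by-element construction, assuming an ARM target with NEON and using the arm_neon.h intrinsics (this is not the DAGCombine itself):

    #include <arm_neon.h>

    // Build the vector lane by lane so the scalar values never have to bounce
    // through the general-purpose registers on the way to the vector unit.
    float32x4_t buildFromScalars(const float *P) {
      float32x4_t V = vdupq_n_f32(0.0f);  // start from a dummy vector
      V = vsetq_lane_f32(P[0], V, 0);     // insert_vector_elt, lane 0
      V = vsetq_lane_f32(P[1], V, 1);
      V = vsetq_lane_f32(P[2], V, 2);
      V = vsetq_lane_f32(P[3], V, 3);
      return V;
    }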
* ARM: Prevent ARMAsmParser::shouldOmitCCOutOperand() from misidentifying certain Thumb2 add immediate T3 encodings. (Tilmann Scheller, 2013-07-03, 1 file, -9/+5)
  Before the fix, Thumb2 instructions of type "add rD, rN, #imm" (T3 encoding, see ARM ARM A8.8.4) with rD and rN both being low registers (r0-r7) were classified as having the T4 encoding. The T4 encoding doesn't have a cc_out operand, so for the above instructions the operand gets erroneously removed, corrupting the token stream and leading to parse errors later in the process. This bug prevented "add r1, r7, #0xcbcbcbcb" from being assembled correctly. Fixes <rdar://problem/14224440>. llvm-svn: 185575
* Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. (Craig Topper, 2013-07-03, 1 file, -3/+3)
  llvm-svn: 185540
* This corrects the implementation of the Thumb ADR instruction. (Mihai Popa, 2013-07-03, 6 files, -8/+46)
  There are three issues:
  1. it should accept only 4-byte aligned addresses
  2. the maximum offset should be 1020
  3. it should be encoded with the offset scaled by two bits
  llvm-svn: 185528
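  The three constraints can be summarized in a small illustrative check (the helper name is invented, not the MC code):

    #include <cassert>
    #include <cstdint>

    uint32_t encodeThumbADROffset(uint32_t ByteOffset) {
      assert(ByteOffset % 4 == 0 && "Thumb ADR offset must be 4-byte aligned");
      assert(ByteOffset <= 1020 && "Thumb ADR offset must be at most 1020");
      return ByteOffset >> 2;  // stored as offset / 4, i.e. scaled by two bits
    }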
* ARM: relax the atomic release barrier to "dmb ishst" on Swift (Tim Northover, 2013-07-03, 1 file, -1/+11)
  Swift cores implement store barriers that are stronger than the ARM specification but weaker than general barriers. They are, in fact, just about enough to provide the ordering needed for atomic operations with release semantics. This patch makes use of that quirk. llvm-svn: 185527
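  A source-level analogue of the two barriers, assuming GCC/Clang extended inline asm and an ARMv7 target (this is not the backend change itself):

    // Store-store barrier: per the commit, enough for a release store on Swift
    // because its implementation is stronger than the architecture requires.
    static inline void releaseBarrierSwift() {
      asm volatile("dmb ishst" ::: "memory");
    }

    // Full inner-shareable barrier used for release on other cores.
    static inline void releaseBarrierGeneric() {
      asm volatile("dmb ish" ::: "memory");
    }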
* Remove address spaces from MC. (Rafael Espindola, 2013-07-02, 1 file, -11/+10)
  This is dead code since PIC16 was removed in 2010. The result was an odd mix, where some parts would carefully pass it along and others would assert it was zero (most of the object streamer, for example). llvm-svn: 185436
* Fix ARM EHABI compact model 1 and 2 without handlerdata. (Logan Chien, 2013-07-02, 1 file, -3/+13)
  According to ARM EHABI section 9.2, if __aeabi_unwind_cpp_pr1() or __aeabi_unwind_cpp_pr2() is used, then the handler data must be emitted after the unwind opcodes. The handler data consists of several words and should be terminated by zero. In case the .handlerdata directive is not specified by the programmer, we should emit zero to terminate the handler data. llvm-svn: 185422
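  A hedged sketch of the terminator emission (the helper name is invented and the method spelling follows MCStreamer of this era):

    #include "llvm/MC/MCStreamer.h"

    // When no .handlerdata was written for a pr1/pr2 personality, emit a single
    // zero word so the handler data is still properly terminated.
    static void emitEmptyHandlerData(llvm::MCStreamer &Streamer) {
      Streamer.EmitIntValue(0, /*Size=*/4);
    }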
* [ARMAsmParser] Sort the ARM register lists based on the encoding value, not the tablegen enum values. (Chad Rosier, 2013-07-01, 1 file, -15/+23)
  This should be the last fix for the fallout from r185094. llvm-svn: 185379
* Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst") (Tim Northover, 2013-07-01, 1 file, -5/+1)
  It turns out I'd misread the architecture reference manual and thought that was a load/store-store barrier, when it's not. Thanks for pointing it out, Eli! llvm-svn: 185356
* ARM: relax the atomic release barrier to "dmb ishst" (Tim Northover, 2013-07-01, 1 file, -1/+5)
  I believe the full "dmb ish" barrier is not required to guarantee release semantics for atomic operations. The weaker "dmb ishst" prevents previous operations from being reordered with a store executed afterwards, which is enough. A key point to note (fortunately already correct) is that this barrier alone is *insufficient* for sequential consistency, no matter how liberally placed. llvm-svn: 185339
* Remove unused member (David Blaikie, 2013-06-28, 1 file, -4/+0)
  llvm-svn: 185219