path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
...
* [ARMv8] Change hasV8Fp to hasFPARMv8, and other command line options, to be more consistent. (Joey Gouly, 2013-09-13; 1 file, -2/+2)
  llvm-svn: 190692
* [ARMv8] Implement the new DMB/DSB operands. (Joey Gouly, 2013-09-05; 1 file, -2/+2)
  This removes the custom ISD Node: MEMBARRIER and replaces it with an intrinsic. llvm-svn: 190055
* Clean up some usage of Triple. The base class has methods for determining if the target is iOS and Linux. (Cameron Esfahani, 2013-08-29; 1 file, -1/+1)
  llvm-svn: 189604
* ARM: Use "dmb sy" for barriers on M-class CPUs. (Tim Northover, 2013-08-28; 1 file, -1/+4)
  The usual default of "dmb ish" (inner-shareable) isn't even a valid instruction on v6M or v7M (well, it does the same thing, but software is strongly discouraged from using it), so we should emit a full-system barrier there. llvm-svn: 189483
* [ARMv8] Add CodeGen for VMAXNM/VMINNM. (Joey Gouly, 2013-08-23; 1 file, -0/+16)
  llvm-svn: 189103
* [ARMv8] Add CodeGen support for VSEL. (Joey Gouly, 2013-08-22; 1 file, -1/+93)
  This uses the ARMcmov pattern that Tim cleaned up in r188995. Thanks to Simon Tatham for his floating point help! llvm-svn: 189024
* [ARM] Constrain some register classes in EmitAtomicBinary64 so that we pass these tests with -verify-machineinstrs. (Joey Gouly, 2013-08-22; 1 file, -0/+4)
  llvm-svn: 189006
* ARM: implement some simple f64 materializations. (Tim Northover, 2013-08-20; 1 file, -10/+40)
  Previously we used a const-pool load for virtually all 64-bit floating-point values. Actually, we can get quite a few common values (including 0.0 and 1.0) via "vmov" instructions of one stripe or another. llvm-svn: 188773
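  As a small illustration of the payoff (a hedged sketch, not code from the patch; the exact instruction chosen depends on the subtarget), a VFP-encodable constant no longer needs a literal pool:

      // Before: "vldr d0, .LCPI0_0" (constant-pool load).
      // After:  VFP-encodable immediates become "vmov.f64 d0, #1.0".
      double one() { return 1.0; }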
* ARM: implement allowTruncateForTailCall. (Tim Northover, 2013-08-06; 1 file, -0/+15)
  Now that it's in place, it seems silly not to let ARM make use of the extra tail call opportunities. llvm-svn: 187795
* [ARM] check bitwidth in PerformORCombine. (Saleem Abdulrasool, 2013-07-30; 1 file, -14/+21)
  When simplifying a (or (and B A) (and C ~A)) to a (VBSL A B C), ensure that the bitwidths of the second operands to both ands match before comparing the negation of the values. Split the check of the value of the second operands to the ands. Move the cast and variable declaration slightly higher to make it slightly easier to follow. Bug-Id: 16700 Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org> llvm-svn: 187404
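  The shape being matched is the classic bitwise select. A minimal scalar sketch of the identity (function name illustrative, not from the patch):

      #include <cstdint>

      // For each bit: take the bit from b where a is 1, and from c where a
      // is 0. This is the (or (and B A) (and C ~A)) shape that
      // PerformORCombine rewrites to a single VBSL on NEON vectors.
      uint32_t bitselect(uint32_t a, uint32_t b, uint32_t c) {
        return (b & a) | (c & ~a);
      }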
* [ARM][ISel] Improve the lowering of vector loads. (Quentin Colombet, 2013-07-23; 1 file, -1/+3)
  When vectors are built from a single value, the ARM lowering issues a scalar_to_vector node. This node is then always morphed into a move from the general purpose unit to the vector unit. When the value comes from a load, this can be simplified into a vector load to the right lane. This patch changes the lowering of insert_vector_elt to expose a vector-friendly pattern in this situation. This is a step toward fixing <rdar://problem/14170854>. llvm-svn: 186999
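  In source terms, the improved lowering corresponds to what the NEON lane-load intrinsics express directly (a sketch using standard arm_neon.h intrinsics, not code from the patch):

      #include <arm_neon.h>

      // Loading straight into a vector lane ("vld1.32 {d0[0]}, [r0]") avoids
      // a general-purpose load followed by a GPR-to-NEON move.
      float32x4_t load_into_lane(const float *p, float32x4_t v) {
        return vld1q_lane_f32(p, v, 0);
      }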
* ARM: allow printing of ARM atomic DAG nodes. (Tim Northover, 2013-07-16; 1 file, -0/+13)
  We'd forgotten to provide string representations for the special ARMISD atomic nodes; this adds them in. No effect on CodeGen, just makes the output of "-view-whatever-dags" slightly more readable. llvm-svn: 186406
* ARM: implement ldrex, strex and clrex intrinsics. (Tim Northover, 2013-07-16; 1 file, -0/+24)
  Intrinsics already existed for the 64-bit variants, so these support operations of size at most 32 bits. llvm-svn: 186392
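  Clang exposes these as ACLE-style builtins; assuming a toolchain that provides __builtin_arm_ldrex/__builtin_arm_strex, a hand-rolled exclusive-update loop looks roughly like:

      // Sketch of an atomic increment built on the exclusive monitor.
      int atomic_increment(volatile int *addr) {
        int value;
        do {
          value = __builtin_arm_ldrex(addr) + 1;    // ldrex: load, mark exclusive
        } while (__builtin_arm_strex(value, addr)); // strex: returns 0 on success
        return value;
      }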
* ARM EABI divmod support. (Renato Golin, 2013-07-16; 1 file, -2/+78)
  This patch enables calls to __aeabi_idivmod when in EABI mode, by using the remainder value returned in registers (R1), enabled by the ARM triple "none-eabi". Note that Darwin and GNUEABI triples will continue to lower in the GNU style, that is, using the stack for the remainder. SREM/UREM support for 64-bit lowering still needs to be added. llvm-svn: 186390
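  The payoff is that a quotient and remainder computed together can share one runtime call. A sketch (when compiled for an arm-none-eabi target, both operations below should fold into a single __aeabi_idivmod call, quotient in R0 and remainder in R1):

      // One helper call instead of separate divide and modulo calls.
      void divmod(int a, int b, int *quot, int *rem) {
        *quot = a / b;
        *rem = a % b;
      }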
* Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). (Craig Topper, 2013-07-15; 1 file, -1/+1)
  llvm-svn: 186301
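  The helper is a type-safe replacement for the sizeof division. A minimal sketch of the idiom (the template below is illustrative; LLVM's version lives in llvm/ADT/STLExtras.h):

      #include <cstddef>

      // Deduces N from the array reference, so accidentally passing a pointer
      // fails to compile instead of silently computing a wrong length.
      template <typename T, std::size_t N>
      std::size_t array_lengthof(T (&)[N]) { return N; }

      static const int Costs[] = {1, 2, 4, 8};
      static const std::size_t NumCosts = array_lengthof(Costs); // 4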
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. (Craig Topper, 2013-07-14; 1 file, -5/+5)
  llvm-svn: 186274
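  The idiom: interfaces take SmallVectorImpl<T>&, which any SmallVector<T, N> converts to, so callees don't bake in a particular inline size. A generic sketch (not code from this commit):

      #include "llvm/ADT/SmallVector.h"

      // Accepts a SmallVector of any inline size; only the caller picks N.
      static void collectEvens(const llvm::SmallVectorImpl<int> &In,
                               llvm::SmallVectorImpl<int> &Out) {
        for (unsigned i = 0, e = In.size(); i != e; ++i)
          if (In[i] % 2 == 0)
            Out.push_back(In[i]);
      }

      void caller() {
        llvm::SmallVector<int, 8> Values;
        Values.push_back(2);
        Values.push_back(3);
        llvm::SmallVector<int, 4> Evens; // different inline size, same interface
        collectEvens(Values, Evens);
      }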
* ARM: Improve codegen for generic vselect. (Jim Grosbach, 2013-07-08; 1 file, -0/+18)
  Fall back to by-element insert rather than building it up on the stack. rdar://14351991 llvm-svn: 185846
* Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. (Jakob Stoklund Olesen, 2013-07-04; 1 file, -2/+0)
  These exception-related opcodes are not used any longer. llvm-svn: 185625
* Revert r185595-185596 which broke buildbots. (Jakob Stoklund Olesen, 2013-07-04; 1 file, -0/+2)
  Revert "Simplify landing pad lowering." Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes." llvm-svn: 185600
* Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. (Jakob Stoklund Olesen, 2013-07-03; 1 file, -2/+0)
  These exception-related opcodes are not used any longer. llvm-svn: 185596
* [ARM] Improve the instruction selection of vector loads. (Quentin Colombet, 2013-07-03; 1 file, -0/+94)
  In the ARM back-end, build_vector nodes are lowered to a target-specific build_vector that uses a floating-point type. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between the integer unit and the floating-point unit that may result in inefficient code. In other words, this conversion may introduce artificial dependencies when the code leading to the build_vector cannot be completed with a floating-point type. In particular, this happens when loads are not aligned. Before this patch, in that case, the compiler generated general-purpose loads and created the floating-point vector from them, instead of directly using the vector unit. The patch uses a vector-friendly sequence of code when the inserted bitcasts to floating point survive DAGCombine. This is done by a target-specific DAGCombine that changes the target-specific build_vector into a sequence of insert_vector_elt nodes that get rid of the bitcasts. <rdar://problem/14170854> llvm-svn: 185587
* ARM: relax the atomic release barrier to "dmb ishst" on Swift. (Tim Northover, 2013-07-03; 1 file, -1/+11)
  Swift cores implement store barriers that are stronger than the ARM specification but weaker than general barriers. They are, in fact, just about enough to provide the ordering needed for atomic operations with release semantics. This patch makes use of that quirk. llvm-svn: 185527
* Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst"). (Tim Northover, 2013-07-01; 1 file, -5/+1)
  Turns out I'd misread the architecture reference manual and thought that was a load/store-store barrier, when it's not. Thanks for pointing it out, Eli! llvm-svn: 185356
* ARM: relax the atomic release barrier to "dmb ishst". (Tim Northover, 2013-07-01; 1 file, -1/+5)
  I believe the full "dmb ish" barrier is not required to guarantee release semantics for atomic operations. The weaker "dmb ishst" prevents previous operations being reordered with a store executed afterwards, which is enough. A key point to note (fortunately already correct) is that this barrier alone is *insufficient* for sequential consistency, no matter how liberally placed. llvm-svn: 185339
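  For reference, the C++11 construct whose lowering is at stake (a sketch; which barrier is emitted before the store is exactly what this commit, and the revert above it, disagree about):

      #include <atomic>

      std::atomic<int> ready;

      // A release store on ARM lowers to a barrier followed by a plain str:
      //   dmb ish    ; this commit tried "dmb ishst" here (later reverted)
      //   str ...
      void publish() {
        ready.store(1, std::memory_order_release);
      }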
* ARM: ensure fixed-point conversions have sane types. (Tim Northover, 2013-06-28; 1 file, -5/+36)
  We were generating intrinsics for NEON fixed-point conversions that didn't exist (e.g. float -> i16). There are two cases to consider:
  + iN is smaller than float. In this case we can do the conversion but need an extend or truncate as well.
  + iN is larger than float. In this case using the NEON conversion would be incorrect, so we don't perform any combining.
  llvm-svn: 185158
* ARM: Proactively ensure that the LowerCallResult hack for 'this'-returns is not used for incompatible calling conventions. (Stephen Lin, 2013-06-26; 1 file, -3/+10)
  Currently, ARM 'this'-returns are handled in the standard calling convention case by treating R0 as preserved and doing some extra magic in LowerCallResult; this may not apply to calling conventions added in the future, so this patch provides and documents an interface for indicating such. llvm-svn: 185024
* The getRegForInlineAsmConstraint function should only accept MVT value types. (Chad Rosier, 2013-06-22; 1 file, -1/+1)
  llvm-svn: 184642
* [ARMTargetLowering] ARMISD::{SUB,ADD}{C,E} second result is a boolean, implying that the upper bits are always 0. (Michael Gottesman, 2013-06-18; 1 file, -1/+11)
  llvm-svn: 184231
* Converted an overly aggressive assert to a conditional check in AddCombineTo64bitMLAL. (Michael Gottesman, 2013-06-18; 1 file, -2/+5)
  Said assert assumed that ADDC will always have a glue node as its second argument, and it was checked before we even knew that we were actually performing the relevant MLAL optimization. This is incorrect, since on ARM we *can* codegen ADDC with a use-list-based second argument. Thus, to keep both effects, I converted the assert to a conditional check; if it fails, we do not perform the optimization. In terms of tests, I cannot produce an ADDC from the IR level until my forthcoming multiprecision optimization patch lands; the tests for that patch would cause this assert to fail, so they will provide the relevant coverage. llvm-svn: 184230
* Order CALLSEQ_START and CALLSEQ_END nodes. (Andrew Trick, 2013-05-29; 1 file, -2/+3)
  Fixes PR16146: gdb.base__call-ar-st.exp fails after pre-RA-sched=source fixes. Patch by Xiaoyi Guo! This also fixes an unsupported dbg.value test case. Codegen was previously incorrect but the test was passing by luck. llvm-svn: 182885
* Track IR ordering of SelectionDAG nodes 2/4. (Andrew Trick, 2013-05-25; 1 file, -115/+115)
  Change SelectionDAG::getXXXNode() interfaces as well as call sites of these functions to pass in SDLoc instead of DebugLoc. llvm-svn: 182703
* Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. (Michael J. Spencer, 2013-05-24; 1 file, -7/+7)
  llvm-svn: 182680
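  The replacements are the templated helpers in llvm/Support/MathExtras.h. Behaviorally they match a portable count-leading-zeros (a sketch of the semantics, not LLVM's implementation):

      #include <cstdint>

      // Leading zero bits of a 32-bit value; an input of 0 yields 32, matching
      // countLeadingZeros' default behavior for zero.
      unsigned countLeadingZeros32(uint32_t x) {
        if (x == 0) return 32;
        unsigned n = 0;
        while ((x & 0x80000000u) == 0) { // shift until the top bit is set
          ++n;
          x <<= 1;
        }
        return n;
      }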
* ARM: implement @llvm.readcyclecounter intrinsic. (Tim Northover, 2013-05-23; 1 file, -1/+43)
  This implements the @llvm.readcyclecounter intrinsic as the specific MRC instruction specified in the ARM manuals for CPUs with the Power Management extensions. Older CPUs had slightly different methods which may also have to be implemented eventually, but this should cover all v7 cases. rdar://problem/13939186 llvm-svn: 182603
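  From C or C++ the intrinsic is reachable through clang's __builtin_readcyclecounter (a sketch; on a v7 core with the PMU cycle counter enabled this becomes the MRC read of PMCCNTR):

      // Lowers to @llvm.readcyclecounter, which this commit selects to an MRC
      // from the Performance Monitors cycle-count register on ARMv7.
      unsigned long long cycles() {
        return __builtin_readcyclecounter();
      }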
* PR15868 fix. (Stepan Dyatkovskiy, 2013-05-20; 1 file, -6/+43)
  Introduction: when stack alignment is 8 and the GPR parameter part size is not N*8, we add padding to the GPR part, so the part's last byte must be recovered at address K*8-1. We need to do this since the remaining (stack) part of the parameter starts at address K*8, and we need to "attach" the "GPR head" to it without gaps:
  Stack:
  |---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
  [ [padding] [GPRs head] ] [ ------ Tail passed via stack ------ ...
  Fix: note that once we have added padding, we need to correct *all* argument offsets that come after the padded one. That's why we need this fix: argument offsets were never corrected before this patch. See the new test cases included in the patch. We also don't need to insert padding for byval parameters that are stored in GPRs only; we need to pad only the last byval parameter, and only when it extends outside the GPRs and stack alignment is 8. The stack area allocated for recovered byval params must still satisfy the "Size mod 8 = 0" restriction. This patch reduces stack usage in some cases: we can reduce ArgRegsSaveArea, since inner N*4-byte-sized byval params may be "packed" with alignment 4 in some cases. llvm-svn: 182237
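  A hypothetical call shape that exercises the split (illustrative, not taken from the patch's tests):

      // A 20-byte struct passed byval under AAPCS: with r0 taken by 'x', the
      // head of 's' occupies r1-r3 and the remaining 8 bytes go on the stack;
      // the padding logic fixed here keeps that stack tail properly aligned.
      struct S { int a, b, c, d, e; };
      int take(int x, struct S s);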
* Replace some bit operations with simpler ones. No functionality change. (Benjamin Kramer, 2013-05-19; 1 file, -9/+7)
  llvm-svn: 182226
* Add LLVMContext argument to getSetCCResultType. (Matt Arsenault, 2013-05-18; 1 file, -1/+1)
  llvm-svn: 182180
* ARM ISel: Don't create illegal types during LowerMUL. (Arnold Schwaighofer, 2013-05-14; 1 file, -25/+32)
  The transformation happening here is that we want to turn a "mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreate an extension to a smaller type. In the case of an extload of a memory type smaller than 64 bits, we used to create an ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 llvm-svn: 181842
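  The target pattern, expressed with standard NEON intrinsics (a sketch of what the combine aims for, not code from the patch):

      #include <arm_neon.h>

      // mul(sext(x), sext(y)) on 16-bit lanes selects to a single vmull.s16,
      // which widens and multiplies in one instruction.
      int32x4_t widening_mul(int16x4_t x, int16x4_t y) {
        return vmull_s16(x, y);
      }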
* Correctly preserve the input chain for potential tailcall nodes whose return values are bitcasts. (Lang Hames, 2013-05-13; 1 file, -1/+1)
  The chain had previously been clobbered with the entry node to the DAG, which sometimes caused other code in the function to be erroneously deleted when tailcall optimization kicked in. <rdar://problem/13827621> llvm-svn: 181696
* For r181148: fixed warning 'enumeral and non-enumeral type in conditional expression'. (Stepan Dyatkovskiy, 2013-05-08; 1 file, -1/+1)
  llvm-svn: 181437
* For ARM backend, fixed "byval" attribute support. (Stepan Dyatkovskiy, 2013-05-05; 1 file, -33/+102)
  Now even small structures can be passed byval (small enough to be stored in GPRs). The regression tests check the following function prototypes (PR15293):
  %artz = type { i32 }
  define void @foo(%artz* byval %s)
  define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
  foo: "s" is stored in R0. foo2: "s" is stored in R0, "s2" in R2.
  The following AAPCS rules are checked (5.5 Parameters Passing, C.4 and C.5; "ParamSize" is the parameter size in 32-bit words):
  -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4: the parameter is sent to the stack; NCRN := R4.
  -- NSAA != 0, NCRN < R4 and NCRN+ParamSize < R4: the parameter is stored in GPRs; NCRN += ParamSize.
  llvm-svn: 181148
* Refactoring patch. (Stepan Dyatkovskiy, 2013-04-30; 1 file, -41/+70)
  1. VarArgStyleRegisters: the functionality that emits "store" instructions for byval regs was moved out into a separate method, "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases: it was used both for variadic functions and for regular functions with byval parameters. In the latter case it created a new stack frame and registered it as a VarArg frame, which is wrong. This patch replaces the VarArgStyleRegisters usage for byval parameters with the StoreByValRegs method.
  2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize", for the same reason: sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. This property actually means the size of the registers that hold arguments, which is why it was renamed.
  3. In ARMISelLowering.cpp, in the ARMTargetLowering methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX, for the same reasons.
  llvm-svn: 180774
* Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change). (Stephen Lin, 2013-04-23; 1 file, -3/+6)
  llvm-svn: 180138
* Lowercase "is" boolean variable prefix for consistency within function, no functionality change. (Stephen Lin, 2013-04-23; 1 file, -12/+12)
  llvm-svn: 180136
* Fix for 5.5 Parameter Passing --> Stage C: (Stepan Dyatkovskiy, 2013-04-22; 1 file, -0/+1)
  -- C.4 and C.5 statements, when NSAA is not equal to SP.
  -- C.1.cp statement for VA functions. Note: there are no VFP CPRCs in a variadic procedure.
  Before this patch "NSAA != 0" meant "don't use GPRs anymore", but there are some exceptions in AAPCS:
  1. For a non-VA function: allocate all VFP regs for CPRCs. When all VFPs are allocated, CPRCs are sent to the stack, while non-CPRCs may still be allocated in GPRs.
  2. Check that for VA functions all params use GPRs and then the stack. No exceptions, no CPRCs here.
  llvm-svn: 180011
* Remove unused ShouldFoldAtomicFences flag. (Tim Northover, 2013-04-20; 1 file, -2/+0)
  I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface. llvm-svn: 179940
* Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE. (Tim Northover, 2013-04-20; 1 file, -32/+0)
  llvm-svn: 179939
* Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned'. (Stephen Lin, 2013-04-20; 1 file, -5/+28)
  This is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). llvm-svn: 179925
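  The classic beneficiary is a helper that always hands back its first argument (illustrative C++; the 'returned' attribute itself is an IR-level annotation on the parameter):

      // The return value is always the first parameter; marking it 'returned'
      // in IR lets callers that reuse the result tail-call through it.
      char *append_byte(char *buf, char c) {
        *buf = c;
        return buf;
      }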
* Test commit. (Stephen Lin, 2013-04-20; 1 file, -1/+1)
  llvm-svn: 179913
* Remove the old CodePlacementOpt pass. (Benjamin Kramer, 2013-03-29; 1 file, -2/+0)
  It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. llvm-svn: 178349
* Improve long vector sext/zext lowering on ARM. (Renato Golin, 2013-03-19; 1 file, -0/+55)
  The ARM backend currently has poor codegen for long sext/zext operations, such as v8i8 -> v8i32. This patch addresses this by performing a custom expansion in ARMISelLowering. It also adds/changes the cost of such lowering in ARMTTI. This partially addresses PR14867. Patch by Pete Couperus. llvm-svn: 177380
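  The expansion mirrors what NEON's widening moves express in source form (a sketch with standard arm_neon.h intrinsics; the patch itself operates on the DAG):

      #include <arm_neon.h>

      // A v8i8 -> v8i32 sext is a chain of vmovl widenings rather than eight
      // scalar extends; this produces the low four lanes.
      int32x4_t sext_low_half(int8x8_t v) {
        int16x8_t wide16 = vmovl_s8(v);         // v8i8  -> v8i16
        return vmovl_s16(vget_low_s16(wide16)); // v4i16 -> v4i32 (lanes 0-3)
      }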