path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Author, Date; files changed, lines -/+)
...
* Fix typo. (Matthias Braun, 2015-05-07; 1 file, -1/+1)
  llvm-svn: 236785
* Change getTargetNodeName() to produce compiler warnings for missing cases, fix them (Matthias Braun, 2015-05-07; 1 file, -2/+6)
  llvm-svn: 236775
* [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too (Artyom Skrobov, 2015-05-06; 1 file, -25/+100)
  llvm-svn: 236590
* Don't always apply kill flag in thumb2 ABS pseudo expansion. (Pete Cooper, 2015-04-30; 1 file, -1/+2)
  The expansion for t2ABS was always setting the kill flag on the rsb
  instruction. It should instead only be set on rsb if it was set on
  the original ABS instruction.
  rdar://problem/20752113
  llvm-svn: 236272
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" (Sergey Dmitrouk, 2015-04-28; 1 file, -223/+251)
  This adds debug locations to constant nodes of the Selection DAG and
  updates all places that create constants to pass debug locations
  (see PR13269). We can't guarantee that all locations are correct,
  but in a lot of cases the choice is obvious, so most of them should
  be; at least all tests pass.
  Tests for these changes do not cover everything; instead they just
  check it for SDNodes, ARM, and AArch64, where it's easy to get
  incorrect locations on constants.
  This is not a complete fix, as FastISel contains a workaround for
  wrong debug locations that drops locations from instructions when
  processing constants, and there isn't currently a way to use debug
  locations from constants there, as llvm::Constant doesn't cache them
  (yet). That is a somewhat different issue, though, not directly
  related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235989
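  A hedged sketch of what call sites look like after this change; the
  lowering hook below is hypothetical, and signatures approximate this
  era of LLVM:

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Hypothetical lowering hook: the constant now carries the debug
// location of the node being lowered instead of no location at all.
static SDValue lowerSomething(SDValue Op, SelectionDAG &DAG) {
  SDLoc dl(Op);                      // location of the original node
  EVT VT = Op.getValueType();
  // Before this change: DAG.getConstant(0, VT);
  return DAG.getConstant(0, dl, VT); // location flows to the constant
}
```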
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-281-251/+223
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes (Sergey Dmitrouk, 2015-04-28; 1 file, -223/+251)
  (Original commit; message identical to the r235989 reapply above.)
  llvm-svn: 235977
* Cleanup, remove unused return value (Matthias Braun, 2015-04-28; 1 file, -4/+2)
  llvm-svn: 235952
* [ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math (Scott Douglass, 2015-04-08; 1 file, -9/+18)
  Because -menable-no-nans causes fcmp conditions to be rewritten
  without 'o' or 'u', the recognition code needs to cope. Also
  extended it to handle 'le' and 'ge'.
  Differential Revision: http://reviews.llvm.org/D8725
  llvm-svn: 234421
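  To make this concrete, here is an illustrative source pattern (not
  from the commit): with no-NaNs fast-math, a 'ge' compare followed by
  a select is a candidate for vmaxnm instead of a compare plus
  conditional move.

```cpp
// Illustrative only: whether vmaxnm is emitted depends on the target,
// the -ffast-math / no-nans-fp-math flags, and the compiler version.
float select_max(float a, float b) {
  return a >= b ? a : b; // 'ge' compare + select -> vmaxnm candidate
}
```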
* Use movw/movt instead of constant pool loads to lower byval parameter copies (Derek Schuff, 2015-03-26; 1 file, -5/+9)
  Summary: The ARM backend can use a loop to implement copying byval
  parameters before a call. In non-thumb2 mode it uses a constant pool
  load to materialize the trip count. For targets that need movt
  instead (e.g. Native Client), use the same code as in thumb2 mode to
  materialize the trip count.
  Reviewers: jfb, t.p.northover
  Differential Revision: http://reviews.llvm.org/D8442
  llvm-svn: 233324
* Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. (Benjamin Kramer, 2015-03-23; 1 file, -1/+2)
  llvm-svn: 232998
* [ARM] Remove target-specific ITOFP/FPTOI nodes (James Molloy, 2015-03-23; 1 file, -44/+8)
  Anton tried this 5 years ago but it was reverted due to extra VMOVs
  being emitted. This can be easily fixed with a liberal application
  of patterns - matching loads/stores and extractelts.
  llvm-svn: 232958
* [ARM] Align stack objects passed to memory intrinsics (John Brawn, 2015-03-18; 1 file, -0/+15)
  Memcpy and other memory intrinsics typically try to use LDM/STM if
  the source and target addresses are 4-byte aligned. In
  CodeGenPrepare, look for calls to memory intrinsics and, if the
  object is on the stack, 4-byte align it if it's large enough that we
  expect memcpy would want to use LDM/STM to copy it.
  Differential Revision: http://reviews.llvm.org/D7908
  llvm-svn: 232627
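  A minimal sketch of the idea, assuming the modern Align-based IR API
  (the 2015 code differed in detail); the helper name is hypothetical
  and this is not the actual CodeGenPrepare implementation:

```cpp
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/Support/Alignment.h"
using namespace llvm;

// Hypothetical helper: bump stack objects fed to memory intrinsics up
// to 4-byte alignment so the backend can use LDM/STM.
static void alignMemIntrinsicStackArgs(Function &F) {
  for (Instruction &I : instructions(F)) {
    auto *MI = dyn_cast<MemIntrinsic>(&I);
    if (!MI)
      continue;
    if (auto *AI =
            dyn_cast<AllocaInst>(MI->getRawDest()->stripPointerCasts()))
      if (AI->getAlign() < Align(4))
        AI->setAlignment(Align(4));
  }
}
```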
* In preparation for moving ARM's TargetRegisterInfo to the TargetMachine, merge Thumb1RegisterInfo and Thumb2RegisterInfo. (Eric Christopher, 2015-03-12; 1 file, -1/+1)
  This will enable us to match the TargetMachine for our
  TargetRegisterInfo classes.
  llvm-svn: 232117
* Silencing an "enumeral and non-enumeral type in conditional expression" ↵Aaron Ballman2015-03-121-1/+1
| | | | | | warning; NFC. llvm-svn: 232035
* Have getCallPreservedMask and getThisCallPreservedMask take a MachineFunction argument (Eric Christopher, 2015-03-11; 1 file, -3/+3)
  ...so that we can grab subtarget-specific features off of it.
  llvm-svn: 231979
* ARM: simplify and extend byval handling (Tim Northover, 2015-03-11; 1 file, -216/+103)
  The main issue being fixed here is that APCS targets handling a
  "byval align N" parameter with N > 4 were miscounting what objects
  were where on the stack, leading to FrameLowering setting the frame
  pointer incorrectly and clobbering the stack.
  But byval handling had grown over many years, and had multiple
  layers of cruft trying to compensate for each other and calculate
  padding correctly. This only really needs to be done once, in the
  HandleByVal function. Elsewhere should just do what it's told by
  that call.
  I also stripped out unnecessary APCS/AAPCS distinctions (now that
  Clang emits byvals with the correct C ABI alignment), which
  simplified HandleByVal.
  rdar://20095672
  llvm-svn: 231959
* Remove the remaining uses of abs64 and nuke it. (Benjamin Kramer, 2015-03-09; 1 file, -3/+3)
  std::abs works just fine and we're already using it in many places.
  NFC intended.
  llvm-svn: 231696
* Make constant arrays that are passed to functions const. (Benjamin Kramer, 2015-03-07; 1 file, -7/+5)
  In theory this allows the compiler to skip materializing the array
  on the stack. In practice clang often fails to do that, but that's a
  different story. NFC.
  llvm-svn: 231571
* [ARM] Enable vector extload combine for legal types. (Ahmed Bougacha, 2015-03-05; 1 file, -0/+22)
  This commit enables forming vector extloads for ARM. It only does so
  for legal types, and when we can't fold the extension in a wide/long
  form of the user instruction.
  Enabling it for larger types isn't as good an idea on ARM as it is
  on X86, because:
  - we pretend that extloads are legal, but end up generating vld+vmov
  - we have instructions like vld {dN, dM}, which can't be generated
    when we "manually expand" extloads to vld+vmov.
  For legal types, the combine doesn't fire that often: in the
  integration tests, only in a big-endian testcase, where it removes a
  pointless AND.
  Related to rdar://19723053
  Differential Revision: http://reviews.llvm.org/D7423
  llvm-svn: 231396
* Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded. (JF Bastien, 2015-03-04; 1 file, -2/+5)
  Summary: In PNaCl, most atomic instructions have their own
  @llvm.nacl.atomic.* function; each one, with a few exceptions,
  represents consistent behaviour across all NaCl-supported targets.
  Unfortunately, the atomic RMW operations nand, [u]min, and [u]max
  aren't directly represented by any such @llvm.nacl.atomic.*
  function.
  This patch refines shouldExpandAtomicRMWInIR in TargetLowering so
  that a future `Le32TargetLowering` class can selectively inform the
  caller how the target desires the atomic RMW instruction to be
  expanded (i.e. via load-linked/store-conditional for ARM/AArch64,
  via cmpxchg for X86/others?, or not at all for Mips), if at all.
  This does not represent a behavioural change and as such no tests
  were added.
  Patch by: Richard Diamond.
  Reviewers: jfb
  Reviewed By: jfb
  Subscribers: jfb, aemerson, t.p.northover, llvm-commits
  Differential Revision: http://reviews.llvm.org/D7713
  llvm-svn: 231250
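  A hedged sketch of how a target might use the refined hook. The enum
  spelling changed over time (later LLVM calls it AtomicExpansionKind)
  and the function below is a hypothetical stand-in, so treat the
  names as approximate, not the exact 2015 interface:

```cpp
#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Pick an expansion strategy per RMW operation instead of giving a
// single yes/no answer for all of them.
static TargetLowering::AtomicExpansionKind
chooseRMWExpansion(const AtomicRMWInst &AI) {
  switch (AI.getOperation()) {
  case AtomicRMWInst::Nand:
  case AtomicRMWInst::Min:
  case AtomicRMWInst::Max:
  case AtomicRMWInst::UMin:
  case AtomicRMWInst::UMax:
    return TargetLowering::AtomicExpansionKind::CmpXChg; // cmpxchg loop
  default:
    return TargetLowering::AtomicExpansionKind::None; // leave as-is
  }
}
```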
* Remove MCStreamer.h include from MCContext.h and explicitly include it where necessary. NFC (Pete Cooper, 2015-03-04; 1 file, -0/+1)
  llvm-svn: 231193
* getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup; pass that in rather than use a naked call to getSubtargetImpl. (Eric Christopher, 2015-02-26; 1 file, -2/+3)
  This involved passing down and around either a TargetMachine or
  TargetRegisterInfo. Update all callers/definitions around the
  targets and SelectionDAG.
  llvm-svn: 230699
* Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. (Eric Christopher, 2015-02-26; 1 file, -4/+5)
  This required plumbing a TargetRegisterInfo through
  computeRegisterProperties and into findRepresentativeClass, which
  uses it for register class iteration. This required passing a
  subtarget into a few target-specific initializations of
  TargetLowering.
  llvm-svn: 230583
* ARM: treat [N x i32] and [N x i64] as AAPCS composite types (Tim Northover, 2015-02-24; 1 file, -4/+8)
  The logic is almost there already, with our special homogeneous
  aggregate handling. Tweaking it like this allows front-ends to emit
  AAPCS-compliant code without ever having to count registers or add
  discarded padding arguments.
  Only arrays of i32 and i64 are needed to model AAPCS rules, but I
  decided to apply the logic to all integer arrays for more
  consistency.
  llvm-svn: 230348
* Rewrite the global merge pass to be subprogram agnostic for now. (Eric Christopher, 2015-02-23; 1 file, -6/+0)
  It was previously using the subtarget to get values for the global
  offset without actually checking each function as it was generating
  code. Go ahead and solidify the current behavior and make the
  existing FIXMEs more prominent. As a note, the ARM backend
  previously had a thumb1 and non-thumb1 set of defaults. Only the
  former was tested, so I've changed the behavior to only use that for
  now.
  llvm-svn: 230245
* CodeGen: convert CCState interface to using ArrayRefs (Tim Northover, 2015-02-21; 1 file, -9/+6)
  Everyone except R600 was manually passing the length of a static
  array at each callsite, calculated in a variety of interesting ways.
  Far easier to let ArrayRef handle that.
  There should be no functional change, but out-of-tree targets may
  have to tweak their calls as with these examples.
  llvm-svn: 230118
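  A hedged before/after sketch of a typical call-site tweak; the array
  name is borrowed from the ARM backend, the register numbers are
  stand-ins, and the helper is hypothetical:

```cpp
#include "llvm/CodeGen/CallingConvLower.h"
using namespace llvm;

// Stand-in register numbers; the real ARM code uses ARM::R0..ARM::R3.
static const MCPhysReg GPRArgRegs[] = {1, 2, 3, 4};

// Before: callers passed the array length explicitly, e.g.
//   unsigned Reg = State.AllocateReg(GPRArgRegs, 4);
// After: an ArrayRef is formed implicitly from the static array.
static unsigned allocateArgReg(CCState &State) {
  return State.AllocateReg(GPRArgRegs);
}
```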
* [ARM] Re-re-apply VLD1/VST1 base-update combine. (Ahmed Bougacha, 2015-02-19; 1 file, -14/+117)
  This re-applies r223862, r224198, r224203, and r224754, which were
  reverted in r228129 because they exposed Clang misalignment problems
  when self-hosting.
  The combine caused the crashes because we turned ISD::LOAD/STORE
  nodes into ARMISD::VLD1/VST1_UPD nodes. When selecting addressing
  modes, we were very lax for the former, and only emitted the
  alignment operand (as in "[r1:128]") when it was larger than the
  standard alignment of the memory type. However, for ARMISD nodes, we
  just used the MMO alignment, no matter what. In our case, we turned
  ISD nodes into ARMISD nodes, and this caused the alignment operands
  to start being emitted. And that's how we exposed alignment problems
  that were ignored before (but would, I believe, have been caught
  with SCTLR.A==1?).
  To fix this, we can just mirror the hack done for ISD nodes: only
  take into account the MMO alignment when the access is overaligned.
  Original commit message:
  We used to only combine intrinsics, and turn them into
  VLD1_UPD/VST1_UPD when the base pointer is incremented after the
  load/store. We can do the same thing for generic load/stores.
  Note that we can only combine the first load/store+adds pair in a
  sequence (as might be generated for a v16f32 load for instance),
  because other combines turn the base pointer addition chain (each
  computing the address of the next load, from the address of the last
  load) into independent additions (common base pointer + this load's
  offset).
  rdar://19717869, rdar://14062261.
  llvm-svn: 229932
* [ARM] Minor cleanup to CombineBaseUpdate. NFC. (Ahmed Bougacha, 2015-02-19; 1 file, -20/+22)
  In preparation for a future patch:
  - rename isLoad to isLoadOp: the former is confusing, and can be
    taken to refer to the fact that the node is an ISD::LOAD. (it
    isn't, yet.)
  - change formatting here and there.
  - add some comments.
  - const-ify bools.
  llvm-svn: 229929
* [CodeGen] Use ArrayRef instead of std::vector&. NFC. (Ahmed Bougacha, 2015-02-19; 1 file, -1/+1)
  The former lets us use SmallVectors. Do so in ARM and AArch64.
  llvm-svn: 229925
* AArch64: Safely handle the incoming sret call argument. (Andrew Trick, 2015-02-16; 1 file, -3/+8)
  This adds a safe interface to the machine-independent InputArg
  struct for accessing the index of the original (IR-level) argument.
  When a non-native return type is lowered, we generate the hidden
  machine-level sret argument on-the-fly. Before this fix, we were
  representing this argument as OrigArgIndex == 0, which is an
  outright lie. In particular this crashed in the AArch64 backend,
  where we actually try to access the type of the original argument.
  Now we use a sentinel value for machine arguments that have no
  original argument index. AArch64, ARM, Mips, and PPC now check for
  this case before accessing the original argument.
  Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
  llvm-svn: 229413
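  A hedged sketch of the safe-access pattern this describes; the
  accessor names follow ISD::InputArg but treat the exact spelling as
  approximate, and the helper itself is hypothetical:

```cpp
#include "llvm/CodeGen/TargetCallingConv.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Only consult the IR argument list when the machine-level argument
// corresponds to a real IR-level argument.
static Type *getOrigArgType(const Function &F, const ISD::InputArg &In) {
  if (!In.isOrigArg())
    return nullptr; // synthesized (e.g. hidden sret): no IR argument
  return F.getFunctionType()->getParamType(In.getOrigArgIndex());
}
```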
* ARM: Canonicalize access to function attributes, NFC (Duncan P. N. Exon Smith, 2015-02-14; 1 file, -9/+4)
  Canonicalize access to function attributes to use the simpler API.
  getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
    => getFnAttribute(Kind)
  getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
    => hasFnAttribute(Kind)
  llvm-svn: 229220
* MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros (Benjamin Kramer, 2015-02-12; 1 file, -5/+1)
  Update all callers.
  llvm-svn: 228930
* ARM & AArch64: teach LowerVSETCC that output type size may differ from input.Tim Northover2015-02-081-13/+16
| | | | | | | | | | | | | | | | | | While various DAG combines try to guarantee that a vector SETCC operation will have the same output size as input, there's nothing intrinsic to either creation or LegalizeTypes that actually guarantees it, so the function needs to be ready to handle a mismatch. Fortunately this is easy enough, just extend or truncate the naturally compared result. I couldn't reproduce the failure in other backends that I know have SIMD, so it's probably only an issue for these two due to shared heritage. Should fix PR21645. llvm-svn: 228518
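  A minimal sketch of the fix's shape, written as a hypothetical
  helper (the actual change lives inside LowerVSETCC):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Sign-extend or truncate the naturally compared result so it matches
// the type the rest of the DAG expects.
static SDValue adjustSetCCResult(SelectionDAG &DAG, const SDLoc &dl,
                                 SDValue Result, EVT VT) {
  if (Result.getValueType() == VT)
    return Result;
  return DAG.getSExtOrTrunc(Result, dl, VT);
}
```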
* Reverting VLD1/VST1 base-updating/post-incrementing combining (Renato Golin, 2015-02-04; 1 file, -102/+14)
  This reverts patches 223862, 224198, 224203, and 224754, which were
  all related to the vector load/store combining and were
  reverted/reapplied a few times due to the same alignment problems
  we're seeing now.
  Further tests, mainly self-hosting Clang, will be needed to reapply
  this patch in the future.
  llvm-svn: 228129
* Remove getSubtargetImpl from ARMISelLowering and cache the correct subtarget by passing it in during the constructor, as TargetLowering is Subtarget specific. (Eric Christopher, 2015-01-29; 1 file, -31/+19)
  llvm-svn: 227401
* This patch fixes an issue with lowering the pattern below. (Jyoti Allur, 2015-01-23; 1 file, -7/+10)
  _foo:
        smull   r0, r1, r1, r0
        smull   r2, r3, r3, r2
        adds    r0, r2, r0
        adc     r1, r3, r1
        bx      lr
  to:
  _foo:
        smull   r0, r1, r1, r0
        smlal   r0, r1, r3, r2
        bx      lr
  llvm-svn: 226904
* [SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). (Ahmed Bougacha, 2015-01-08; 1 file, -10/+16)
  The *LoadExt* legalization handling used to only have one type, the
  memory type. This forced users to assume that as long as the extload
  for the memory type was declared legal, and the result type was
  legal, the whole extload was legal.
  However, this isn't always the case. For instance, on X86, with AVX,
  this is legal:
      v4i32 load, zext from v4i8
  but this isn't:
      v4i64 load, zext from v4i8
  Whereas v4i64 is (arguably) legal, even without AVX2.
  Note that the same thing was done a while ago for truncstores
  (r46140), but I assume no one needed it yet for extloads, so here we
  go.
  Calls to getLoadExtAction were changed to add the value type, found
  manually in the surrounding code. Calls to setLoadExtAction were
  mechanically changed, by wrapping the call in a loop, to match
  previous behavior. The loop iterates over the MVT subrange
  corresponding to the memory type (FP vectors, etc...). I also pulled
  neighboring setTruncStoreActions into some of the loops; those
  shouldn't make a difference, as the additional types are illegal.
  (e.g., i128->i1 truncstores on PPC.)
  No functional change intended.
  Differential Revision: http://reviews.llvm.org/D6532
  llvm-svn: 225421
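  A hedged sketch of the mechanical rewrite described above, as it
  might appear inside a target's TargetLowering constructor (the
  chosen types are illustrative):

```cpp
// Inside a TargetLowering subclass constructor (excerpt, not a
// standalone program).
//
// Before: one type parameter, the memory type only.
//   setLoadExtAction(ISD::EXTLOAD, MVT::v4i8, Expand);
//
// After: the result type is explicit; old call sites are wrapped in a
// loop over the relevant MVT subrange to preserve behavior.
for (MVT VT : MVT::integer_vector_valuetypes())
  setLoadExtAction(ISD::EXTLOAD, VT, MVT::v4i8, Expand);
```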
* [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. (Ahmed Bougacha, 2015-01-07; 1 file, -17/+12)
  A few loops do trickier things than just iterating on an MVT subset,
  so I'll leave them be for now. Follow-up of r225387.
  llvm-svn: 225392
* ARM: permit tail calls to weak externals on COFF (Saleem Abdulrasool, 2015-01-03; 1 file, -1/+3)
  Weak externals are resolved statically, so we can actually generate
  the tail call on PE/COFF targets without breaking the requirements.
  It is questionable whether we want to propagate the current
  behaviour for MachO, as the requirements are part of the ARM ELF
  specifications, and it seems that prior to SVN r215890 we would have
  tail'ed the call. For now, be conservative and only permit it on
  PE/COFF, where the call will always be fully resolved.
  llvm-svn: 225119
* [ARM] Don't break alignment when combining base updates into load/stores. (Ahmed Bougacha, 2014-12-23; 1 file, -2/+47)
  r223862/r224203 tried to also combine base-updating load/stores.
  There was a mistake there: the alignment was added as-is as an
  operand to the ARMISD::VLD/VST node. However, the VLD/VST selection
  logic doesn't care about less-than-standard alignment attributes.
  For example, no matter the alignment of a v2i64 load (say 1),
  SelectVLD picks VLD1q64 (because of the memory type). But VLD1q64
  ("vld1.64 {dXX, dYY}") is 8-aligned, per ARMARMv7a 3.2.1. For the
  1-aligned load, what we really want is VLD1q8.
  This commit introduces bitcasts if necessary, and changes the
  vld/vst type to one whose standard alignment matches the original
  load/store alignment.
  Differential Revision: http://reviews.llvm.org/D6759
  llvm-svn: 224754
* Fixing -Wsign-compare warnings; NFC. (Aaron Ballman, 2014-12-16; 1 file, -1/+2)
  llvm-svn: 224337
* [ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes (Bradley Smith, 2014-12-16; 1 file, -3/+5)
  This would result in a crash since the vcvt used does not support
  v8i32 types.
  llvm-svn: 224332
* Silence more static analyzer warnings. (Michael Ilseman, 2014-12-15; 1 file, -1/+3)
  Add in definedness checks for shift operators, null checks when
  pointers are assumed by the code to be non-null, and explicit
  unreachables.
  llvm-svn: 224255
* Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores."Ahmed Bougacha2014-12-131-6/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203
* Revert "[ARM] Combine base-updating/post-incrementing vector load/stores."Renato Golin2014-12-131-38/+6
| | | | | | | | | This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198
* [ARM] Combine base-updating/post-incrementing vector load/stores. (Ahmed Bougacha, 2014-12-10; 1 file, -6/+38)
  We used to only combine intrinsics, and turn them into
  VLD1_UPD/VST1_UPD when the base pointer is incremented after the
  load/store. We can do the same thing for generic load/stores.
  Note that we can only combine the first load/store+adds pair in a
  sequence (as might be generated for a v16f32 load for instance),
  because other combines turn the base pointer addition chain (each
  computing the address of the next load, from the address of the last
  load) into independent additions (common base pointer + this load's
  offset).
  Differential Revision: http://reviews.llvm.org/D6585
  llvm-svn: 223862
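  To illustrate the kind of source pattern involved (an illustrative
  example, not taken from the commit): chained loads at incremented
  offsets that a post-incrementing vld1 can cover.

```cpp
#include <arm_neon.h>

// Illustrative only: actual codegen depends on target and compiler
// version. Each load's address is the previous one plus 16 bytes, so
// the backend may fold the increments into "vld1.32 {...}, [rN]!".
void load4(float32x4_t dst[4], const float *src) {
  dst[0] = vld1q_f32(src);
  dst[1] = vld1q_f32(src + 4);
  dst[2] = vld1q_f32(src + 8);
  dst[3] = vld1q_f32(src + 12);
}
```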
* [ARM] Factor out base-updating VLD/VST combiner function. NFC. (Ahmed Bougacha, 2014-12-09; 1 file, -6/+15)
  Move the combiner-state check into another function, add a few small
  comments, and use a more general type in a cast<>.
  In preparation for a future patch.
  llvm-svn: 223834
* [ARM] Move the store combiner function down. NFC. (Ahmed Bougacha, 2014-12-09; 1 file, -141/+143)
  And flip its final condition.
  In preparation for a future patch.
  llvm-svn: 223833
* Both of these subtargets have functions that check whether or not the target is mach-o. Use them. (Eric Christopher, 2014-12-05; 1 file, -1/+1)
  llvm-svn: 223420