summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [SelectionDAG] Allow targets to specify legality of extloads' resultAhmed Bougacha2015-01-081-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | type (in addition to the memory type). The *LoadExt* legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421
* [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended.Ahmed Bougacha2015-01-071-20/+15
| | | | | | | | A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387. llvm-svn: 225392
* ARM: permit tail calls to weak externals on COFFSaleem Abdulrasool2015-01-031-1/+3
| | | | | | | | | | | Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119
* [AArch64] MachO large code-model: Materialize FP constants in code.Juergen Ributzka2014-12-101-0/+7
| | | | | | | | | | | | | | | In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry alltogether this commit changes the way how FP constants are materialized in the large code model. The constats are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover Fixes rdar://problem/16572564. llvm-svn: 223941
* AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes.Tim Northover2014-12-061-10/+12
| | | | | | | | | All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in their choice. No functional change. llvm-svn: 223551
* [AArch64] Combining Load and IntToFp should check for neon availabilityWeiming Zhao2014-12-041-3/+4
| | | | llvm-svn: 223382
* AArch64: fix wrong-endian parameter passing.Tim Northover2014-12-031-2/+4
| | | | | | | The blocked arguments code didn't take account of the hacks needed to support it. llvm-svn: 223247
* [AArch64] Don't combine "select (setcc i1 LHS, RHS), vL, vR".Ahmed Bougacha2014-12-011-0/+6
| | | | | | | | | | | | | r208210 introduced an optimization that improves the vector select codegen by doing the setcc on vectors directly. This is a problem they the setcc operands are i1s, because the optimization would create vectors of i1, which aren't legal. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6308 llvm-svn: 223075
* [AArch64] Fix v2i8->i16 bitcast legalization.Ahmed Bougacha2014-12-011-5/+4
| | | | | | | | | | | | | | | r213378 improved f16 bitcasts, so that they go directly through subregs, instead of through the stack. That code now causes an assertion failure for bitcasts from other 16-bits types (most importantly v2i8). Correct that by doing the custom lowering for i16 bitcasts only when the input is an f16. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6307 llvm-svn: 223074
* AArch64: treat [N x Ty] as a block during procedure calls.Tim Northover2014-11-271-0/+6
| | | | | | | | | | | | | | The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903
* DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same ↵Hao Liu2014-11-211-0/+6
| | | | | | | | | | | | divisor info FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510
* Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/Reid Kleckner2014-11-201-1/+1
| | | | | | Follow up to r221940, where I must not have caught em all. NFC llvm-svn: 222481
* [Aarch64] Customer lowering of CTPOP to SIMD should check for NEON availabilityWeiming Zhao2014-11-191-0/+3
| | | | llvm-svn: 222292
* We can get the TLOF from the TargetMachine - so constructor no longer ↵Aditya Nandakumar2014-11-131-1/+1
| | | | | | requires TargetLoweringObjectFile to be passed. llvm-svn: 221926
* This patch changes the ownership of TLOF from TargetLoweringBase to ↵Aditya Nandakumar2014-11-131-10/+1
| | | | | | TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878
* [AArch64] Fix miscompile of comparison with 0xffffffffffffffffOliver Stannard2014-11-031-4/+4
| | | | | | | Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled. llvm-svn: 221157
* [AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.James Molloy2014-10-171-9/+9
| | | | | | | | | | We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same. While we're at it, avoid writing to std::vector::end()... Bug found with random testing and a lot of coffee. llvm-svn: 220051
* [AArch64] Fix miscompile of sdiv-by-power-of-2.Juergen Ributzka2014-10-161-1/+1
| | | | | | | | | | | When the constant divisor was larger than 32bits, then the optimized code generated for the AArch64 backend would emit the wrong code, because the shift was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would loose the upper 32bits. This fixes rdar://problem/18678801. llvm-svn: 219934
* [AArch64] Generate vector signed/unsigned mul and mla/mls long.Chad Rosier2014-10-081-0/+200
| | | | | | | Phabricator Revision: http://reviews.llvm.org/D5589 Patch by Balaram Makam <bmakam@codeaurora.org>!! llvm-svn: 219276
* Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and ↵Benjamin Kramer2014-10-041-1/+1
| | | | | | weirdness exposed by it. llvm-svn: 219068
* constify TargetMachine parameter.Eric Christopher2014-10-031-1/+1
| | | | llvm-svn: 218934
* Add missing natual vector cast.Asiri Rathnayake2014-10-011-0/+1
| | | | | | | | | Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST was introduced in r217159 and r217138. This patch adds a missing cast from v2f32 to v1i64 which is causing some compilation failures. Also added test cases to cover various modimm types and BUILD_VECTORs with i64 elements. llvm-svn: 218751
* [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPassRobin Morisset2014-09-171-0/+4
| | | | | | | | | | | | This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928
* [AArch 64] Use a constant pool load for weak symbol references whenAsiri Rathnayake2014-09-101-1/+21
| | | | | | | | | | | | | | using static relocation model and small code model. Summary: currently we generate GOT based relocations for weak symbol references regardless of the underlying relocation model. This should be change so that in static relocation model we use a constant pool load instead. Patch from: Keith Walker Reviewers: Renato Golin, Tim Northover llvm-svn: 217503
* AArch64: fix vector-immediate BIC/ORR on big-endian devices.Tim Northover2014-09-041-12/+12
| | | | | | | | | | Follow up to r217138, extending the logic to other NEON-immediate instructions. As before, the instruction already performs the correct operation and we're just using a different type for convenience, so we want a true nop-cast. Patch by Asiri Rathnayake. llvm-svn: 217159
* AArch64: fix big-endian immediate materialisationTim Northover2014-09-041-21/+21
| | | | | | | | | | | | We were materialising big-endian constants using DAG nodes with types different from what was requested, followed by a bitcast. This is fine on little-endian machines where bitcasting is a nop, but we need a slightly different representation for big-endian. This adds a new set of NVCAST (natural-vector cast) operations which are always nops. Patch by Asiri Rathnayake. llvm-svn: 217138
* Refactor AtomicExpandPass and add a generic isAtomic() method to InstructionRobin Morisset2014-09-031-13/+22
| | | | | | | | | | | | | | | | | | | | | Summary: Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs. Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving the X86 backend to this pass. This requires a way of easily detecting which instructions are atomic. I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making isAtomic() a method of Instruction implemented by a switch on the opcodes. Test Plan: make check Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5035 llvm-svn: 217080
* Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly ↵Aaron Ballman2014-09-021-1/+1
| | | | | | converted to 64 bits (was 64-bit shift intended?)). NFC. llvm-svn: 216902
* AArch64: Silence -Wabsolute-value warning with std::absReid Kleckner2014-08-291-1/+2
| | | | llvm-svn: 216794
* Fix typos in comments, NFCRobin Morisset2014-08-291-2/+1
| | | | | | | | | | | | | | Summary: Just fixing comments, no functional change. Test Plan: N/A Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5130 llvm-svn: 216784
* Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit ↵Louis Gerbarg2014-08-291-0/+263
| | | | | | | | | | | | | | | | | | | | | | | | | | values This patch checks for DAG patterns that are an add or a sub followed by a compare on 16 and 8 bit inputs. Since AArch64 does not support those types natively they are legalized into 32 bit values, which means that mask operations are inserted into the DAG to emulate overflow behaviour. In many cases those masks do not change the result of the processing and just introduce a dependent operation, often in the middle of a hot loop. This patch detects the relevent DAG patterns and then tests to see if the transforms are equivalent with and without the mask, removing the mask if possible. The exact mechanism of this patch was discusses in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html There is a reasonably good chance there are missed oppurtunities due to similiar (but not identical) DAG patterns that could be funneled into this test, adding them should be simple if we see test cases. Tests included. rdar://13754426 llvm-svn: 216776
* AArch64: only try to get operand of a known node.Tim Northover2014-08-291-5/+5
| | | | | | | A bug in r216725 meant we tried to discover the type of a SETCC before confirming the node actually was a SETCC. llvm-svn: 216734
* AArch64: skip select/setcc combine in complex case.Tim Northover2014-08-291-8/+10
| | | | | | | | | | | In an llvm-stress generated test, we were trying to create a v0iN type and asserting when that failed. This case could probably be handled by the function, but not without added complexity and the situation it arises in is sufficiently odd that there's probably no benefit anyway. Should fix PR20775. llvm-svn: 216725
* [AArch64] Fix some failures exposed by value type v4f16 and v8f16.Jiangning Liu2014-08-291-2/+2
| | | | | | | 1) Add some missing bitcast patterns for v8f16. 2) Add type promotion for operand of ld/st operations. llvm-svn: 216706
* AArch64: More correctly constrain target vector extend lowering.Jim Grosbach2014-08-281-3/+3
| | | | | | | | | | | | | | | The AArch64 target lowering for [zs]ext of vectors is set up to handle input simple types and expects the generic SDag path to do something reasonable with anything that's not a simple type. The code, however, was only checking that the result type was a simple type and assuming that implied that the source type would also be a simple type. That's not a valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>" demonstrate. The fix is to simply explicitly validate the source type as well as the result type. PR20791 llvm-svn: 216689
* Generate CMN when comparing a short int with minusDavid Xu2014-08-281-3/+41
| | | | llvm-svn: 216651
* Teach the AArch64 backend about v4f16 and v8f16Oliver Stannard2014-08-271-8/+96
| | | | | | | | This teaches the AArch64 backend to deal with the operations required to deal with the operations on v4f16 and v8f16 which are exposed by NEON intrinsics, plus the add, sub, mul and div operations. llvm-svn: 216555
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or ↵Craig Topper2014-08-271-2/+2
| | | | | | just letting them be implicitly created. llvm-svn: 216525
* Hide two different AlignMode enums in anonymous namespaces. This bug is ↵Alexey Samsonov2014-08-191-0/+2
| | | | | | reported by UBSan. llvm-svn: 216001
* Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backendsRobin Morisset2014-08-181-4/+2
| | | | | | | | | | | | | | | | | Summary: Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends These helper functions are introduced in D4844. Depends D4844 Test Plan: make check-all passes Reviewers: jfb Subscribers: aemerson, llvm-commits, mcrosier, reames Differential Revision: http://reviews.llvm.org/D4937 llvm-svn: 215902
* Teach the AArch64 backend to handle f16Oliver Stannard2014-08-181-0/+9
| | | | | | | This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. llvm-svn: 215891
* [ARM,AArch64] Do not tail-call to an externally-defined function with weak ↵Oliver Stannard2014-08-181-0/+13
| | | | | | | | | | | | | linkage Externally-defined functions with weak linkage should not be tail-called on ARM or AArch64, as the AAELF spec requires normal calls to undefined weak functions to be replaced with a NOP or jump to the next instruction. The behaviour of branch instructions in this situation (as used for tail calls) is implementation-defined, so we cannot rely on the linker replacing the tail call with a return. llvm-svn: 215890
* [AArch64] Narrow arguments passed in wrong position on the stack inAmara Emerson2014-08-151-2/+2
| | | | | | | | | | big-endian mode. Patch by Asiri Rathnayake. Differential Revision: http://reviews.llvm.org/D4922 llvm-svn: 215716
* AArch64: Tidy up a few comments.Jim Grosbach2014-08-111-2/+2
| | | | | | Have the comments match the actual parameter names. Found via clang-tidy. llvm-svn: 215401
* Remove the target machine from CCState. Previously it was only usedEric Christopher2014-08-061-17/+17
| | | | | | | | | to get the subtarget and that's accessible from the MachineFunction now. This helps clear the way for smaller changes where we getting a subtarget will require passing in a MachineFunction/Function as well. llvm-svn: 214988
* [AArch64] Conditional selects are expensive on out-of-order cores.James Molloy2014-08-061-0/+4
| | | | | | | | Specifically Cortex-A57. This probably applies to Cyclone too but I haven't enabled it for that as I can't test it. This gives ~4% improvement on SPEC 174.vpr, and ~1% in 471.omnetpp. llvm-svn: 214957
* AArch64: Add support for instruction prefetch intrinsicYi Kong2014-08-051-2/+2
| | | | | | | | | Instruction prefetch is not implemented for AArch64, it is incorrectly translated into data prefetch instruction. Differential Revision: http://reviews.llvm.org/D4777 llvm-svn: 214860
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-4/+8
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* [AArch64] Generate tbz/tbnz when comparing against zero.Chad Rosier2014-08-011-10/+16
| | | | | | | | | | | | | | | | | The tbz/tbnz checks the sign bit to convert op w1, w1, w10 cmp w1, #0 b.lt .LBB0_0 to op w1, w1, w10 tbnz w1, #31, .LBB0_0 Differential Revision: http://reviews.llvm.org/D4440 llvm-svn: 214518
* Make sure no loads resulting from load->switch DAGCombine are marked invariantLouis Gerbarg2014-07-311-1/+1
| | | | | | | | | | | | | | Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reoarder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449
OpenPOWER on IntegriCloud