path: root/llvm/lib/Target/ARM64/ARM64ISelLowering.cpp
Commit history (most recent first). Each entry shows the commit message, author, date, and the files/lines changed in this file.
* AArch64/ARM64: move ARM64 into AArch64's place (Tim Northover, 2014-05-24; 1 file, -7895/+0)
  This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency. "ARM64" test directories are also moved, and tests that began their life in ARM64 use an arm64 triple, those from AArch64 use an aarch64 triple. Both should be equivalent though. This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now.
  llvm-svn: 209577
* [ARM64] Fix a bug in shuffle vector lowering to generate correct vext ISD with swapped input vectors. (Jiangning Liu, 2014-05-23; 1 file, -15/+14)
  llvm-svn: 209495
* [ARM64] PR19792: Fix cycle in DAG after performPostLD1Combine (Adam Nemet, 2014-05-20; 1 file, -1/+6)
  Povray and dealII currently assert with "Overran sorted position" in AssignTopologicalOrder. The problem is that performPostLD1Combine can introduce cycles. Consider:

    (insert_vector_elt (INSERT_SUBREG undef,
                         (load (add %vreg0, Constant<8>), undef),  <= A
                         TargetConstant<2>),
                       (load %vreg0, undef),                       <= B
                       Constant<1>)

  This is turned into a LD1LANEpost node. However the address in A is not a valid user of the post-incremented address of B in LD1LANEpost.
  llvm-svn: 209242
* SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. (Benjamin Kramer, 2014-05-19; 1 file, -0/+2)
  - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
  - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
  - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
  llvm-svn: 209123
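  For illustration, a minimal IR sketch (my example, not from the commit) of the kind of input this legalization handles: a vector byte-swap whose per-lane byte reversal can be expressed as a byte shuffle, so targets with strong shuffle support avoid scalarizing it.

    declare <4 x i32> @llvm.bswap.v4i32(<4 x i32>)

    define <4 x i32> @bswap_v4i32(<4 x i32> %v) {
      ; Byte-swapping each i32 lane is the same as reversing the bytes within
      ; each 32-bit element, which the legalizer can express as a vector shuffle.
      %r = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %v)
      ret <4 x i32> %r
    }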
* Target: remove old constructors for CallLoweringInfo (Saleem Abdulrasool, 2014-05-17; 1 file, -4/+4)
  This is mostly a mechanical change, converting all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors is intended.
  llvm-svn: 209082
* Target: change member from reference to pointer (Saleem Abdulrasool, 2014-05-17; 1 file, -1/+1)
  This is a preliminary step to help ease the construction of CallLoweringInfo. Changing the construction to a chained function pattern requires that the parameter be nullable. However, rather than copying the vector, save a pointer instead of the reference to permit late binding of the arguments.
  llvm-svn: 209080
* Revert "Implement global merge optimization for global variables."Rafael Espindola2014-05-161-14/+0
| | | | | | | | | | | | This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978
* [ARM64]Implement NEON post-increment LD1(lane) and post-increment LD1R.Hao Liu2014-05-161-1/+94
| | | | llvm-svn: 208955
* Implement global merge optimization for global variables.Jiangning Liu2014-05-151-0/+14
| | | | | | | | | | | This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934
* [ARM64] Support aggressive fastcc/tailcallopt breaking ABI by popping out argument stack from callee. (Jiangning Liu, 2014-05-15; 1 file, -67/+245)
  llvm-svn: 208837
* Rename ComputeMaskedBits to computeKnownBits. (Jay Foad, 2014-05-14; 1 file, -4/+4)
  "Masked" has been inappropriate since it lost its Mask parameter in r154011.
  llvm-svn: 208811
* Folding into CSEL when there is ZEXT between SETCC and ADD (Weiming Zhao, 2014-05-13; 1 file, -3/+11)
  Normally, patterns like (add x, (setcc cc ...)) will be folded into (csel x, x+1, not cc). However, if there is a ZEXT after SETCC, they won't be folded. This patch recognizes the ZEXT and allows the generation of CSINC. This patch fixes bug 19680.
  llvm-svn: 208660
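  As an illustrative sketch (not taken from the commit), IR of the shape the patch targets, where the i1 compare result is zero-extended before being added:

    define i64 @csinc_pattern(i64 %x, i64 %a, i64 %b) {
      ; Computes %x + (%a < %b ? 1 : 0); with this combine the zext no longer
      ; blocks folding the add into a conditional increment (CSINC/CINC).
      %cmp = icmp slt i64 %a, %b
      %ext = zext i1 %cmp to i64
      %r = add i64 %x, %ext
      ret i64 %r
    }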
* Pass the value type to TLI::getRegisterByName (Hal Finkel, 2014-05-11; 1 file, -1/+2)
  We must validate the value type in TLI::getRegisterByName, because if we don't and the wrong type was used with the IR intrinsic, then we'll assert (because we won't be able to find a valid register class with which to construct the requested copy operation). For PPC64, additionally, the type information is necessary to decide between the 64-bit register and the 32-bit subregister. No functionality change.
  llvm-svn: 208508
* ARM64: fix SELECT_CC lowering in absence of NaNs. (Tim Northover, 2014-05-10; 1 file, -8/+9)
  We were swapping the true & false results while testing for FMAX/FMIN, but not putting them back to the original state if the later checks failed. Should fix PR19700.
  llvm-svn: 208469
* AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend. (Hao Liu, 2014-05-08; 1 file, -0/+190)
  llvm-svn: 208284
* AArch64/ARM64: optimise vector selects & enable test (Tim Northover, 2014-05-07; 1 file, -0/+41)
  When performing a scalar comparison that feeds into a vector select, it's actually better to do the comparison on the vector side: the scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP" since the vector comparisons are all mask based.
  llvm-svn: 208210
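  A minimal sketch (my example, not from the commit) of the pattern involved: a scalar compare whose result selects between whole vectors, which can be lowered with a vector compare mask rather than a scalar CMP/CSEL followed by a DUP.

    define <4 x i32> @vsel(i32 %a, i32 %b, <4 x i32> %x, <4 x i32> %y) {
      ; The i1 condition is scalar, but the selected values are vectors, so the
      ; comparison can be duplicated into a lane mask and applied as a bit-select.
      %cmp = icmp sgt i32 %a, %b
      %sel = select i1 %cmp, <4 x i32> %x, <4 x i32> %y
      ret <4 x i32> %sel
    }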
* [ARM64-BE] Fix variable-argument saving. (James Molloy, 2014-05-07; 1 file, -1/+2)
  llvm-svn: 208199
* [ARM64-BE] Make big endian (scalar) argument passing work correctly. (James Molloy, 2014-05-07; 1 file, -6/+38)
  This completes the port of r204814 (cpirker "AArch64_BE function argument passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues found during later development along the way. The biggest of these was that the alignment fixup logic wasn't replicated into all the places it should have been.
  llvm-svn: 208192
* Implementing named register intrinsics (Renato Golin, 2014-05-06; 1 file, -0/+11)
  This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture.
  llvm-svn: 208104
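  For reference, a hedged sketch (mine, not from the commit) of how the named-register read intrinsic is typically exercised in IR to read the stack pointer; the exact metadata spelling differs between LLVM versions.

    declare i64 @llvm.read_register.i64(metadata)

    define i64 @current_sp() {
      ; Reads the named, non-allocatable register "sp" via the intrinsic.
      %sp = call i64 @llvm.read_register.i64(metadata !0)
      ret i64 %sp
    }

    !0 = !{!"sp"}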
* [ARM64] Enable alignment control option in front-end for ARM64. (Kevin Qin, 2014-05-06; 1 file, -4/+15)
  This is the modification in llvm part.
  llvm-svn: 208074
* AArch64/ARM64: support indexed loads/stores on vector types. (Tim Northover, 2014-05-02; 1 file, -0/+8)
  While post-indexed LD1/ST1 instructions do exist for vector loads, this patch makes use of the more flexible addressing-modes in LDR/STR instructions.
  llvm-svn: 207838
* [ARM64] Prevent bit extraction to be adjusted by following shift (Weiming Zhao, 2014-04-30; 1 file, -0/+15)
  For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx more difficult. For example, given

    %shr = lshr i64 %x, 4
    %and = and i64 %shr, 15
    %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, i64 2, i64 %and
    %0 = load i64* %arrayidx

  With current shift folding, it takes 3 instrs to compute base address:

    lsr x8, x0, #1
    and x8, x8, #0x78
    add x8, x9, x8

  If using ubfx, it only needs 2 instrs:

    ubfx x8, x0, #4, #4
    add x8, x9, x8, lsl #3

  This fixes bug 19589.
  llvm-svn: 207702
* AArch64/ARM64: expunge CPSR from the sources (Tim Northover, 2014-04-30; 1 file, -9/+9)
  AArch64 does not have a CPSR register in the same way that AArch32 does. Most of its compiler-relevant roles have been taken over by the more specific NZCV register (representing just the flags set by normal instructions). Its system control functions still remain, but are now under the pseudo-register referred to as "PSTATE". They're accessed via various MRS & MSR instructions described in the reference manual.
  llvm-svn: 207645
* AArch64/ARM64: use HS instead of CS & LO instead of CC. (Tim Northover, 2014-04-30; 1 file, -6/+6)
  On instructions using the NZCV register, a couple of conditions have dual representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and unsigned-lower/carry-clear). The first of these is more descriptive in most circumstances, so we should print it.
  llvm-svn: 207644
* [ARM64] Simplify if condition. (James Molloy, 2014-04-30; 1 file, -6/+2)
  v2f32 and v4f32 were missed out of these conditions, so this is also a bugfix.
  llvm-svn: 207628
* Use makeArrayRef instead of calling ArrayRef<T> constructor directly. I introduced most of these recently. (Craig Topper, 2014-04-30; 1 file, -3/+3)
  llvm-svn: 207616
* [ARM64] Fix a bug about incorrect operand order in an EXT instruction, which is introduced by r207485. (Hao Liu, 2014-04-29; 1 file, -3/+9)
  llvm-svn: 207500
* [ARM64] Fix a bug when lowering shuffle vector to an EXT instruction. (Hao Liu, 2014-04-29; 1 file, -28/+23)
  E.g. Mask like <-1, -1, 1, ...> will generate incorrect EXT index.
  llvm-svn: 207485
* Convert SelectionDAG::getMergeValues to use ArrayRef. (Craig Topper, 2014-04-27; 1 file, -3/+3)
  llvm-svn: 207374
* Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. (Craig Topper, 2014-04-26; 1 file, -19/+16)
  llvm-svn: 207327
* DAGCombiner: Turn divs of vector splats into vectorized multiplications. (Benjamin Kramer, 2014-04-26; 1 file, -0/+5)
  Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed.
  llvm-svn: 207315
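  A small sketch (my own, not from the commit) of the kind of input this combine targets: an unsigned division by a constant splat, which can be rewritten as a multiply-high based sequence rather than being scalarized.

    define <4 x i32> @div_by_7(<4 x i32> %x) {
      ; Division by the constant splat <7,7,7,7> can become a multiply-by-magic-
      ; constant plus shifts, provided the target supports a vector multiply-high.
      %r = udiv <4 x i32> %x, <i32 7, i32 7, i32 7, i32 7>
      ret <4 x i32> %r
    }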
* Revert r206749 till a final decision about the intrinsics is made. (Michael Zolotukhin, 2014-04-26; 1 file, -2/+0)
  llvm-svn: 207313
* [ARM64] Add RUN lines for "-target arm64 -mattr=-fp-armv8" on AArch64 no-fp test. (Kevin Qin, 2014-04-25; 1 file, -3/+3)
  This patch is a supplement of implementing predicate of FP, enabling aarch64 backend no-fp tests on arm64 target for verification. During this, one bug is exposed and fixed by this patch.
  llvm-svn: 207215
* [C++] Use 'nullptr'. Target edition. (Craig Topper, 2014-04-25; 1 file, -7/+7)
  llvm-svn: 207197
* Add 'musttail' marker to call instructions (Reid Kleckner, 2014-04-24; 1 file, -0/+3)
  This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with conservative IR verification rules that ensure that tail call optimization is possible.
  Reviewers: nicholas
  Differential Revision: http://llvm-reviews.chandlerc.com/D3240
  llvm-svn: 207143
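  As a quick illustration (not part of the commit), the marker in IR looks like this; the verifier requires the caller and callee signatures and calling conventions to be compatible, and the call result to be returned immediately.

    declare i32 @callee(i32)

    define i32 @caller(i32 %x) {
      ; 'musttail' guarantees the call is tail-call optimised, unlike the
      ; advisory 'tail' marker.
      %r = musttail call i32 @callee(i32 %x)
      ret i32 %r
    }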
* [ARM64] Enable feature predicates for NEON / FP / CRYPTO. (Kevin Qin, 2014-04-23; 1 file, -122/+142)
  AArch64 has feature predicates for NEON, FP and CRYPTO instructions. This allows the compiler to generate code without using FP, NEON or CRYPTO instructions.
  llvm-svn: 206949
* AArch64/ARM64: make use of ANDS and BICS instructions for comparisons. (Tim Northover, 2014-04-22; 1 file, -11/+20)
  llvm-svn: 206888
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Target/... edition. (Chandler Carruth, 2014-04-22; 1 file, -2/+2)
  llvm-svn: 206842
* ARM64: Combine shifts and uses from different basic block to bit-extract instruction (Yi Jiang, 2014-04-21; 1 file, -0/+2)
  llvm-svn: 206774
* Reapply r206732. This time without optimization of branches. (Michael Zolotukhin, 2014-04-21; 1 file, -0/+2)
  llvm-svn: 206749
* Revert r206732 which is causing llc to crash on most of the build bots. (Chandler Carruth, 2014-04-21; 1 file, -2/+0)
  Original commit message: Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64).
  llvm-svn: 206735
* Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). (Michael Zolotukhin, 2014-04-21; 1 file, -0/+2)
  llvm-svn: 206732
* AArch64/ARM64: add non-scalar lowering for more FCVT operations. (Tim Northover, 2014-04-18; 1 file, -2/+8)
  llvm-svn: 206591
* AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE. (Tim Northover, 2014-04-18; 1 file, -5/+7)
  We couldn't cope if the first mask element was UNDEF before, which isn't ideal.
  llvm-svn: 206588
* AArch64/ARM64: spot a greater variety of concat_vector operations. (Tim Northover, 2014-04-18; 1 file, -14/+72)
  Code mostly copied from AArch64, just tidied up a trifle and plumbed into the ARM64 way of doing things. This also enables the AArch64 tests which inspired the previous untested commits.
  llvm-svn: 206574
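  For context, a hedged example (mine, not from the commit) of a shuffle that is really a vector concatenation, the kind of pattern a concat_vectors-spotting combine can recognise.

    define <8 x i16> @concat_halves(<4 x i16> %a, <4 x i16> %b) {
      ; The identity mask 0..7 over both inputs simply concatenates %a and %b,
      ; so it can be selected as a concat_vectors rather than a generic shuffle.
      %c = shufflevector <4 x i16> %a, <4 x i16> %b,
                         <8 x i32> <i32 0, i32 1, i32 2, i32 3,
                                    i32 4, i32 5, i32 6, i32 7>
      ret <8 x i16> %c
    }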
* ARM64: spot a vector_shuffle that maps to INS and expand. (Tim Northover, 2014-04-18; 1 file, -0/+64)
  Tests will be coming very shortly when all the optimisations needed to support AArch64's neon-copy.ll file are committed.
  llvm-svn: 206572
* AArch64/ARM64: emit all vector FP comparisons as such. (Tim Northover, 2014-04-18; 1 file, -5/+42)
  ARM64 was scalarizing some vector comparisons which don't quite map to AArch64's compare and mask instructions. AArch64's approach of sacrificing a little efficiency to emulate them with the limited set available was better, so I ported it across. More "inspired by" than copy/paste since the backend's internal expectations were a bit different, but the tests were invaluable.
  llvm-svn: 206570
* AArch64/ARM64: port BSL logic from AArch64 & enable test. (Tim Northover, 2014-04-18; 1 file, -0/+52)
  I enhanced it a little in the process. The decision shouldn't really be based on whether a BUILD_VECTOR is a splat: any set of constants will do the job provided they're related in the correct way. Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's best to check for all 4 possibilities rather than assuming it'll be the RHS.
  llvm-svn: 206569
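  A minimal sketch (my example, not from the commit) of the classic bit-select pattern such a combine recognises, where the two AND masks are bitwise complements of each other.

    define <4 x i32> @bitselect(<4 x i32> %a, <4 x i32> %b) {
      ; (%a & M) | (%b & ~M): bits of %a where the mask is set, bits of %b elsewhere.
      ; The masks need not be splats, as long as they are complementary per lane.
      %lo = and <4 x i32> %a, <i32 255, i32 255, i32 255, i32 255>
      %hi = and <4 x i32> %b, <i32 -256, i32 -256, i32 -256, i32 -256>
      %r  = or  <4 x i32> %lo, %hi
      ret <4 x i32> %r
    }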
* AArch64/ARM64: copy byval implementation from AArch64. (Tim Northover, 2014-04-18; 1 file, -16/+41)
  It's not actually used to handle C or C++ ABI rules on ARM64, but could well be emitted by other language front-ends, so it's as well to have a sensible implementation.
  llvm-svn: 206568
* Improve ARM64 vector creation (Louis Gerbarg, 2014-04-17; 1 file, -2/+2)
  This patch improves the performance of vector creation in cases where several of the lanes in the vector are a constant floating point value. It also includes new patterns to fold together some of the instructions when the value is 0.0f. Test cases included.
  rdar://16349427
  llvm-svn: 206496