path: root/llvm/lib/Target/ARM64
* TableGen: use PrintMethods to print more aliases
  Tim Northover, 2014-05-12 (5 files, -37/+62)
  llvm-svn: 208607

* AArch64/ARM64: use InstAliases for NEON logical (imm) instructions.
  Tim Northover, 2014-05-12 (2 files, -72/+67)
  llvm-svn: 208606

* AArch64/ARM64: implement "mov $Rd, $Imm" aliases in TableGen.
  Tim Northover, 2014-05-12 (3 files, -58/+94)
  This is a slightly different approach to AArch64 (the base instruction
  definitions aren't quite right for that to work), but achieves the same
  thing and reduces C++ hackery in AsmParser.
  llvm-svn: 208605

* ARM64: remove dead validation code from the AsmParser.
  Tim Northover, 2014-05-12 (1 file, -198/+0)
  If this code triggers, any immediate has already been validated so it
  can't possibly trigger a diagnostic.
  llvm-svn: 208564

* ARM64: merge "extend" and "shift" addressing-mode enums.
  Tim Northover, 2014-05-12 (6 files, -330/+241)
  In terms of assembly, these have too much overlap to be neatly modelled as
  disjoint classes: in many cases "lsl" is an acceptable alternative to
  either "uxtw" or "uxtx".
  llvm-svn: 208563

* [ARM64] Add proper bounds checking/diagnostics to logical shifts
  Bradley Smith, 2014-05-12 (3 files, -22/+43)
  llvm-svn: 208540

* [ARM64] Add diagnostics for bitfield extract/insert instructions
  Bradley Smith, 2014-05-12 (1 file, -19/+54)
  Unfortunately, since ARM64 models all these instructions as aliases, the
  checks need to be done at the time the alias is seen rather than during
  instruction validation, as AArch64 does.
  llvm-svn: 208529

* [ARM64] Correct more bounds checks/diagnostics for arithmetic shift operands
  Bradley Smith, 2014-05-12 (2 files, -10/+18)
  llvm-svn: 208528

* [ARM64] Move register/register MOV handling into tablegen and improve
  diagnostics
  Bradley Smith, 2014-05-12 (4 files, -54/+26)
  llvm-svn: 208527

* Pass the value type to TLI::getRegisterByName
  Hal Finkel, 2014-05-11 (2 files, -2/+3)
  We must validate the value type in TLI::getRegisterByName, because if we
  don't and the wrong type was used with the IR intrinsic, then we'll assert
  (because we won't be able to find a valid register class with which to
  construct the requested copy operation). For PPC64, additionally, the type
  information is necessary to decide between the 64-bit register and the
  32-bit subregister.
  No functionality change.
  llvm-svn: 208508

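  A minimal sketch of the shape such a type-validating override takes (a
  hypothetical ARM64-flavoured example; the exact signature, register, and
  error handling here are assumptions, not the actual patch):

    // Validate the requested type up front instead of asserting later,
    // when no register class can be found for the copy.
    unsigned ARM64TargetLowering::getRegisterByName(const char *RegName,
                                                    EVT VT) const {
      if (StringRef(RegName) == "sp" && VT == MVT::i64)
        return ARM64::SP;  // only 64-bit reads of the stack pointer fit
      report_fatal_error("Invalid register name global variable");
    }
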
* Add 'override' to getRegisterByName in *ISelLowering.h
  Hal Finkel, 2014-05-11 (1 file, -1/+1)
  No functionality change intended.
  llvm-svn: 208507

* ARM64: fix SELECT_CC lowering in absence of NaNs.
  Tim Northover, 2014-05-10 (1 file, -8/+9)
  We were swapping the true & false results while testing for FMAX/FMIN, but
  not putting them back to the original state if the later checks failed.
  Should fix PR19700.
  llvm-svn: 208469

* Fix broken build
  Jonathan Roelofs, 2014-05-09 (1 file, -1/+1)
  ARM64 backend was missing a required_library entry.
  llvm-svn: 208437

* [ARM64] Add diagnostics for expected arithmetic shifts
  Bradley Smith, 2014-05-08 (3 files, -14/+32)
  llvm-svn: 208330

* [ARM64] Re-work parsing of ADD/SUB shifted immediate operands
  Bradley Smith, 2014-05-08 (3 files, -130/+210)
  The parsing of ADD/SUB shifted immediates needs to be done explicitly so
  that better diagnostics can be emitted; as a side effect, this also removes
  some of the hacks in the current method of handling this operand type.
  Additionally, remove the manual CMP aliasing to ADD/SUB and use InstAlias
  instead.
  llvm-svn: 208329

* [ARM64] Ensure immediates in extend operands are in a valid range
  Bradley Smith, 2014-05-08 (2 files, -4/+19)
  Also emit a more useful diagnostic when they are not.
  llvm-svn: 208318

* [ARM64] Check for proper immediate in shift/extend operands
  Bradley Smith, 2014-05-08 (1 file, -42/+58)
  llvm-svn: 208317

* [ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments
  for big endian calls.
  James Molloy, 2014-05-08 (1 file, -1/+7)
  SelectionDAG already knows about this, but fast-isel was ignorant.
  llvm-svn: 208307

* ARM64: make sure FastISel emits SSA MachineInstrs
  Tim Northover, 2014-05-08 (1 file, -3/+4)
  We need to use a temporary register for a 2-step operation like REM.
  llvm-svn: 208297

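  As a rough illustration (a hypothetical sketch; the opcode and helper
  names are assumptions, not the patch itself): srem has no single ARM64
  instruction, so FastISel emits a divide followed by a multiply-subtract,
  and the intermediate quotient needs its own virtual register so that
  every register is defined exactly once:

    // The quotient goes in a fresh vreg instead of reusing the result reg.
    unsigned QuotReg = createResultReg(&ARM64::GPR32RegClass);
    BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
            TII.get(ARM64::SDIVWr), QuotReg).addReg(LHSReg).addReg(RHSReg);
    // rem = lhs - (lhs / rhs) * rhs
    unsigned ResultReg = createResultReg(&ARM64::GPR32RegClass);
    BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
            TII.get(ARM64::MSUBWrrr), ResultReg)
        .addReg(QuotReg).addReg(RHSReg).addReg(LHSReg);
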
* AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to
  ARM64 backend.
  Hao Liu, 2014-05-08 (3 files, -56/+743)
  llvm-svn: 208284

* [ARM64][fast-isel] Disable target specific optimizations at -O0.
  Chad Rosier, 2014-05-07 (3 files, -31/+30)

  Functionally, this patch disables the dead register elimination pass and
  the load/store pair optimization pass at -O0. The ILP optimizations don't
  require the optimization level to be checked because the call to addILPOpts
  is predicated with the necessary check. The AdvSIMDScalar pass is disabled
  by default at all optimization levels; this patch leaves that pass disabled
  by default.

  Also, move command-line options into ARM64TargetMachine.cpp and add a few
  additional flags to aid in debugging. This fixes an issue with the
  -debug-pass=Structure flag where passes were printed but not actually run
  (i.e., the AdvSIMDScalar pass).

  llvm-svn: 208223

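  The gating itself follows the usual TargetPassConfig pattern; a minimal
  sketch (the hook and pass-constructor names here are illustrative
  assumptions):

    // Only schedule the load/store pair optimization above -O0.
    bool ARM64PassConfig::addPreSched2() {
      if (TM->getOptLevel() != CodeGenOpt::None)
        addPass(createARM64LoadStoreOptimizationPass());
      return true;
    }
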
* AArch64/ARM64: optimise vector selects & enable test
  Tim Northover, 2014-05-07 (1 file, -0/+41)
  When performing a scalar comparison that feeds into a vector select, it's
  actually better to do the comparison on the vector side: the scalar route
  would be "CMP -> CSEL -> DUP", while the vector route is "CM -> DUP", since
  the vector comparisons are all mask based.
  llvm-svn: 208210

* [ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests.
  James Molloy, 2014-05-07 (1 file, -0/+5)
  llvm-svn: 208200

* [ARM64-BE] Fix variable-argument saving.
  James Molloy, 2014-05-07 (1 file, -1/+2)
  llvm-svn: 208199

* [ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big
  endian.
  James Molloy, 2014-05-07 (1 file, -0/+17)
  The AAPCS states that values passed in registers must have a value as
  though they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D"
  - that is, loading scalars to vector registers and loading 1-element
  vectors is equivalent. The logic implemented here is to ensure that at all
  call boundaries and during formal argument lowering all vectors are
  treated as their bitwidth-based floating point scalar counterpart, which
  is always one of f64 or f128 (v2i32 -> f64, v4i32 -> f128, etc). A BITCAST
  is inserted so that the appropriate REV will be generated during code
  generation.
  llvm-svn: 208198

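  A sketch of the bitwidth-based mapping described above (a hypothetical
  helper, not code from the patch):

    // At call boundaries every vector is treated as its floating-point
    // scalar counterpart, chosen purely by bit width.
    static EVT getBitwidthScalarVT(EVT VecVT) {
      assert(VecVT.isVector() && "expected a vector type");
      // e.g. v2i32 -> f64, v4i32 -> f128
      return VecVT.getSizeInBits() == 64 ? EVT(MVT::f64) : EVT(MVT::f128);
    }
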
* [ARM64-BE] Implement the crazy bitcast handling for big endian vectors.
  James Molloy, 2014-05-07 (1 file, -46/+326)

  Because we've canonicalised on using LD1/ST1, every time we do a bitcast
  between vector types we must do an equivalent lane reversal.

  Consider a simple memory load followed by a bitconvert then a store.

    v0 = load v2i32
    v1 = BITCAST v2i32 v0 to v4i16
         store v4i16 v1

  In big endian mode every memory access has an implicit byte swap. LDR and
  STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
  is, they treat the vector as a sequence of elements to be byte-swapped.
  The two pairs of instructions are fundamentally incompatible. We've decided
  to use LD1/ST1 only to simplify compiler implementation.

  LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
  the original code sequence:

    v0 = load v2i32
    v1 = REV v2i32 v0    (implicit)
    v2 = BITCAST v2i32 v1 to v4i16
    v3 = REV v4i16 v2    (implicit)
         store v4i16 v3

  But this is now broken - the value stored is different to the value loaded
  due to lane reordering. To fix this, on every BITCAST we must perform two
  other REVs:

    v0 = load v2i32
    v1 = REV v2i32 v0    (implicit)
    v2 = REV v2i32 v1
    v3 = BITCAST v2i32 v2 to v4i16
    v4 = REV v4i16 v3
    v5 = REV v4i16 v4    (implicit)
         store v4i16 v5

  This means an extra two instructions, but actually in most cases the two
  REV instructions can be combined into one. For example:

    (REV64_2s (REV64_4h X)) === (REV32_4h X)

  There is also no 128-bit REV instruction; it must be synthesized with an
  EXT instruction.

  Most bitconverts require some sort of conversion. The only exceptions are:
    a) Identity conversions - vNfX <-> vNiX
    b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX

  Even though there are hundreds of changed lines, I have a fairly high
  confidence that they are somewhat correct. The changes to add two REV
  instructions per bitcast were pretty mechanical, and once I'd done that I
  threw the resulting .td at a script I wrote which combined the two REVs
  together (and added an EXT instruction, for f128) based on an instruction
  description I gave it. This was much less prone to error than doing it all
  manually, plus my brain would not just have melted but would have
  vapourised.

  llvm-svn: 208194

* [ARM64-BE] Predicate VLDR/VSTR for vectors as little-endian only. We must
  use LD1/ST1 on big-endian.
  James Molloy, 2014-05-07 (1 file, -95/+131)
  llvm-svn: 208193

* [ARM64-BE] Make big endian (scalar) argument passing work correctly.
  James Molloy, 2014-05-07 (1 file, -6/+38)
  This completes the port of r204814 (cpirker "AArch64_BE function argument
  passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
  found during later development along the way. The biggest of these was
  that the alignment fixup logic wasn't replicated into all the places it
  should have been.
  llvm-svn: 208192

* Implementing named register intrinsics
  Renato Golin, 2014-05-06 (2 files, -0/+12)
  This patch implements the infrastructure to use named register constructs
  in programs that need access to specific registers (bare metal, kernels,
  etc.). So far, only the stack pointer is supported as a technology preview,
  but as it is, the intrinsic can already support all non-allocatable
  registers from any architecture.
  llvm-svn: 208104

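  On the user side this infrastructure is reachable through named global
  register variables, which front ends lower to the read-register intrinsic;
  a hedged example (illustrative only - front-end support is separate from
  this patch):

    // Reads of this variable become llvm.read_register calls; only
    // non-allocatable registers such as "sp" are supported so far.
    register unsigned long current_sp asm("sp");

    unsigned long get_stack_pointer(void) {
      return current_sp;
    }
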
* AArch64/ARM64: implement diagnosis of unpredictable loads & stores
  Tim Northover, 2014-05-06 (1 file, -17/+62)
  llvm-svn: 208091

* AArch64/ARM64: make NEON vector list parsing a bit more robust
  Tim Northover, 2014-05-06 (1 file, -2/+5)
  It doesn't change the results, but it seems silly not to diagnose obvious
  problems early on.
  llvm-svn: 208083

* AArch64/ARM64: add more specific diagnostic for floating imm 0.0.
  Tim Northover, 2014-05-06 (1 file, -4/+5)
  llvm-svn: 208082

* AArch64/ARM64: add more specific diagnostic for invalid vector lanes
  Tim Northover, 2014-05-06 (2 files, -4/+21)
  llvm-svn: 208081

* AArch64/ARM64: produce more informative diagnostic assembling some immediates
  Tim Northover, 2014-05-06 (2 files, -32/+38)
  No tests here; they'll be added when the entire neon-diagnostics.s test
  from AArch64 is enabled.
  llvm-svn: 208079

* [ARM64] Enable alignment control option in front-end for ARM64.
  Kevin Qin, 2014-05-06 (1 file, -4/+15)
  This is the modification in the LLVM part.
  llvm-svn: 208074

* Fix typo.
  Eric Christopher, 2014-05-05 (1 file, -1/+1)
  llvm-svn: 208006

* [ARM64] Correctly select ANDWri in FastISel.
  Joey Gouly, 2014-05-03 (1 file, -6/+13)
  http://reviews.llvm.org/D3598
  llvm-svn: 207917

* AArch64/ARM64: add patterns for post-indexed ST1 ops.
  Tim Northover, 2014-05-02 (1 file, -0/+47)
  llvm-svn: 207840

* ARM64: refactor NEON post-indexed loads & stores (MC).
  Tim Northover, 2014-05-02 (3 files, -987/+443)

  Previously, LLVM had no knowledge that these instructions actually modified
  their address register: fine if they never end up in CodeGen, but when I'd
  rather like to write some patterns for them it becomes a disaster.

  The change is mostly straightforward; I think the most significant design
  decision was to *always* put the address write-back first. This allows
  loads and stores to be accessed more uniformly, for example permitting the
  continued sharing of the InstAlias definitions.

  I also discovered that the custom Decode logic is no longer needed, so I
  removed it.

  No tests, because there should be no functionality change.

  llvm-svn: 207839

* AArch64/ARM64: support indexed loads/stores on vector types.
  Tim Northover, 2014-05-02 (4 files, -1/+72)
  While post-indexed LD1/ST1 instructions do exist for vector loads, this
  patch makes use of the more flexible addressing-modes in LDR/STR
  instructions.
  llvm-svn: 207838

* [ARM64] Prefer generation of bzero on Darwin only
  Bradley Smith, 2014-05-01 (1 file, -2/+5)
  llvm-svn: 207760

* AArch64/ARM64: print BFM instructions as BFI or BFXIL
  Tim Northover, 2014-05-01 (1 file, -0/+27)
  The canonical form of the BFM instruction is always one of the more
  explicit extract or insert operations, which makes reading output much
  easier.
  llvm-svn: 207752

* [ARM64] Conditionalize CPU specific system registers on subtarget features
  Bradley Smith, 2014-05-01 (5 files, -18/+74)
  llvm-svn: 207742

* [ARM64] Prevent bit extraction from being adjusted by a following shift
  Weiming Zhao, 2014-04-30 (2 files, -0/+18)

  For a pattern like ((x >> C1) & Mask) << C2, the DAG combiner may convert
  it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of
  ubfx more difficult. For example, given

    %shr = lshr i64 %x, 4
    %and = and i64 %shr, 15
    %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, i64 2, i64 %and
    %0 = load i64* %arrayidx

  with the current shift folding it takes 3 instructions to compute the base
  address:

    lsr x8, x0, #1
    and x8, x8, #0x78
    add x8, x9, x8

  Using ubfx, it only needs 2:

    ubfx x8, x0, #4, #4
    add x8, x9, x8, lsl #3

  This fixes bug 19589.

  llvm-svn: 207702

* ARM64: print fp immediates without using scientific notation.
  Tim Northover, 2014-04-30 (1 file, -6/+4)
  llvm-svn: 207669

* AArch64/ARM64: implement remaining TLS relocations (purely MC).
  Tim Northover, 2014-04-30 (5 files, -18/+34)
  llvm-svn: 207668

* AArch64/ARM64: add specific diagnostic for MRS/MSR and enable tests.
  Tim Northover, 2014-04-30 (2 files, -1/+9)
  llvm-svn: 207667

* AArch64/ARM64: accept and print floating-point immediate 0 as "#0.0"
  Tim Northover, 2014-04-30 (2 files, -19/+41)
  It's been decided that in the future, the floating-point immediate in
  instructions like "fcmeq v0.2s, v1.2s, #0.0" will be canonically "0.0",
  which has been implemented on AArch64 already but not ARM64. This fixes
  that issue.
  llvm-svn: 207666

* [ARM64][fast-isel] Fast-isel doesn't know how to handle f128.
  Chad Rosier, 2014-04-30 (1 file, -1/+14)
  llvm-svn: 207659

* ARM64: print lsr instead of lsrv for variable shifts (etc)
  Tim Northover, 2014-04-30 (1 file, -13/+13)
  The canonical syntax for shifts by a variable amount does not end with 'v',
  but that syntax should be supported as an alias (presumably for legacy
  reasons).
  llvm-svn: 207649