path: root/llvm/test/CodeGen/ARM64
Commit message | Author | Age | Files | Lines
...
* Revert r191049/r191059 as it can produce wrong code (see PR17975). | Robert Lougher | 2014-04-15 | 1 | -0/+4
  It has already been reverted on the 3.4 branch in r196521.
  llvm-svn: 206311
* ARM64: add constraints to various FastISel operations | Tim Northover | 2014-04-15 | 1 | -4/+6
  llvm-svn: 206284
* Fix for codegen bug that could cause illegal cmn instruction generation | Louis Gerbarg | 2014-04-14 | 1 | -0/+18
  In rare cases the dead definition elimination pass code can cause illegal cmn instructions when it replaces dead registers on instructions that use unmaterialized frame indexes. This patch disables the dead definition optimization for instructions which include frame index operands.
  rdar://16438284
  llvm-svn: 206208
* Add a flag to disable the ARM64DeadRegisterDefinitionsPass | Louis Gerbarg | 2014-04-14 | 1 | -0/+16
  This patch adds a -arm64-dead-def-elimination flag so that it is possible to disable dead definition elimination. Includes test case.
  llvm-svn: 206207
* ARM64: remove buggy REV16 pattern. | Tim Northover | 2014-04-14 | 1 | -1/+4
  The 32-bit pattern is still valid: 0123 -> 3210 -> 1032.
  llvm-svn: 206172
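The byte-order note above (0123 -> 3210 -> 1032) can be checked with a small Python sketch. It assumes the digits label the four bytes of a 32-bit value from most to least significant, and that the two steps are a full byte swap followed by a 16-bit rotate. The helper names are hypothetical and are not taken from the LLVM sources.

```python
MASK32 = 0xFFFFFFFF

def bswap32(x):
    """Reverse the four bytes of a 32-bit value (0123 -> 3210)."""
    return int.from_bytes((x & MASK32).to_bytes(4, "big"), "little")

def rotr32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rev16_32(x):
    """Swap the two bytes inside each 16-bit halfword (0123 -> 1032)."""
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)

# Bytes labelled 0, 1, 2, 3 from most to least significant.
x = 0x00010203
swapped = bswap32(x)            # 3210
rotated = rotr32(swapped, 16)   # 1032
```

Here `rotated` equals `rev16_32(x)`, which is the identity the commit message relies on for the 32-bit case. The same composition does not carry over unchanged to 64 bits, since no byte-granular rotation of a fully byte-reversed 64-bit value swaps bytes within each halfword; whether that is the bug being removed is not stated in the message.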
* Add the ability to use GEPs for address sinking in CGP | Hal Finkel | 2014-04-12 | 1 | -0/+1
  The current memory-instruction optimization logic in CGP, which sinks parts of the address computation that can be absorbed by the addressing mode, does this by explicitly converting the relevant part of the address computation into IR-level integer operations (making use of ptrtoint and inttoptr). For most targets this is currently not a problem, but for targets wishing to make use of IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a problem for two reasons:
  1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
  2. In cases where type-punning was used, and BasicAA was used to override TBAA, BasicAA may no longer do so. (this had forced us to disable all use of TBAA in CodeGen; something which we can now enable again)
  This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by default (except for those targets that use AA during CodeGen), and so aside from some PowerPC subtargets and SystemZ, there should be no change in behavior. We may be able to switch completely away from the ptrtoint/inttoptr sinking on all targets, but further testing is required.
  I've doubled-up on a number of existing tests that are sensitive to the address sinking behavior (including some store-merging tests that are sensitive to the order of the resulting ADD operations at the SDAG level).
  llvm-svn: 206092
* Add ARM64 CLS patterns | Louis Gerbarg | 2014-04-11 | 1 | -0/+36
  This patch adds patterns to generate the cls instruction on ARM64. Includes tests for 64 bit and 32 bit operands.
  rdar://15611957
  llvm-svn: 206079
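For readers unfamiliar with cls (count leading sign bits), a minimal Python reference model follows. It assumes the architectural meaning of the instruction: the number of consecutive bits below the top bit that are equal to the top bit, so the sign bit itself is not counted. The function name and the `width` parameter are hypothetical; this is not the pattern the patch adds.

```python
def cls(x, width):
    """Count the bits directly below the sign bit that equal the sign bit."""
    x &= (1 << width) - 1              # treat x as a width-bit value
    sign = (x >> (width - 1)) & 1
    count = 0
    for i in range(width - 2, -1, -1):  # walk down from the bit below the sign bit
        if (x >> i) & 1 != sign:
            break
        count += 1
    return count

# 32-bit and 64-bit operands, matching the two widths the tests cover.
small_positive = cls(1, 32)
small_negative = cls(-2, 64)
```

With this model, an all-zeros or all-ones input gives `width - 1`, and a value whose top two bits differ gives 0.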
* [DAGCombiner] DAG combine does not know how to combine indexed loads with sign/zero/any extensions. | Quentin Colombet | 2014-04-09 | 1 | -0/+45
  However, a few places were not properly checking the property of the load and were turning an indexed load into a regular extended load. Therefore the indexed value was lost during the process and this was triggering an assertion.
  <rdar://problem/16389332>
  llvm-svn: 205923
* Fix some doc and comment typos | Alp Toker | 2014-04-09 | 2 | -2/+2
  llvm-svn: 205899
* [ARM64] Rename LR to the UAL-compliant 'X30'. | Bradley Smith | 2014-04-09 | 3 | -9/+9
  llvm-svn: 205885
* [ARM64] Rename FP to the UAL-compliant 'X29'. | Bradley Smith | 2014-04-09 | 8 | -41/+45
  llvm-svn: 205884
* ARM64: scalarize v1i64 mul operation | Tim Northover | 2014-04-09 | 1 | -0/+7
  This is the second part of fixing PR19367.
  llvm-svn: 205836
* ARM64: add pattern for <1 x i64> custom not node. | Tim Northover | 2014-04-09 | 1 | -0/+9
  This should fix PR19367.
  llvm-svn: 205835
* [Constant Hoisting][ARM64] Enable constant hoisting for ARM64. | Juergen Ributzka | 2014-04-08 | 1 | -0/+23
  This implements the target-hooks for ARM64 to enable constant hoisting.
  This fixes <rdar://problem/14774662> and <rdar://problem/16381500>.
  llvm-svn: 205791
* ARM64: fix fmsub patterns which assumed accum operand was first | Tim Northover | 2014-04-08 | 1 | -10/+10
  Confusingly, the NEON fmla instructions put the accumulator first but the scalar versions put it at the end (like the fma lib function & LLVM's intrinsic).
  This should fix PR19345, assuming there's only one issue.
  llvm-svn: 205758
* DAGLegalize: add last-ditch type-legalization for VSELECT. | Tim Northover | 2014-04-04 | 1 | -0/+9
  When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still needs scalarising because of that v1i1. There was no code to do this though.
  AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occurring in *most* such situations, but there are edge cases that they miss. This adds a legalization to cope with that.
  llvm-svn: 205626
* ARM64: handle v1i1 types arising from setcc properly. | Tim Northover | 2014-04-04 | 1 | -0/+65
  There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381.
  Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly.
  Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately).
  Should fix PR19335.
  llvm-svn: 205625
* ARM64: use regalloc-friendly COPY_TO_REGCLASS for bitcasts | Tim Northover | 2014-04-04 | 1 | -0/+12
  The previous patterns directly inserted FMOV or INS instructions into the DAG for scalar_to_vector & bitconvert patterns. This is horribly inefficient and can generate lots more GPR <-> FPR register traffic than necessary. It's much better to emit instructions the register allocator understands so it can coalesce the copies when appropriate.
  It led to at least one ISelLowering hack to avoid the problems, which was incorrect for v1i64 (FPR64 has no dsub). It can now be removed entirely.
  This should also fix PR19331.
  llvm-svn: 205616
* ARM64: add 128-bit MLA operations to the custom selection code. | Tim Northover | 2014-04-04 | 1 | -0/+26
  Without this change, the llvm_unreachable kicked in. The code pattern being spotted is rather non-canonical for 128-bit MLAs, but it can happen and there's no point in generating sub-optimal code for it just because it looks odd.
  Should fix PR19332.
  llvm-svn: 205615
* [ARM64] Teach the ARM64DeadRegisterDefinition pass to respect implicit-defs. | Lang Hames | 2014-04-03 | 1 | -0/+32
  When rematerializing through truncates, the coalescer may produce instructions with dead defs, but live implicit-defs of subregs:
  E.g. %X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32
  These instructions are live, and their definitions should not be rewritten.
  Fixes <rdar://problem/16492408>
  llvm-svn: 205565
* ARM64: add regression test for r205519. | Tim Northover | 2014-04-03 | 1 | -0/+29
  llvm-svn: 205520
* ARM64: don't generate __sincos_stret calls unless on MachO | Tim Northover | 2014-04-03 | 1 | -7/+18
  This should fix PR19314.
  llvm-svn: 205514
* ARM64: use GOT for weak symbols & PIC. | Tim Northover | 2014-04-02 | 2 | -0/+105
  Weak symbols cannot use the small code model's usual ADRP sequences since the instruction simply may not be able to encode a value of 0.
  This redirects them to use the GOT, which hopefully linkers are able to cope with even in the static relocation model.
  llvm-svn: 205426
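The encoding limit mentioned above can be illustrated with a rough Python sketch. It assumes the architectural ADRP encoding (a signed 21-bit offset counted in 4 KiB pages, so a reach of about ±4 GiB from the current page) and that an undefined weak symbol resolves to address 0. The function names and the example addresses are hypothetical.

```python
PAGE_SHIFT = 12   # ADRP works on 4 KiB pages
IMM_BITS = 21     # signed page-offset immediate

def adrp_page_delta(pc, target):
    """Page distance ADRP would have to encode to reach target from pc."""
    return (target >> PAGE_SHIFT) - (pc >> PAGE_SHIFT)

def adrp_can_encode(pc, target):
    """True if the page distance fits in the signed 21-bit immediate."""
    delta = adrp_page_delta(pc, target)
    return -(1 << (IMM_BITS - 1)) <= delta < (1 << (IMM_BITS - 1))

# Code loaded low in the address space can still reach address 0 ...
near = adrp_can_encode(0x0040_0000, 0)
# ... but code loaded 8 GiB up cannot, so a direct ADRP to a weak
# symbol that resolved to 0 would not be encodable there.
far = adrp_can_encode(0x2_0000_0000, 0)
```

Going through the GOT sidesteps the problem because the ADRP then only has to reach the GOT slot, which sits near the code, and the slot itself can hold any 64-bit value including 0.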
* ARM64: fix lowering of fp128 fptosi/fptoui | Tim Northover | 2014-04-02 | 1 | -0/+48
  We were creating libcall nodes that returned an MVT::f128, when these particular operations actually return an int of some stripe.
  llvm-svn: 205425
* ARM64: make sure first argument to INSERT_SUBVECTOR has right type. | Tim Northover | 2014-04-02 | 1 | -3/+8
  Again, coalescing and other optimisations swiftly made the MachineInstrs consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was produced.
  llvm-svn: 205423
* ARM64: convert fp16 narrowing ISel to pseudo-instruction | Tim Northover | 2014-04-02 | 1 | -2/+2
  The previous attempt was fine with optimisations, but was actually rather cavalier with its types. When compiled at -O0, it produced invalid COPY MachineInstrs.
  llvm-svn: 205422
* ARM64: add intrinsic for pmull (p64 x p64 = p128) operations. | Tim Northover | 2014-04-01 | 1 | -0/+18
  llvm-svn: 205302
* ARM64: add patterns for more lane-wise ld1/st1 operations. | Tim Northover | 2014-04-01 | 2 | -80/+188
  llvm-svn: 205294
* ARM64: fix bug in ld3r (1d) SelectionDAG. | Tim Northover | 2014-04-01 | 1 | -0/+31
  llvm-svn: 205293
* [Stackmaps] Update the stackmap format to use 64-bit relocations for the function address and properly align all entries. | Juergen Ributzka | 2014-03-31 | 2 | -46/+58
  This commit updates the stackmap format to version 1 to indicate the reorganization of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding.
  Fixes <rdar://problem/16005902>
  llvm-svn: 205254
* ARM64: add extra patterns for scalar shifts | Tim Northover | 2014-03-31 | 2 | -0/+22
  llvm-svn: 205209
* ARM64: add extra scalar neg pattern & tests. | Tim Northover | 2014-03-31 | 1 | -0/+71
  llvm-svn: 205208
* ARM64: add patterns for scalar sqdmlal & sqdmlsl. | Tim Northover | 2014-03-31 | 1 | -0/+16
  llvm-svn: 205207
* ARM64: add more patterns for commuted fmsub operations. | Tim Northover | 2014-03-31 | 1 | -0/+18
  llvm-svn: 205206
* ARM64: shuffle patterns around for fmin/fmax & add tests. | Tim Northover | 2014-03-31 | 1 | -0/+101
  llvm-svn: 205205
* ARM64: add more scalar patterns for usqadd & suqadd. | Tim Northover | 2014-03-31 | 1 | -2/+34
  llvm-svn: 205204
* ARM64: add more scalar patterns for reciprocal ops. | Tim Northover | 2014-03-31 | 1 | -0/+55
  llvm-svn: 205203
* ARM64: add i64 scalar pattern for @llvm.arm64.abs | Tim Northover | 2014-03-31 | 1 | -0/+8
  This will be used by the Clang front-end code for vabsd_s64.
  llvm-svn: 205202
* [ARM64] Fix materialization of an fp128 zero immediate. | Chandler Carruth | 2014-03-31 | 1 | -0/+11
  There currently is not a pattern to lower this with clever instructions that zero the register, so restrict the zero immediate legality special case to f64 and f32 (the only two sizes which fmov seems to directly support). Fixes backend errors when building code such as libxml.
  llvm-svn: 205161
* Suppress llvm/test/CodeGen/ARM64 for targeting pecoff. ARM64 is unaware of that. | NAKAMURA Takumi | 2014-03-30 | 1 | -0/+5
  FIXME: Could we support them?
  llvm-svn: 205126
* ARM64: initial backend import | Tim Northover | 2014-03-29 | 230 | -0/+32356
  This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur.
  Everything will be easier with the target in-tree though, hence this commit.
  llvm-svn: 205090