summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64] Change AArch64 Windows EH UnwindHelp object to be a fixed objectDaniel Frampton2020-06-251-15/+38
| | | | | | | | | | | | | | | The UnwindHelp object is used during exception handling by runtime code. It must be findable from a fixed offset from FP. This change allocates the UnwindHelp object as a fixed object (as is done for x86_64) to ensure that both the generated code and runtime agree on the location of the object. Fixes https://bugs.llvm.org/show_bug.cgi?id=45346 Differential Revision: https://reviews.llvm.org/D77016 (cherry picked from commit 494abe139a9aab991582f1b3f3370b99b252944c)
* [AArch64] Fix mismatch in prologue and epilogue for funclets on WindowsDaniel Frampton2020-06-251-28/+20
| | | | | | | | | | | | | | | The generated code for a funclet can have an add to sp in the epilogue for which there is no corresponding sub in the prologue. This patch removes the early return from emitPrologue that was preventing the sub to sp, and instead conditionalizes the appropriate parts of the rest of the function. Fixes https://bugs.llvm.org/show_bug.cgi?id=45345 Differential Revision: https://reviews.llvm.org/D77015 (cherry picked from commit 522b4c4b88a5606b0074926e8658e7fede97c230)
* [AARch64] Add Marvell ThunderX3T110 supportWei Zhao2020-06-1714-15/+2056
| | | | | | | | | | | This is the first checkin to support Marvell ThunderX3T110. Initial definition of the micro-ops of the instructions in ThunderX3T110 is included. Differential Revision: https://reviews.llvm.org/D78129 (cherry picked from commit 382d3a85e2a9269569e7fb8caa487d7ef57900c6)
* [AArch64] Fix BTI instruction emission.Daniel Kiss2020-06-161-3/+5
| | | | | | | | | | | | | | | | | | | | | | Summary: SCTLR_EL1.BT[01] controls the PACI[AB]SP compatibility with PBYTE 11 (see [1]) This bit will be set to zero so PACI[AB]SP are equal to BTI C instruction only. [1] https://developer.arm.com/docs/ddi0595/b/aarch64-system-registers/sctlr_el1 Reviewers: chill, tamas.petz, pbarrio, ostannard Reviewed By: tamas.petz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81746 (cherry picked from commit b8ae3fdfa579dbf366b1bb1cbfdbf8c51db7fa55)
* [AArch64] Fix BTI landing pad generation.Daniel Kiss2020-06-161-0/+4
| | | | | | | | | | In some cases BTI landing pad is inserted even compatible instruction was there already. Meta instruction does not count in this case therefore skip them in the check for first instructions in the function. Differential revision: https://reviews.llvm.org/D74492 (cherry picked from commit d5a186a60014dc1a8c979c978cb32aba7ecb9102)
* [FPEnv][AArch64] Add lowering of f128 STRICT_FSETCCJohn Brawn2020-02-181-2/+4
| | | | | | | | These get lowered to function calls, like the non-strict versions. Differential Revision: https://reviews.llvm.org/D73784 (cherry picked from commit 68cf574857c81f711f498a479855a17e7bea40f7)
* [FPEnv][AArch64] Add lowering and instruction selection for strict conversionsJohn Brawn2020-02-182-24/+54
| | | | | | | | | | Strict fp-to-int and int-to-fp conversions can be handled in the same way that the non-strict versions are (by using the appropriate instruction or converting to a function call when we have no instruction). Differential Revision: https://reviews.llvm.org/D73625 (cherry picked from commit 0bb9a27c9895c0fbc3f55f56ad7f1e1927398fce)
* [FPEnv][AArch64] Add lowering and instruction selection for STRICT_FP_ROUNDJohn Brawn2020-02-182-8/+16
| | | | | | | | | | This gets selected to the appropriate fcvt instruction. Handling from there on isn't fully correct yet, as we need to model fcvt reading and writing to fpsr and fpcr. Differential Revision: https://reviews.llvm.org/D73201 (cherry picked from commit 258d8dd76afd88a12539b182a53ff21dcba16a2d)
* Add lowering of STRICT_FSETCC and STRICT_FSETCCSJohn Brawn2020-02-183-11/+57
| | | | | | | | | | These become STRICT_FCMP and STRICT_FCMPE, which then get selected to the corresponding FCMP and FCMPE instructions, though the handling from there on isn't fully correct as we don't model reads and writes to FPCR and FPSR. Differential Revision: https://reviews.llvm.org/D73368 (cherry picked from commit 2224407ef5baf6100fa22420feb4d25af1a9493f)
* [AArch64] Add option to enable/disable load-store renaming.Florian Hahn2020-02-101-0/+7
| | | | | | | | This patch adds a new option to enable/disable register renaming in the load-store optimizer. Defaults to disabled, as there is a potential mis-compile caused by this. (cherry picked from commit 8e3f59b45ae185cc9b4e3a817d7ac958f1d55976)
* [AArch64][ARM] Always expand ordered vector reductions (PR44600)Nikita Popov2020-02-051-1/+15
| | | | | | | | | | | | | | | | | | fadd/fmul reductions without reassoc are lowered to VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization support. Until that is in place, expand these intrinsics on ARM and AArch64. Other targets always expand the vector reduction intrinsics. Additionally expand fmax/fmin reductions without nonan flag on AArch64, as the backend asserts that the flag is present when lowering VECREDUCE_FMIN/FMAX. This fixes https://bugs.llvm.org/show_bug.cgi?id=44600. Differential Revision: https://reviews.llvm.org/D73135 (cherry picked from commit 70d345e687caba4ac1f95655c6924dfa91e0083f)
* [AArch64] -fpatchable-function-entry=N,0: place patch label after BTIFangrui Song2020-02-031-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after 9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680 (cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
* Add function attribute "patchable-function-prefix" to support ↵Fangrui Song2020-01-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -fpatchable-function-entry=N,M where M>0 Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like: ``` .type foo,@function .Ltmp0: # @foo nop foo: .Lfunc_begin0: # optional `bti c` (AArch64 Branch Target Identification) or # `endbr64` (Intel Indirect Branch Tracking) nop .section __patchable_function_entries,"awo",@progbits,get,unique,0 .p2align 3 .quad .Ltmp0 ``` -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html): ``` (a) (b) func: func: .Ltmp0: bti c bti c .Ltmp0: nop nop ``` (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070 (cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
* [AArch64] Don't rename registers with pseudo defs in Ld/St opt.Florian Hahn2020-01-221-0/+13
| | | | | | | | | | | | If the root def of for renaming is a noop-pseudo instruction like kill, we would end up without a correct def for the renamed register, causing miscompiles. This patch conservatively bails out on any pseudo instruction. This fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1037912#c70 (cherry picked from commit 300997c41a00b705ca10264c15910dd8d691ab75)
* Revert rG6078f2fedcac5797ac39ee5ef3fd7a35ef1202d5 - "[AArch64][GlobalISel]: ↵Simon Pilgrim2020-01-151-38/+1
| | | | | | | | | | | | | Support @llvm.{return,frame}address selection." These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656 ---- Breaks EXPENSIVE_CHECKS builds.
* [AArch64][SVE] Fold variable into assert to silence unused variable warnings ↵Benjamin Kramer2020-01-151-2/+2
| | | | in Release builds
* [AArch64][SVE] Add ptest intrinsicsCullen Rhodes2020-01-154-1/+54
| | | | | | | | | | | | | | | | | | | Summary: Implements the following intrinsics: * @llvm.aarch64.sve.ptest.any * @llvm.aarch64.sve.ptest.first * @llvm.aarch64.sve.ptest.last Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72398
* CMake: Make most target symbols hidden by defaultTom Stellard2020-01-146-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 36221 nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439
* [AArch64][GlobalISel]: Support @llvm.{return,frame}address selection.Amara Emerson2020-01-141-1/+38
| | | | | | | | | These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656
* [SVE] Add patterns for MUL immediate instruction.Danilo Carvalho Grael2020-01-142-2/+7
| | | | | | | | | | | | Summary: Add the missing MUL pattern for integer immediate instructions. Reviewers: sdesmalen, huntergr, efriedma, c-rhodes, kmclaughlin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72654
* [AArch64] Fix save register pairing for Windows AAPCSSanne Wouda2020-01-141-4/+16
| | | | | | | | | | | | | | | | | | | | | | Summary: On Windows, when a function does not have an unwind table (for example, EH filtering funclets), we don't correctly pair FP and LR to form the frame record in all circumstances. Fix this by invalidating a pair when the second register is FP when compiling for Windows, even when CFI is not needed. Fixes PR44271 introduced by D65653. Reviewers: efriedma, sdesmalen, rovka, rengolin, t.p.northover, thegameg, greened Reviewed By: rengolin Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71754
* Make helper functions static or move them into anonymous namespaces. NFC.Benjamin Kramer2020-01-141-1/+1
|
* [GlobalISel] Change representation of shuffle masks in MachineOperand.Eli Friedman2020-01-131-6/+3
| | | | | | | | | | | | We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663
* [AArch64][SVE] Add patterns for some arith SVE instructions.Danilo Carvalho Grael2020-01-135-10/+67
| | | | | | | | | | | Summary: Add patterns for the following instructions: - smax, smin, umax, umin Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: amehsan Differential Revision: https://reviews.llvm.org/D71779
* [AArch64] Emit HINT instead of PAC insns in Armv8.2-A or belowPablo Barrio2020-01-131-15/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The Pointer Authentication Extension (PAC) was added in Armv8.3-A. Some instructions are implemented in the HINT space to allow compiling code common to CPUs regardless of whether they feature PAC or not, and still benefit from PAC protection in the PAC-enabled CPUs. The 8.3-specific mnemonics were currently enabled in any architecture, and LLVM was emitting them in assembly files when PAC code generation was enabled. This was ok for compilations where both LLVM codegen and the integrated assembler were used. However, the LLVM codegen was not compatible with other assemblers (e.g. GAS). Given the fact that the approach from these assemblers (i.e. to disallow Armv8.3-A mnemonics if compiling for Armv8.2-A or lower) is entirely reasonable, this patch makes LLVM to emit HINT when building for Armv8.2-A and below, instead of PACIASP, AUTIASP and friends. Then, LLVM assembly should be compatible with other assemblers. Reviewers: samparker, chill, LukeCheeseman Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71658
* This option allows selecting the TLS size in the local exec TLS model,KAWASHIMA Takahiro2020-01-133-26/+117
| | | | | | | | | | | | | | | | | | which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688
* [Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction()Fangrui Song2020-01-112-3/+1
| | | | | | | | | | The argument is llvm::null() everywhere except llvm::errs() in llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds. If we ever have the needs to add verbose log to disassemblers, we can record log with a member function, instead of passing it around as an argument.
* [AArch64] Don't generate libcalls for wide shifts on DarwinJessica Paquette2020-01-101-1/+1
| | | | | | | Similar to cff90f07cb5cc3. Darwin doesn't always use compiler-rt, and so we can't assume that these functions are available (at least on arm64).
* [AArch64] Add isAuthenticated predicate to MCInstDescVedant Kumar2020-01-102-6/+14
| | | | | | | | | | Add a predicate to MCInstDesc that allows tools to determine whether an instruction authenticates a pointer. This can be used by diagnostic tools to hint at pointer authentication failures. Differential Revision: https://reviews.llvm.org/D70329 rdar://55089604
* [AArch64] Add function attribute "patchable-function-entry" to add NOPs at ↵Fangrui Song2020-01-101-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | function entry The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS for arm64 and parisc. GCC 8 implemented -fpatchable-function-entry, which can be seen as a generalized form of -mnop-mcount. The N,M form (function entry points before the Mth NOP) is currently only used by parisc. This patch adds N,0 support to AArch64 codegen. N is represented as the function attribute "patchable-function-entry". We will use a different function attribute for M, if we decide to implement it. The patch reuses the existing patchable-function pass, and TargetOpcode::PATCHABLE_FUNCTION_ENTER which is currently used by XRay. When the integrated assembler is used, __patchable_function_entries will be created for each text section with the SHF_LINK_ORDER flag to prevent --gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues. Retrospectively, __patchable_function_entries should use a PC-relative relocation type to avoid the SHF_WRITE flag and dynamic relocations. "patchable-function-entry"'s interaction with Branch Target Identification is still unclear (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72215
* TableGen/GlobalISel: Add way for SDNodeXForm to work on timmMatt Arsenault2020-01-091-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation assumes there is an instruction associated with the transform, but this is not the case for timm/TargetConstant/immarg values. These transforms should directly operate on a specific MachineOperand in the source instruction. TableGen would assert if you attempted to define an equivalent GISDNodeXFormEquiv using timm when it failed to find the instruction matcher. Specially recognize SDNodeXForms on timm, and pass the operand index to the render function. Ideally this would be a separate render function type that looks like void renderFoo(MachineInstrBuilder, const MachineOperand&), but this proved to be somewhat mechanically painful. Add an optional operand index which will only be passed if the transform should only look at the one source operand. Theoretically it would also be possible to only ever pass the MachineOperand, and the existing renderers would check the parent. I think that would be somewhat ugly for the standard usage which may want to inspect other operands, and I also think MachineOperand should eventually not carry a pointer to the parent instruction. Use it in one sample pattern. This isn't a great example, since the transform exists to satisfy DAG type constraints. This could also be avoided by just changing the MachineInstr's arbitrary choice of operand type from i16 to i32. Other patterns have nontrivial uses, but this serves as the simplest example. One flaw this still has is if you try to use an SDNodeXForm defined for imm, but the source pattern uses timm, you still see the "Failed to lookup instruction" assert. However, there is now a way to avoid it.
* CodeGen: Use LLT instead of EVT in getRegisterByNameMatt Arsenault2020-01-092-2/+2
| | | | | | Only PPC seems to be using it, and only checks some simple cases and doesn't distinguish between FP. Just switch to using LLT to simplify use from GlobalISel.
* [AArch64][GlobalISel] Implement selection of <2 x float> vector splat.Amara Emerson2020-01-092-7/+36
| | | | | | Also requires making G_IMPLICIT_DEF of v2s32 legal. Differential Revision: https://reviews.llvm.org/D72422
* [GlobalISel][AArch64] Import + select LDR*roW and STR*roW patternsJessica Paquette2020-01-092-46/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for selecting a large chunk of the load/store *roW patterns. This is pretty much a straight port of AArch64DAGToDAGISel::SelectAddrModeWRO into GISel. The code is very similar to the XRO code. The main difference is that in the *roW patterns, we want to try and fold in an extend, and *possibly* a shift along with it. A good portion of this patch is refactoring the existing XRO code. - Add selectAddrModeWRO - Factor out the code from selectAddrModeShiftedExtendXReg which is used by both selectAddrModeXRO and selectAddrModeWRO into selectExtendedSHL. This is similar to the function of the same name in AArch64DAGToDAGISel. - Add support for extends to the factored out code in selectExtendedSHL. - Teach getExtendTypeForInst how to handle AND masks that are intended to be used in loads/stores (necessary for this addressing mode.) - Make getExtendTypeForInst not static because moving it made an annoying diff and I wanted to have the WRO/XRO functions close to each other while I was writing the code. Differential Revision: https://reviews.llvm.org/D72426
* [APFloat] Fix checked error assert failuresEhud Katz2020-01-091-2/+2
| | | | | | | | | | | `APFLoat::convertFromString` returns `Expected` result, which must be "checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is set. To mark an `Expected` result as "checked" we must consume the `Error` within. In many cases, we are only interested in knowing if an error occured, without the need to examine the error info. This is achieved, easily, with the `errorToBool()` API.
* Revert "Merge memtag instructions with adjacent stack slots."Evgenii Stepanov2020-01-087-489/+30
| | | | | | | | | | | | *** Bad machine code: Tied use must be a register *** - function: stg_alloca17 - basic block: %bb.0 entry (0x20076710580) - instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16) - operand 3: %stack.0.a http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio This reverts commit b675a7628ce6a21b1e4a71c079a67badfb8b073d.
* Merge memtag instructions with adjacent stack slots.Evgenii Stepanov2020-01-087-30/+489
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and base address as a FI operand when possible. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286
* AArch64: add missing Apple CPU names and use them by default.Tim Northover2020-01-084-7/+107
| | | | | | | | Apple's CPUs are called A7-A13 in official communication, occasionally with weird suffixes which we probably don't need to care about. This adds each one and describes its features. It also switches the default CPU to the canonical name for Cyclone, but leaves legacy support in so that existing bitcode still compiles.
* [MachineOutliner][AArch64] Save + restore LR in noreturn functionsJessica Paquette2020-01-071-1/+11
| | | | | | | | | | | | | | Conservatively always save + restore LR in noreturn functions. These functions do not end in a RET, and so they aren't guaranteed to have an instruction which uses LR in any way. So, as a result, you can end up in unfortunate situations where you can't backtrace out of these functions in a debugger. Remove the old noreturn test, and add a new one which is more descriptive. Remove the restriction that we can't outline from noreturn functions as well since we now do the right thing.
* [MC] Add parameter `Address` to MCInstrPrinter::printInstructionFangrui Song2020-01-062-5/+5
| | | | | | | | Follow-up of D72172. Reviewed By: jhenderson, rnk Differential Revision: https://reviews.llvm.org/D72180
* [MC] Add parameter `Address` to MCInstPrinter::printInstFangrui Song2020-01-062-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`. It is a convention to use address instead of offset in most external symbolizers/disassemblers. This difference makes `llvm-objdump -d` output unsatisfactory. Add `uint64_t Address` to printInst(), so that it can pass the argument to printInstruction(). `raw_ostream &OS` is moved to the last to be consistent with other print* methods. The next step is to pass `Address` to printInstruction() (generated by tablegen from the instruction set description). We can gradually migrate targets to print addresses instead of offsets. In any case, downstream projects which don't know `Address` can pass 0 as the argument. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72172
* Lower TAGPstack with negative offset to SUBG.Evgenii Stepanov2020-01-062-3/+12
| | | | | | | | | | | | | | Summary: This never really occurs in the current codegen, so only a MIR test is possible. Reviewers: ostannard, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72123
* [NFC] Fix trivial typos in commentsJames Henderson2020-01-062-2/+2
| | | | | | | | Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.
* [APFloat] Add recoverable string parsing errors to APFloatEhud Katz2020-01-061-4/+10
| | | | | | Implementing the APFloat part in PR4745. Differential Revision: https://reviews.llvm.org/D69770
* GlobalISel: Add type argument to getRegBankFromRegClassMatt Arsenault2020-01-032-4/+5
| | | | | | AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.
* [DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG ↵QingShan Zhang2020-01-031-0/+5
| | | | | | | | | | | for vector type as 'expand' instead of 'legal' For now, we didn't set the default operation action for SIGN_EXTEND_INREG for vector type, which is 0 by default, that is legal. However, most target didn't have native instructions to support this opcode. It should be set as expand by default, as what we did for ANY_EXTEND_VECTOR_INREG. Differential Revision: https://reviews.llvm.org/D70000
* DAG: Use TargetConstant for FENCE operandsMatt Arsenault2020-01-021-3/+3
|
* Remove unneeded extra variable realArgIdx. NFC.Jay Foad2020-01-021-6/+5
|
* [AArch64][SVE] Gather loads: pass 32 bit unpacked offsets as nxv2i32Andrzej Warzynski2020-01-021-14/+21
| | | | | | | | | | | | | | | | | | | Summary: Currently 32 bit unpacked offsets are passed as nxv2i64. However, as pointed out in https://reviews.llvm.org/D71074, using nxv2i32 instead would improve consistency with: * how other arguments are treated * how scatter stores are implemented This patch makes sure that 32 bit unpacked offsets are passes as nxv2i32 instead of nxv2i64. Reviewers: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71724
* [LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to ↵Craig Topper2019-12-311-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | be promoted. These operations are needed as building blocks for promoting so they can't be promoted themselves. This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer. For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to vf416 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted. This patch just removes the operation actions to avoid this non-sense. Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.
OpenPOWER on IntegriCloud