summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* [MachineOutliner] XFAIL machine-outliner-noredzone.llJessica Paquette2018-04-201-2/+3
| | | | | | | | | | The verifier began complaining about an undefined physical register in this test. XFAILing for the purposes of getting a bot up while I look into it. Failure: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/11385/ llvm-svn: 330493
* [MachineOutliner] Change B instruction for tail calls to TCRETURNdiJessica Paquette2018-04-201-2/+2
| | | | | | | | | | First off, this is more correct than having the B. Second off, this was making a bot upset. This fixes that. Update the test to include -verify-machineinstrs as well to prevent stuff like this slipping by non debug/assert builds in the future. llvm-svn: 330459
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-201-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally committed at rL328921 and reverted at rL329920 to investigate failures in Chrome. This time I've added to the ReleaseNotes to warn users of the potential of exposing UB and let me repeat that here for more exposure: Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on undefined behavior. Code sanitizers can be used to detect affected patterns such as this: int main() { float x = 4294967296.0f; x = (float)((int)x); printf("junk in the ftrunc: %f\n", x); return 0; } $ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int' junk in the ftrunc: 0.000000 Original commit message: fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 330437
* [AArch64] Add isel pattern for v8i8->v2f32 NVCASTs.Amara Emerson2018-04-181-0/+16
| | | | | | rdar://39454635 llvm-svn: 330276
* Add tests for shrink wrapping and VLAsMomchil Velikov2018-04-181-0/+95
| | | | | | Differential revision: https://reviews.llvm.org/D45727 llvm-svn: 330253
* MachO: trap unreachable instructionsTim Northover2018-04-131-0/+7
| | | | | | | Debugability is more important than saving 4 bytes to let us to fall through to nonense. llvm-svn: 330073
* Revert r329956, "AArch64: Introduce a DAG combine for folding offsets into ↵Peter Collingbourne2018-04-136-172/+68
| | | | | | | | | | addresses." Caused a hang and eventually an assertion failure in LTO builds of 7zip-benchmark on aarch64 iOS targets. http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/ llvm-svn: 330063
* AArch64: Introduce a DAG combine for folding offsets into addresses.Peter Collingbourne2018-04-126-68/+172
| | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. This reduces the size of Chromium for Android's .text section by 108KB. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329956
* [AArch64] Move AFI->setRedZone(false) to top of emitPrologueJessica Paquette2018-04-121-5/+35
| | | | | | | | | | | | | AFI->setRedZone(false) was put in the wrong place before, and so it only fired on functions that didn't have stack frames. This moves that to the top of emitPrologue to make sure that every function without a redzone has it set correctly. This also adds a function representing one of the early exit cases (GHC calling convention) to the MachineOutliner noredzone test to ensure that we can outline from functions like these, where we never use a redzone. llvm-svn: 329922
* revert r328921 - [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-121-4/+8
| | | | | | | This change is exposing UB in source code - as was warned/predicted. :) See D44909 for discussion. Reverting while we figure out how to fix things. llvm-svn: 329920
* [NFC] fix trivial typos in documents and commentsHiroshi Inoue2018-04-121-1/+1
| | | | | | "is is" -> "is", "if if" -> "if", "or or" -> "or" llvm-svn: 329878
* [FastISel] Disable local value sinking by defaultReid Kleckner2018-04-117-13/+13
| | | | | | | | | | | | | | | | | | This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that code coverage instrumentation is a great way to create sequences like this, which how our users ran into the issue in practice. Intel has a tool that detects these kinds of non-linear compile time issues, and Andy Kaylor reported it as PR37010. The current sinking code scans the whole basic block once per local value sink, which happens before emitting each call. In theory, local values should only be introduced to be used by instructions between the current flush point and the last flush point, so we should only need to scan those instructions. llvm-svn: 329822
* [AArch64] Add test case for r329797Francis Visoiu Mistrih2018-04-111-0/+17
| | | | | | Forgot to add a test case in the previous commit. llvm-svn: 329805
* [AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass.Geoff Berry2018-04-101-0/+25
| | | | | | | | | | | | | | | | Summary: When inserting MOVs to avoid Falkor HWPF collisions, the non-base register operand of load instructions (e.g. a register offset) was not being considered live, so it could potentially have been used as a scratch register, clobbering the actual offset value. Reviewers: mcrosier Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45502 llvm-svn: 329761
* [AArch64] Fix isel failure when BUILD_PAIR nodes are left over.Amara Emerson2018-04-101-0/+13
| | | | | | rdar://39175175 llvm-svn: 329743
* Revert r329611, "AArch64: Allow offsets to be folded into addresses with ELF."Peter Collingbourne2018-04-1011-191/+127
| | | | | | Caused a build failure in check-tsan. llvm-svn: 329718
* [AArch64] Use FP to access the emergency spill slotFrancis Visoiu Mistrih2018-04-102-2/+23
| | | | | | | | | | | | | | | | | | | | | In the presence of variable-sized stack objects, we always picked the base pointer when resolving frame indices if it was available. This makes us hit an assert where we can't reach the emergency spill slot if it's too far away from the base pointer. Since on AArch64 we decide to place the emergency spill slot at the top of the frame, it makes more sense to use FP to access it. The changes here don't affect only emergency spill slots but all the frame indices. The goal here is to try to choose between FP, BP and SP so that we minimize the offset and avoid scavenging, or worse, asserting when trying to access a slot allocated by the scavenger. Previously discussed here: https://reviews.llvm.org/D40876. Differential Revision: https://reviews.llvm.org/D45358 llvm-svn: 329691
* AArch64: Allow offsets to be folded into addresses with ELF.Peter Collingbourne2018-04-0911-127/+191
| | | | | | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. It reduces the size of Chromium for Android's .text section by 46KB, or 56KB with ThinLTO (which exposes more opportunities to use a direct access rather than a GOT access). Because the addend range is limited in COFF and Mach-O, this is enabled for ELF only. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329611
* Remove MachineLoopInfo dependency from AsmPrinter.Michael Zolotukhin2018-04-093-9/+1
| | | | | | | | | | | | | | | | | | | Summary: Currently MachineLoopInfo is used in only two places: 1) for computing IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, and it is never used. 2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true. Despite that, we currently have a dependency on MachineLoopInfo, which makes pass manager to compute it and MachineDominator Tree. This patch removes the use (1) and makes the use (2) lazy, thus avoiding some redundant recomputations. Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi Subscribers: rengolin, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44812 llvm-svn: 329542
* [DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))Guozhi Wei2018-04-071-0/+14
| | | | | | | | | | | | | | In our real world application, we found the following optimization is missed in DAGCombiner (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of original zext is an add, it may enable further lea optimization on x86. This patch add a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 329516
* Reapply ARM: Do not spill CSR to stack on entry to noreturn functionsTim Northover2018-04-071-2/+2
| | | | | | | | | | | | | | | | | | Should fix UBSan bot by also checking there's no "uwtable" attribute before skipping. Otherwise the unwind table will be useless since its moves expect CSRs to actually be preserved. A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch mostly by myeisha (pmb). llvm-svn: 329494
* Revert "ARM: Do not spill CSR to stack on entry to noreturn functions"Vitaly Buka2018-04-071-2/+2
| | | | | | | | Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM This reverts commit r329287 llvm-svn: 329486
* ARM: Do not spill CSR to stack on entry to noreturn functionsTim Northover2018-04-051-2/+2
| | | | | | | | | | | | | | A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). llvm-svn: 329287
* AArch64: Implement support for the shadowcallstack attribute.Peter Collingbourne2018-04-041-0/+47
| | | | | | | | | | | | The implementation of shadow call stack on aarch64 is quite different to the implementation on x86_64. Instead of reserving a segment register for the shadow call stack, we reserve the platform register, x18. Any function that spills lr to sp also spills it to the shadow call stack, a pointer to which is stored in x18. Differential Revision: https://reviews.llvm.org/D45239 llvm-svn: 329236
* [AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y)John Brawn2018-04-043-0/+106
| | | | | | Differential Revision: https://reviews.llvm.org/D44573 llvm-svn: 329163
* [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfoJessica Paquette2018-04-031-0/+47
| | | | | | | | | | | | This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120
* [MachineOutliner][NFC] Make outlined functions have internal linkageJessica Paquette2018-04-031-14/+14
| | | | | | | | | | | | The linkage type on outlined functions was private before. This meant that if you set a breakpoint in an outlined function, the debugger wouldn't be able to give a sane name to the outlined function. This commit changes the linkage type to internal and updates any tests that relied on the prefixes on the names of outlined functions. llvm-svn: 329116
* [AArch64] Reserve x18 register on FuchsiaPetr Hosek2018-04-013-6/+3
| | | | | | | | This register is reserved as a platform register on Fuchsia. Differential Revision: https://reviews.llvm.org/D45105 llvm-svn: 328950
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-03-311-8/+4
| | | | | | | | | | fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 328921
* [SelectionDAG] Removing FABS folding from DAGCombinerSanjay Patel2018-03-301-4/+9
| | | | | | | | | | | | | | The code has bugs dealing with -0.0. Since D44550 introduced FABS pattern folding in InstCombine, this patch removes the now-redundant code that causes https://bugs.llvm.org/show_bug.cgi?id=36600. Patch by Mikhail Dvoretckii! Differential Revision: https://reviews.llvm.org/D44683 llvm-svn: 328872
* [MachineCopyPropagation] Handle COPY with overlapping source/dest.Eli Friedman2018-03-301-0/+32
| | | | | | | | | | | | | | | | | | | MachineCopyPropagation::CopyPropagateBlock has a bunch of special handling for COPY instructions. This handling assumes that COPY instructions do not modify the source of the copy; this is wrong if the COPY destination overlaps the source. To fix the bug, check explicitly for this situation, and fall back to the generic instruction handling. This bug can't happen for most register classes because they don't have this sort of overlap, but there are a few register classes where this is possible. The testcase uses the AArch64 QQQQ register class. Differential Revision: https://reviews.llvm.org/D44911 llvm-svn: 328851
* [MachineOutliner] Simplify call outlining + require valid callee save info ↵Jessica Paquette2018-03-282-2/+2
| | | | | | | | | | for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719
* [AArch64] add ftrunc tests; NFCSanjay Patel2018-03-281-0/+47
| | | | | | As suggested in D44909. llvm-svn: 328683
* [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operandsJessica Paquette2018-03-271-0/+41
| | | | | | | | | If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674
* Fix a reoccuring typo in load-combine testsArtur Pilipenko2018-03-271-2/+2
| | | | | | | | | | | %tmp = bitcast i32* %arg to i8* %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0 - %tmp2 = load i8, i8* %tmp, align 1 + %tmp2 = load i8, i8* %tmp1, align 1 This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended. llvm-svn: 328642
* Use .set instead of = when printing assignment in assembly outputKrzysztof Parzyszek2018-03-273-16/+16
| | | | | | | | | On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635
* [AArch64] Don't reduce the width of loads if it prevents combining a shiftJohn Brawn2018-03-232-2/+307
| | | | | | | | | | | | | | | Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc making it impossible to later combine the shift. Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and make it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend. Differential Revision: https://reviews.llvm.org/D44794 llvm-svn: 328321
* [GlobalISel] Fix legalizer combine to not use illegal input G_EXTRACT.Amara Emerson2018-03-231-3/+3
| | | | | | | This was being masked because GISel is enabled by default for -O0 and the abort was disabled. Modified test to explicitly enable abort. llvm-svn: 328311
* State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'.Michael Zolotukhin2018-03-221-2/+0
| | | | | | That removes some redundant recomputations from the passes pipeline. llvm-svn: 328272
* Reapply "[test] Add tests for llc passes pipelines." with a fix for bots ↵Michael Zolotukhin2018-03-222-0/+240
| | | | | | with expensive checks on. llvm-svn: 328267
* [CodeGen] Add a new pass for PostRA sinkJun Bum Lim2018-03-222-0/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This pass sinks COPY instructions into a successor block, if the COPY is not used in the current block and the COPY is live-in to a single successor (i.e., doesn't require the COPY to be duplicated). This avoids executing the the copy on paths where their results aren't needed. This also exposes additional opportunites for dead copy elimination and shrink wrapping. These copies were either not handled by or are inserted after the MachineSink pass. As an example of the former case, the MachineSink pass cannot sink COPY instructions with allocatable source registers; for AArch64 these type of copy instructions are frequently used to move function parameters (PhyReg) into virtual registers in the entry block.. For the machine IR below, this pass will sink %w19 in the entry into its successor (%bb.1) because %w19 is only live-in in %bb.1. ``` %bb.0: %wzr = SUBSWri %w1, 1 %w19 = COPY %w0 Bcc 11, %bb.2 %bb.1: Live Ins: %w19 BL @fun %w0 = ADDWrr %w0, %w19 RET %w0 %bb.2: %w0 = COPY %wzr RET %w0 ``` As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be able to see %bb.0 as a candidate. With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted in spec2000/2006/2017 on AArch64. Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz Reviewed By: sebpop Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41463 llvm-svn: 328237
* [GISel]: Fix incorrect IRTranslation while translating null pointer typesAditya Nandakumar2018-03-225-5/+9
| | | | | | | | | | | | | | | https://reviews.llvm.org/D44762 Currently IRTranslator produces %vreg17<def>(p0) = G_CONSTANT 0; instead we should build %vreg16(s64) = G_CONSTANT 0 %vreg17(p0) = G_INTTOPTR %vreg16 reviewed by @aemerson. llvm-svn: 328218
* Revert "[test] Add tests for llc passes pipelines."Jonas Devlieghere2018-03-222-239/+0
| | | | | | | This reverts r328159 because the two AArch64 tests fail on GreenDragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/11030/ llvm-svn: 328188
* [test] Add tests for llc passes pipelines.Michael Zolotukhin2018-03-212-0/+239
| | | | | | | This is basically an extension of existing test test/CodeGen/X86/O0-pipeline.ll introduced in r302608. llvm-svn: 328159
* [AArch64] Add vmulxh_lane fp16 vector intrinsicAbderrazek Zaafrani2018-03-201-0/+30
| | | | | | https://reviews.llvm.org/D44591 llvm-svn: 328035
* [AArch64] add fabs tests for PR36600; NFCSanjay Patel2018-03-201-0/+33
| | | | llvm-svn: 327995
* [MachineOutliner] AArch64: Emit CFI instructions when outlining callsJessica Paquette2018-03-191-0/+67
| | | | | | | | | | | When outlining calls, the outliner needs to update CFI to ensure that, say, exception handling works. This commit adds that functionality and adds a test just for call outlining. Call outlining stuff in machine-outliner.mir should be moved into machine-outliner-calls.mir in a later commit. llvm-svn: 327917
* [ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probesMartin Storsjo2018-03-192-0/+29
| | | | | | | | | | | | | This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897
* [MachineOutliner] Make KILLs invisibleJessica Paquette2018-03-161-0/+3
| | | | | | | | | At the point the outliner runs, KILLs don't impact anything, but they're still considered unique instructions. This commit makes them invisible like DebugValues so that they can still be outlined without impacting outlining decisions. llvm-svn: 327760
* [AArch64] Codegen tests for the Armv8.2-A FP16 intrinsicsSjoerd Meijer2018-03-157-0/+893
| | | | | | | | | This is a follow up of the AArch64 FP16 intrinsics work; the codegen tests had not been added yet. Differential Revision: https://reviews.llvm.org/D44510 llvm-svn: 327624
OpenPOWER on IntegriCloud