summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM] Disallow zexts in ARMCodeGenPrepareSam Parker2018-08-104-281/+343
| | | | | | | | | | | | | | | | | | | Enabling ARMCodeGenPrepare by default caused a whole load of failures. This is due to zexts and truncs not being handled properly. ZExts are messy so it's just easier to disable for now and truncs are allowed only as 'sinks'. I still need to figure out why allowing them as 'sources' causes so many failures. The other main changes are that we are explicit in the types that we converting to, it's now always 'TypeSize'. Type support is also now performed while checking for valid opcodes as it unnecessarily complicated having the checks are different stages. I've moved the tests around too, so we have the zext and truncs in their own file as well as the overflowing opcode tests. Differential Revision: https://reviews.llvm.org/D50518 llvm-svn: 339432
* [ARM] Replace processor check with featureEvandro Menezes2018-08-091-10/+13
| | | | | | | Add new feature, `FeatureUseWideStrideVFP`, that replaces the need for a processor check. Otherwise, NFC. llvm-svn: 339354
* [ARM] FP16: codegen support for VTRNSjoerd Meijer2018-08-091-19/+23
| | | | | | Differential Revision: https://reviews.llvm.org/D50454 llvm-svn: 339340
* [ADT] Normalize empty triple componentsPetr Hosek2018-08-081-8/+8
| | | | | | | | | | | | | | | | | LLVM triple normalization is handling "unknown" and empty components differently; for example given "x86_64-unknown-linux-gnu" and "x86_64-linux-gnu" which should be equivalent, triple normalization returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's config.sub returns "x86_64-unknown-linux-gnu" for both "x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the triple normalization to behave the same way, replacing empty triple components with "unknown". This addresses PR37129. Differential Revision: https://reviews.llvm.org/D50219 llvm-svn: 339294
* [ARM] Avoid spilling lr with Thumb1 tail calls.Eli Friedman2018-08-081-30/+137
| | | | | | | | | | | | | | | Normally, if any registers are spilled, we prefer to spill lr on Thumb1 so we can fold the "bx lr" into the "pop". However, if there are tail calls involved, restoring lr is expensive, so skip the optimization in that case. The spill of r7 in the new test also isn't necessary, but that's mostly orthogonal to this patch. (It's the same code in ARMFrameLowering, but it's not related to tail calls.) Differential Revision: https://reviews.llvm.org/D49459 llvm-svn: 339283
* revert tests of '[CodeGen] emit inline asm clobber list warnings for reserved'Ties Stuij2018-08-081-27/+0
| | | | llvm-svn: 339276
* [CodeGen] emit inline asm clobber list warnings for reservedTies Stuij2018-08-081-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @APP mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 llvm-svn: 339257
* [ARM][NFC] Replaced tab-characters in test file vtrn.llSjoerd Meijer2018-08-081-100/+100
| | | | llvm-svn: 339251
* [ARM] FP16: codegen support for VEXTSjoerd Meijer2018-08-081-12/+18
| | | | | | Differential Revision: https://reviews.llvm.org/D50427 llvm-svn: 339241
* [ARM] FP16: vector vmov and vdup supportSjoerd Meijer2018-08-081-52/+72
| | | | | | | | This adds codegen support for the vmov_n_f16 and vdup_n_f16 variants. Differential Revision: https://reviews.llvm.org/D50329 llvm-svn: 339238
* [ARM] FP16: vector VMUL variantsSjoerd Meijer2018-08-081-34/+44
| | | | | | | | This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants. Differential Revision: https://reviews.llvm.org/D50326 llvm-svn: 339232
* [ARM] FP16: support vector INT_TO_FP and FP_TO_INTSjoerd Meijer2018-08-081-43/+65
| | | | | | | | This adds codegen support for the different vcvt_f16 variants. Differential Revision: https://reviews.llvm.org/D50393 llvm-svn: 339227
* Support inline asm with multiple 64bit output in 32bit GPRThomas Preud'homme2018-08-082-122/+307
| | | | | | | | | | | | | | Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45437 llvm-svn: 339225
* [ARM] FP16: support the vector vmin and vmax variantsSjoerd Meijer2018-08-082-32/+350
| | | | | | Differential Revision: https://reviews.llvm.org/D50238 llvm-svn: 339221
* [ARM] FP16: codegen support for VACGTSjoerd Meijer2018-08-071-27/+17
| | | | | | Differential Revision: https://reviews.llvm.org/D50236 llvm-svn: 339148
* [ARM][NFC] Replaced tab characters in test file vfcmp.ll.Sjoerd Meijer2018-08-071-55/+55
| | | | llvm-svn: 339111
* [ARM] FP16: support vector zip and unzipSjoerd Meijer2018-08-031-36/+48
| | | | | | | | This is addressing PR38404. Differential Revision: https://reviews.llvm.org/D50186 llvm-svn: 338835
* [ARM] FP16: support VFMASjoerd Meijer2018-08-031-24/+38
| | | | | | This is addressing PR38404. llvm-svn: 338830
* [ARM][NFC] Follow up of r338568Sjoerd Meijer2018-08-021-75/+120
| | | | | | I disabled more tests than necessary, this enables them. llvm-svn: 338717
* [GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per ↵Alexander Ivchenko2018-08-021-36/+13
| | | | | | | | | | Value This is logical continuation of https://reviews.llvm.org/D46018 (r332449) Differential Revision: https://reviews.llvm.org/D49660 llvm-svn: 338685
* [ARM] Armv8.2-A FP16 vector intrinsics testsSjoerd Meijer2018-08-011-0/+1148
| | | | | | | | | | | | | Clang support for the Armv8.2-A FP16 vector intrinsic was committed in rC328277, but this was never followed up, i.e. the LLVM part is missing. I've raised PR38404, and this is the first step to address this. I.e., this adds tests for the Armv8.2-A FP16 vector intrinsic, and thus shows which intrinsics already work, and which need further work. Differential Revision: https://reviews.llvm.org/D50142 llvm-svn: 338568
* Revert r338354 "[ARM] Revert r337821"Reid Kleckner2018-07-313-11/+11
| | | | | | | | | | | | | | | | | Disable ARMCodeGenPrepare by default again. It is causing verifier failues in V8 that look like: Duplicate integer as switch case switch i32 %trunc, label %if.end13 [ i32 0, label %cleanup36 i32 0, label %if.then8 ], !dbg !4981 i32 0 fatal error: error in backend: Broken function found, compilation aborted! I will continue reducing the test case and send it along. llvm-svn: 338452
* [ARM] Revert r337821Sam Parker2018-07-313-11/+11
| | | | | | | Re-enabling ARMCodeGenPrepare by default after failing to reproduce the bootstrap issues that I was concerned it was causing. llvm-svn: 338354
* Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"Thomas Preud'homme2018-07-301-0/+80
| | | | | | | | | | | | This reapplies commit r338206 reverted by r338214 since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269
* Fix uninitialized read in ARM's PrintAsmOperandThomas Preud'homme2018-07-301-0/+8
| | | | | | | | | | | | | | | | | Summary: Fix read of uninitialized RC variable in ARM's PrintAsmOperand when hasRegClassConstraint returns false. This was causing inline-asm-operand-implicit-cast test to fail in r338206. Reviewers: t.p.northover, weimingz, javed.absar, chill Reviewed By: chill Subscribers: chill, eraman, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49984 llvm-svn: 338268
* [ARM] Fix over-alignment in arguments that are HA of 128-bit vectorsPetr Pavlu2018-07-301-0/+16
| | | | | | | | | | | | | | | | | | | | Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up fully on stack, the function tries to pack all resulting items of the aggregate as tightly as possible according to AAPCS. Once the first item was laid out, the alignment used for consecutive items was the size of one item. This logic went wrong for 128-bit vectors because their alignment is normally only 64 bits, and so could result in inserting unexpected padding between the first and second element. The patch fixes the problem by updating the alignment with the item size only if this results in reducing it. Differential Revision: https://reviews.llvm.org/D49720 llvm-svn: 338233
* revert r338206 because the test does not passSanjay Patel2018-07-291-80/+0
| | | | | | | Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
* Fix crash on inline asm with 64bit matching input in 32bit GPRThomas Preud'homme2018-07-281-0/+80
| | | | | | | | | Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
* [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)Craig Topper2018-07-281-2/+2
| | | | | | | | This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
* [ARM] Add new target feature to fuse literal generationEvandro Menezes2018-07-271-0/+39
| | | | | | | | | | This feature enables the fusion of such operations on Cortex A57 and Cortex A72, as recommended in their Software Optimisation Guides, sections 4.14 and 4.11, respectively. Differential revision: https://reviews.llvm.org/D49563 llvm-svn: 338147
* Fix PR34170: Crash on inline asm with 64bit output in 32bit GPRThomas Preud'homme2018-07-251-0/+42
| | | | | | | | Add support for inline assembly with output operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). llvm-svn: 337903
* [ARM] Disable ARMCodeGenPrepare by defaultSam Parker2018-07-243-11/+11
| | | | | | | | ARM Stage 2 builders have been suspiciously broken since the pass was committed. Disabling to hopefully fix the bots and give me time to debug. llvm-svn: 337821
* [ARM] ARMCodeGenPrepare backend passSam Parker2018-07-233-0/+905
| | | | | | | | | | | | | | | | | | | | | | Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges. The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp. Differential Revision: https://reviews.llvm.org/D48832 llvm-svn: 337687
* [ARM] Add new feature to enable optimizing the VFP registersEvandro Menezes2018-07-201-16/+10
| | | | | | | | | Enable the optimization of operations on DPR and SPR via a feature instead of checking the target. Differential revision: https://reviews.llvm.org/D49463 llvm-svn: 337575
* ARM: switch armv7em MachO triple to hard-float defaults and libcalls.Tim Northover2018-07-192-1/+38
| | | | | | | | | We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. Recommitted without breaking ELF this time. llvm-svn: 337450
* Revert "ARM: switch armv7em triple to hard-float defaults and libcalls."Tim Northover2018-07-182-37/+1
| | | | | | This reverts commit r337385 until it can be targeted at MachO only. llvm-svn: 337424
* ARM: switch armv7em triple to hard-float defaults and libcalls.Tim Northover2018-07-182-1/+37
| | | | | | | We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. llvm-svn: 337385
* [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELTSimon Pilgrim2018-07-171-2/+0
| | | | | | | | If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258
* [ARM] Regenerated arg endian testSimon Pilgrim2018-07-131-48/+224
| | | | | | As requested on D49262 llvm-svn: 336980
* [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm testsJoel E. Denny2018-07-111-3/+3
| | | | | | | | | | | | | | | | | | | See https://reviews.llvm.org/D47106 for details. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D47171 This commit drops that patch's changes to: llvm/test/CodeGen/NVPTX/f16x2-instructions.ll llvm/test/CodeGen/NVPTX/param-load-store.ll For some reason, the dos line endings there prevent me from commiting via the monorepo. A follow-up commit (not via the monorepo) will finish the patch. llvm-svn: 336843
* [ARM] ParallelDSP: multiple reduction stmts in loopSjoerd Meijer2018-07-111-1/+76
| | | | | | | | | | This fixes an issue that we were not properly supporting multiple reduction stmts in a loop, and not generating SMLADs for these cases. The alias analysis checks were done too early, making it too conservative. Differential revision: https://reviews.llvm.org/D49125 llvm-svn: 336795
* [ARM] Treat cmn immediates as legal in isLegalICmpImmediate.Eli Friedman2018-07-102-5/+5
| | | | | | | | | | | | | The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742
* Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.Nico Weber2018-07-061-98/+0
| | | | llvm-svn: 336453
* [ARM] ParallelDSP: added statistics, NFC.Sjoerd Meijer2018-07-0613-17/+18
| | | | | | | | | Added statistics for the number of SMLAD instructions created, and als renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441
* Commit rL336426 cause buildbot failuresDiogo N. Sampaio2018-07-061-3/+3
| | | | | | | | http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/ This removes the comments of the function label causing this error. llvm-svn: 336440
* [SelectionDAG] https://reviews.llvm.org/D48278Diogo N. Sampaio2018-07-061-0/+98
| | | | | | | | | | | | | | | | D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426
* [NEON] Fix combining of vldx_dup intrinsics with updating of base addressesIvan A. Kosarev2018-07-051-0/+43
| | | | | | | | | | | | | Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
* Partial revert of "NFC - Various typo fixes in tests"Mikael Holmen2018-07-051-10/+11
| | | | | | | | This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323
* [ARM] ParallelDSP: only support i16 loads for nowSjoerd Meijer2018-07-051-1/+46
| | | | | | | | | We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
* NFC - Various typo fixes in testsGabor Buella2018-07-046-19/+19
| | | | llvm-svn: 336268
OpenPOWER on IntegriCloud