summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/Thumb2/float-intrinsics-float.ll
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectorsDavid Green2020-01-051-1/+1
| | | | | | | | | | | This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing the target independent code to handle more folds in more situations (for example if the fast math flags are present, but the global AllowFPOpFusion option isnt). It also splits apart the HasSlowFPVMLx into HasSlowFPVFMx, to allow VFMA and VMLA to be controlled separately if needed. Differential Revision: https://reviews.llvm.org/D72139
* [ARM] Replace fp-only-sp and d16 with fp64 and d32.Simon Tatham2019-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list *before* rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845
* [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33Sjoerd Meijer2018-09-241-3/+2
| | | | | | | | | | | | A sequence of VMUL and VADD instructions always give the same or better performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33. Executing the VMUL and VADD back-to-back requires the same cycles, but having separate instructions allows scheduling to avoid the hazard between these 2 instructions. Differential Revision: https://reviews.llvm.org/D52289 llvm-svn: 342874
* ARM: match GCC's behaviour for builtinsSaleem Abdulrasool2017-01-131-1/+1
| | | | | | | | | | | | GCC changes the CC between the user-code and the builtins based on the value of `-target` rather than `-mfloat-abi`. When a HF target is used, the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant is used. In all cases, the AEABI functions use the AAPCS CC. Adjust the calling convention based on the target. Resolves PR30543! llvm-svn: 291909
* CodeGen: ensure that libcalls are always AAPCS CCSaleem Abdulrasool2016-09-071-1/+1
| | | | | | | The original commit was too aggressive about marking LibCalls as AAPCS. The libcalls contain libc/libm/libunwind calls which are not AAPCS, but C. llvm-svn: 280833
* Revert "CodeGen: ensure that libcalls are always AAPCS CC"Saleem Abdulrasool2016-09-071-58/+57
| | | | | | | This reverts SVN r280683. Revert until I figure out why this is breaking lli tests. llvm-svn: 280778
* CodeGen: ensure that libcalls are always AAPCS CCSaleem Abdulrasool2016-09-061-57/+58
| | | | | | | | | | | | | All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM AAPCS VFP CC hosts. Tweak the default initialisation to ARM AAPCS CC rather than C CC for ARM/thumb targets. The changes to the tests are necessary to ensure that the calling convention for the lowered library calls are honoured. Furthermore, these adjustments cause certain branch invocations to change to branch-and-link since the returned value needs to be moved across registers (d0 -> r0, r1). llvm-svn: 280683
* [ARM] Use correct half-precision functions in EABI modeOliver Stannard2015-10-071-2/+2
| | | | | | | | | The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565
* [ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5Oliver Stannard2014-10-011-16/+23
| | | | | | | | | | Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions when targeting ARMv8, but they are actually present on any target with FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an M-profile core, but they have the same instructions so we model them both as FPARMv8 in the ARM backend. llvm-svn: 218763
* [ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)Oliver Stannard2014-10-011-5/+9
| | | | | | | | | The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be modelled using the same target feature, and all double-precision operations are already disabled by the fp-only-sp target features. llvm-svn: 218747
* [ARM] Enable DP copy, load and store instructions for FPv4-SPOliver Stannard2014-08-211-0/+210
The FPv4-SP floating-point unit is generally referred to as single-precision only, but it does have double-precision registers and load, store and GPR<->DPR move instructions which operate on them. This patch enables the use of these registers, the main advantage of which is that we now comply with the AAPCS-VFP calling convention. This partially reverts r209650, which added some AAPCS-VFP support, but did not handle return values or alignment of double arguments in registers. This patch also adds tests for Thumb2 code generation for floating-point instructions and intrinsics, which previously only existed for ARM. llvm-svn: 216172
OpenPOWER on IntegriCloud