diff options
| author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2017-09-22 09:50:52 +0000 |
|---|---|---|
| committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2017-09-22 09:50:52 +0000 |
| commit | 2b1c3bb25daddd3d34a85d021f1b29a91dc932e5 (patch) | |
| tree | 36fa731396db565f32bc29ec92c79975b2cb12e7 /llvm/lib/Target/ARM | |
| parent | 489604cd1196dd899713cb31c4b7c0f1b546c56b (diff) | |
| download | bcm5719-llvm-2b1c3bb25daddd3d34a85d021f1b29a91dc932e5.tar.gz bcm5719-llvm-2b1c3bb25daddd3d34a85d021f1b29a91dc932e5.zip | |
[ARM] Add missing selection patterns for vnmla
For the following function:
double fn1(double d0, double d1, double d2) {
double a = -d0 - d1 * d2;
return a;
}
on ARM, LLVM generates code along the lines of
vneg.f64 d0, d0
vmls.f64 d0, d1, d2
i.e., a negate and a multiply-subtract.
The attached patch adds instruction selection patterns to allow it to generate the single instruction
vnmla.f64 d0, d1, d2
(multiply-add with negation) instead, like GCC does.
Committed on behalf of @gergo- (Gergö Barany)
Differential Revision: https://reviews.llvm.org/D35911
llvm-svn: 313972
Diffstat (limited to 'llvm/lib/Target/ARM')
| -rw-r--r-- | llvm/lib/Target/ARM/ARMInstrVFP.td | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/llvm/lib/Target/ARM/ARMInstrVFP.td b/llvm/lib/Target/ARM/ARMInstrVFP.td index 873fe5e6e0f..b43216fb5c5 100644 --- a/llvm/lib/Target/ARM/ARMInstrVFP.td +++ b/llvm/lib/Target/ARM/ARMInstrVFP.td @@ -1857,6 +1857,7 @@ def VNMLAH : AHbI<0b11100, 0b01, 1, 0, RegConstraint<"$Sdin = $Sd">, Requires<[HasFullFP16,UseFPVMLx,DontUseFusedMAC]>; +// (-(a * b) - dst) -> -(dst + (a * b)) def : Pat<(fsub_mlx (fneg (fmul_su DPR:$a, (f64 DPR:$b))), DPR:$dstin), (VNMLAD DPR:$dstin, DPR:$a, DPR:$b)>, Requires<[HasVFP2,HasDPVFP,UseFPVMLx,DontUseFusedMAC]>; @@ -1864,6 +1865,14 @@ def : Pat<(fsub_mlx (fneg (fmul_su SPR:$a, SPR:$b)), SPR:$dstin), (VNMLAS SPR:$dstin, SPR:$a, SPR:$b)>, Requires<[HasVFP2,DontUseNEONForFP,UseFPVMLx,DontUseFusedMAC]>; +// (-dst - (a * b)) -> -(dst + (a * b)) +def : Pat<(fsub_mlx (fneg DPR:$dstin), (fmul_su DPR:$a, (f64 DPR:$b))), + (VNMLAD DPR:$dstin, DPR:$a, DPR:$b)>, + Requires<[HasVFP2,HasDPVFP,UseFPVMLx,DontUseFusedMAC]>; +def : Pat<(fsub_mlx (fneg SPR:$dstin), (fmul_su SPR:$a, SPR:$b)), + (VNMLAS SPR:$dstin, SPR:$a, SPR:$b)>, + Requires<[HasVFP2,DontUseNEONForFP,UseFPVMLx,DontUseFusedMAC]>; + def VNMLSD : ADbI<0b11100, 0b01, 0, 0, (outs DPR:$Dd), (ins DPR:$Ddin, DPR:$Dn, DPR:$Dm), IIC_fpMAC64, "vnmls", ".f64\t$Dd, $Dn, $Dm", |

