diff options
author | Charlie Turner <charlie.turner@arm.com> | 2015-11-17 17:25:15 +0000 |
---|---|---|
committer | Charlie Turner <charlie.turner@arm.com> | 2015-11-17 17:25:15 +0000 |
commit | 7968b981bf094f7bb2942a8389de4bf496676cc4 (patch) | |
tree | 2ab42375471481e475a82934e37160e0a6c38e0c /llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | |
parent | 431e1143ec79b0f05a98a32ad028a1e26e1ecd2e (diff) | |
download | bcm5719-llvm-7968b981bf094f7bb2942a8389de4bf496676cc4.tar.gz bcm5719-llvm-7968b981bf094f7bb2942a8389de4bf496676cc4.zip |
[ARM] Don't pessimize i32 vselect.
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.
I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.
From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%
I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change
Differential Revision: http://reviews.llvm.org/D14743
llvm-svn: 253349
Diffstat (limited to 'llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp')
-rw-r--r-- | llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | 3 |
1 files changed, 0 insertions, 3 deletions
diff --git a/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp b/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp index 45a45a7013c..582a057e923 100644 --- a/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp +++ b/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp @@ -274,9 +274,6 @@ int ARMTTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy, Type *CondTy) { if (ST->hasNEON() && ValTy->isVectorTy() && ISD == ISD::SELECT) { // Lowering of some vector selects is currently far from perfect. static const TypeConversionCostTblEntry NEONVectorSelectTbl[] = { - { ISD::SELECT, MVT::v16i1, MVT::v16i16, 2*16 + 1 + 3*1 + 4*1 }, - { ISD::SELECT, MVT::v8i1, MVT::v8i32, 4*8 + 1*3 + 1*4 + 1*2 }, - { ISD::SELECT, MVT::v16i1, MVT::v16i32, 4*16 + 1*6 + 1*8 + 1*4 }, { ISD::SELECT, MVT::v4i1, MVT::v4i64, 4*4 + 1*2 + 1 }, { ISD::SELECT, MVT::v8i1, MVT::v8i64, 50 }, { ISD::SELECT, MVT::v16i1, MVT::v16i64, 100 } |