[X86] Don't report gather is legal on Skylake CPUs when AVX2/AVX512 is disabled. Allow gather on SKX/CNL/ICL when AVX512 is disabled by using AVX2 instructions.

Summary:
This adds a new fast gather feature bit to cover all CPUs that support fast gather that we can use independent of whether the AVX512 feature is enabled. I'm only using this new bit to qualify AVX2 codegen. AVX512 is still implicitly assuming fast gather to keep tests working and to match the scatter behavior. Test command lines have been added for these two cases.

Reviewers: magabari, delena, RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40282

llvm-svn: 318983
author: Craig Topper <craig.topper@intel.com> 2017-11-25 18:09:37 +0000
committer: Craig Topper <craig.topper@intel.com> 2017-11-25 18:09:37 +0000
commit: ea37e201ec2f9c3d8b2c9bb37ff48cacdd992f55 (patch)
tree: 6bebfa5efe92abba04700e0a8ba456b2e9e7def6 /llvm/lib/Target/X86/X86Subtarget.cpp
parent: e7426556c16d86d52356b23538708dbea2008a76 (diff)
download: bcm5719-llvm-ea37e201ec2f9c3d8b2c9bb37ff48cacdd992f55.tar.gz
bcm5719-llvm-ea37e201ec2f9c3d8b2c9bb37ff48cacdd992f55.zip
1 files changed, 6 insertions, 6 deletions
diff --git a/llvm/lib/Target/X86/X86Subtarget.cpp b/llvm/lib/Target/X86/X86Subtarget.cpp
index 8543d189cdb..0f995404618 100644
--- a/llvm/lib/Target/X86/X86Subtarget.cpp
+++ b/llvm/lib/Target/X86/X86Subtarget.cpp
@@ -270,14 +270,13 @@ void X86Subtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
            isTargetKFreeBSD() || In64BitMode)
     stackAlignment = 16;
 
-  // Gather is available since Haswell (AVX2 set). So technically, we can
-  // generate Gathers on all AVX2 processors. But the overhead on HSW is high.
-  // Skylake Client processor has faster Gathers than HSW and performance is
-  // similar to Skylake Server (AVX-512). The specified overhead is relative to
-  // the Load operation. "2" is the number provided by Intel architects. This
+  // Some CPUs have more overhead for gather. The specified overhead is relative
+  // to the Load operation. "2" is the number provided by Intel architects. This
   // parameter is used for cost estimation of Gather Op and comparison with
   // other alternatives.
-  if (X86ProcFamily == IntelSkylake || hasAVX512())
+  // TODO: Remove the explicit hasAVX512()?, That would mean we would only
+  // enable gather with a -march.
+  if (hasAVX512() || (hasAVX2() && hasFastGather()))
     GatherOverhead = 2;
   if (hasAVX512())
     ScatterOverhead = 2;
@@ -345,6 +344,7 @@ void X86Subtarget::initializeEnvironment() {
   HasCmpxchg16b = false;
   UseLeaForSP = false;
   HasFastPartialYMMorZMMWrite = false;
+  HasFastGather = false;
   HasFastScalarFSQRT = false;
   HasFastVectorFSQRT = false;
   HasFastLZCNT = false;
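
The effect of the new condition can be illustrated with a small standalone sketch. This is not the LLVM code: `FeatureBits`, `Overheads`, `computeOverheads`, and `kDefaultOverhead` are hypothetical names made up for illustration, and the default value of 1024 is an assumption, since the initial values of `GatherOverhead` and `ScatterOverhead` are not shown in this diff. Only the two `if` conditions are taken from the patch.

```cpp
// Hypothetical stand-in for the subtarget feature bits involved in this patch.
struct FeatureBits {
  bool HasAVX2 = false;
  bool HasAVX512 = false;
  bool HasFastGather = false;
};

// Hypothetical result type: overheads are relative to a plain load.
struct Overheads {
  unsigned Gather;
  unsigned Scatter;
};

// Assumption: placeholder for the "expensive" default; not taken from the diff.
constexpr unsigned kDefaultOverhead = 1024;

// Mirrors the patched conditions: AVX512 implies cheap gather and scatter,
// while AVX2 gets cheap gather only when the fast gather bit is also set.
constexpr Overheads computeOverheads(FeatureBits F) {
  Overheads O{kDefaultOverhead, kDefaultOverhead};
  if (F.HasAVX512 || (F.HasAVX2 && F.HasFastGather))
    O.Gather = 2;
  if (F.HasAVX512)
    O.Scatter = 2;
  return O;
}

// Convenience constructor for the examples, in (AVX2, AVX512, fast gather) order.
constexpr FeatureBits makeBits(bool AVX2, bool AVX512, bool FastGather) {
  FeatureBits F;
  F.HasAVX2 = AVX2;
  F.HasAVX512 = AVX512;
  F.HasFastGather = FastGather;
  return F;
}
```

Under these assumptions, a CPU with the fast gather bit but with AVX2 disabled keeps the default gather overhead, which is the Skylake case named in the commit title, and a CPU with AVX2 and fast gather but AVX512 disabled gets the low gather overhead while scatter stays at the default.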