| author | Craig Topper <craig.topper@intel.com> | 2019-08-20 20:20:04 +0000 |
|---|---|---|
| committer | Craig Topper <craig.topper@intel.com> | 2019-08-20 20:20:04 +0000 |
| commit | 3a2b08e6c90c10fee341f3be257397dbb6034ecb (patch) | |
| tree | fab83a46260132af4a7e58e3a3f5d16fee014c2e /llvm/lib/Target | |
| parent | 8f5e1755ca385566c0352a9bd292218cebfd3d0b (diff) | |
[X86] Add a DAG combine to transform (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) -> (i8 (trunc (i16 (bitcast (v16i1 X))))) on KNL target
Without AVX512DQ we don't have KMOVB, so we can't copy just 8 bits of a k-register to a GPR; we have to copy 16 bits instead. We do this even if the DAG copy is from v8i1->v16i1. If we detect (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))), we should rewrite the types to match the copy we do support. Doing so helps known bits propagate without losing the upper 8 bits of the input to the extract_subvector. This allows some zero extends to be removed, since we have an isel pattern that uses kmovw for (zero_extend (i16 (bitcast (v16i1 X)))).
Differential Revision: https://reviews.llvm.org/D66489
llvm-svn: 369434
Diffstat (limited to 'llvm/lib/Target')
| -rw-r--r-- | llvm/lib/Target/X86/X86ISelLowering.cpp | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 88a3b27aa98..ea4fe8c6712 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35402,6 +35402,18 @@ static SDValue combineBitcast(SDNode *N, SelectionDAG &DAG,
     }
   }
 
+  // Look for (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) and
+  // replace with (i8 (trunc (i16 (bitcast (v16i1 X))))). This can occur
+  // due to insert_subvector legalization on KNL. By promoting the copy to i16
+  // we can help with known bits propagation from the vXi1 domain to the
+  // scalar domain.
+  if (VT == MVT::i8 && SrcVT == MVT::v8i1 && Subtarget.hasAVX512() &&
+      !Subtarget.hasDQI() && N0.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+      N0.getOperand(0).getValueType() == MVT::v16i1 &&
+      isNullConstant(N0.getOperand(1)))
+    return DAG.getNode(ISD::TRUNCATE, SDLoc(N), VT,
+                       DAG.getBitcast(MVT::i16, N0.getOperand(0)));
+
   // Since MMX types are special and don't usually play with other vector types,
   // it's better to handle them early to be sure we emit efficient code by
   // avoiding store-load conversions.
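
The combine is only sound if both DAG forms produce the same i8 value. The following is a minimal standalone sketch of that equivalence, not LLVM code: it models a v16i1 mask as a plain 16-bit integer and assumes lane i maps to bit i (the little-endian layout used on x86). The names `Mask16`, `extractLow8ThenBitcast`, `bitcast16ThenTrunc`, and `formsAgree` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical scalar model of a v16i1 mask: bit i holds lane i
// (an assumption matching the little-endian layout on x86).
using Mask16 = std::uint16_t;

// Models (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))):
// pick lanes 0..7 one by one and pack them into a byte.
std::uint8_t extractLow8ThenBitcast(Mask16 x) {
  std::uint8_t out = 0;
  for (int lane = 0; lane < 8; ++lane)
    if ((x >> lane) & 1u)
      out = static_cast<std::uint8_t>(out | (1u << lane));
  return out;
}

// Models (i8 (trunc (i16 (bitcast (v16i1 X))))):
// view all 16 lanes as an i16, then drop the upper 8 bits.
std::uint8_t bitcast16ThenTrunc(Mask16 x) {
  return static_cast<std::uint8_t>(x);
}

// Exhaustively compares the two forms over every 16-bit mask value.
bool formsAgree() {
  for (std::uint32_t v = 0; v <= 0xFFFFu; ++v) {
    Mask16 m = static_cast<Mask16>(v);
    if (extractLow8ThenBitcast(m) != bitcast16ThenTrunc(m))
      return false;
  }
  return true;
}
```

In the second form the full 16-bit value exists as an intermediate before truncation, which is the property the commit message relies on for known-bits propagation and for matching the kmovw zero_extend pattern.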

