diff options
author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-04-28 12:22:53 +0000 |
---|---|---|
committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-04-28 12:22:53 +0000 |
commit | bd4a3be7d29e83d6b35c588abf485ee707402275 (patch) | |
tree | fea83449dc8dd5d6318aaa48567f0648f20967f8 /llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | |
parent | 1e73ef38827800c7d2de8ea3f75c093f956f50ad (diff) | |
download | bcm5719-llvm-bd4a3be7d29e83d6b35c588abf485ee707402275.tar.gz bcm5719-llvm-bd4a3be7d29e83d6b35c588abf485ee707402275.zip |
[InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.
This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.
Differential Revision: http://reviews.llvm.org/D19614
llvm-svn: 267873
Diffstat (limited to 'llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp')
-rw-r--r-- | llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp index 3930cc010a6..3c0a28c4b7f 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp @@ -768,6 +768,28 @@ Value *InstCombiner::SimplifyDemandedUseBits(Value *V, APInt DemandedMask, // TODO: Could compute known zero/one bits based on the input. break; } + case Intrinsic::x86_sse_movmsk_ps: + case Intrinsic::x86_sse2_movmsk_pd: + case Intrinsic::x86_sse2_pmovmskb_128: + case Intrinsic::x86_avx_movmsk_ps_256: + case Intrinsic::x86_avx_movmsk_pd_256: + case Intrinsic::x86_avx2_pmovmskb: { + // MOVMSK copies the vector elements' sign bits to the low bits + // and zeros the high bits. + auto Arg = II->getArgOperand(0); + auto ArgType = cast<VectorType>(Arg->getType()); + unsigned ArgWidth = ArgType->getNumElements(); + + // If we don't need any of low bits then return zero, + // we know that DemandedMask is non-zero already. + APInt DemandedElts = DemandedMask.zextOrTrunc(ArgWidth); + if (DemandedElts == 0) + return ConstantInt::getNullValue(VTy); + + // We know that the upper bits are set to zero. + KnownZero = APInt::getHighBitsSet(BitWidth, BitWidth - ArgWidth); + return nullptr; + } case Intrinsic::x86_sse42_crc32_64_64: KnownZero = APInt::getHighBitsSet(64, 32); return nullptr; |