diff options
| author | Elena Demikhovsky <elena.demikhovsky@intel.com> | 2016-02-25 07:05:12 +0000 |
|---|---|---|
| committer | Elena Demikhovsky <elena.demikhovsky@intel.com> | 2016-02-25 07:05:12 +0000 |
| commit | e5bbca6ae2946c47c407f27b02aab5b5cfb0ecd7 (patch) | |
| tree | 66fadcce139df6398ff351898915e687cf62bdb4 /llvm/lib | |
| parent | 26e077178de8278f120810970370938f77f91fd4 (diff) | |
| download | bcm5719-llvm-e5bbca6ae2946c47c407f27b02aab5b5cfb0ecd7.tar.gz bcm5719-llvm-e5bbca6ae2946c47c407f27b02aab5b5cfb0ecd7.zip | |
Optimized loading (zextload) of i1 value from memory.
This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793.
Extra "and" causes performance degradation.
We assume that i1 is stored in zero-extended form. And store operation is responsible for zeroing upper bits.
Differential Revision: http://reviews.llvm.org/D17541
llvm-svn: 261828
Diffstat (limited to 'llvm/lib')
| -rw-r--r-- | llvm/lib/Target/X86/X86InstrAVX512.td | 8 | ||||
| -rw-r--r-- | llvm/lib/Target/X86/X86InstrCompiler.td | 11 |
2 files changed, 6 insertions, 13 deletions
diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td b/llvm/lib/Target/X86/X86InstrAVX512.td index 12e8887b629..c3142ccfec6 100644 --- a/llvm/lib/Target/X86/X86InstrAVX512.td +++ b/llvm/lib/Target/X86/X86InstrAVX512.td @@ -2193,14 +2193,6 @@ def : Pat<(store (i1 -1), addr:$dst), (MOV8mi addr:$dst, (i8 1))>; def : Pat<(store (i1 1), addr:$dst), (MOV8mi addr:$dst, (i8 1))>; def : Pat<(store (i1 0), addr:$dst), (MOV8mi addr:$dst, (i8 0))>; -def truncstorei1 : PatFrag<(ops node:$val, node:$ptr), - (truncstore node:$val, node:$ptr), [{ - return cast<StoreSDNode>(N)->getMemoryVT() == MVT::i1; -}]>; - -def : Pat<(truncstorei1 GR8:$src, addr:$dst), - (MOV8mr addr:$dst, GR8:$src)>; - // With AVX-512 only, 8-bit mask is promoted to 16-bit mask. let Predicates = [HasAVX512, NoDQI] in { // GR from/to 8-bit mask without native support diff --git a/llvm/lib/Target/X86/X86InstrCompiler.td b/llvm/lib/Target/X86/X86InstrCompiler.td index c709c8aca9f..6e63694a739 100644 --- a/llvm/lib/Target/X86/X86InstrCompiler.td +++ b/llvm/lib/Target/X86/X86InstrCompiler.td @@ -1139,12 +1139,13 @@ defm : CMOVmr<X86_COND_O , CMOVNO16rm, CMOVNO32rm, CMOVNO64rm>; defm : CMOVmr<X86_COND_NO, CMOVO16rm , CMOVO32rm , CMOVO64rm>; // zextload bool -> zextload byte -def : Pat<(zextloadi8i1 addr:$src), (AND8ri (MOV8rm addr:$src), (i8 1))>; -def : Pat<(zextloadi16i1 addr:$src), (AND16ri8 (MOVZX16rm8 addr:$src), (i16 1))>; -def : Pat<(zextloadi32i1 addr:$src), (AND32ri8 (MOVZX32rm8 addr:$src), (i32 1))>; +// i1 stored in one byte in zero-extended form. +// Upper bits cleanup should be executed before Store. +def : Pat<(zextloadi8i1 addr:$src), (MOV8rm addr:$src)>; +def : Pat<(zextloadi16i1 addr:$src), (MOVZX16rm8 addr:$src)>; +def : Pat<(zextloadi32i1 addr:$src), (MOVZX32rm8 addr:$src)>; def : Pat<(zextloadi64i1 addr:$src), - (SUBREG_TO_REG (i64 0), - (AND32ri8 (MOVZX32rm8 addr:$src), (i32 1)), sub_32bit)>; + (SUBREG_TO_REG (i64 0), (MOVZX32rm8 addr:$src), sub_32bit)>; // extload bool -> extload byte // When extloading from 16-bit and smaller memory locations into 64-bit |

