| author | Matt Arsenault <Matthew.Arsenault@amd.com> | 2016-12-09 06:19:12 +0000 |
|---|---|---|
| committer | Matt Arsenault <Matthew.Arsenault@amd.com> | 2016-12-09 06:19:12 +0000 |
| commit | 27c062932a8c3b44fe5d4c4fdbc0310cc32b61c6 (patch) | |
| tree | 88180af32faea184738d4ee76428f3b7a0014fe5 /llvm/lib/Target/AMDGPU/VOP2Instructions.td | |
| parent | 2c84f908afc851a1ba83c6a8471a62db81475ef8 (diff) | |
AMDGPU: Select i16 instructions to VOP3 forms

These were selecting directly to the VOP2 form instead
of VOP3 like the i32 instructions. Fixes regressions in
future commits where an immediate isn't folded because it was
initially used for the second operand.

Because uniform 16-bit operations are promoted to i32, it's
difficult to get a simple testcase where this matters. Fold
failures in SIFoldOperands here tend to be hidden by commute
and fold in SIShrinkInstructions.

llvm-svn: 289189
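
The folding problem described in the commit message can be sketched with a toy model. This is not LLVM code: `can_fold_imm` and the encoding strings are hypothetical names chosen for illustration. It assumes the usual VI operand rules, where the VOP2 (`_e32`) encoding accepts a constant only in src0 and requires a VGPR in src1, while the VOP3 (`_e64`) encoding accepts an integer inline constant (-16..64) in either source but no 32-bit literal.

```python
# Toy model of immediate folding legality; hypothetical names, not LLVM's API.
# Assumed rules (VI): VOP2/e32 takes a constant only in src0;
# VOP3/e64 takes an inline constant in any source, but no 32-bit literal.

INLINE_INT_RANGE = range(-16, 65)  # integer inline constants: -16..64 inclusive


def can_fold_imm(encoding: str, operand_index: int, imm: int) -> bool:
    """Return True if `imm` may directly replace source operand `operand_index`."""
    if encoding == "e32":
        # VOP2: src1 must stay a VGPR, so only src0 can hold the constant.
        return operand_index == 0
    if encoding == "e64":
        # VOP3: either source may hold an inline constant.
        return imm in INLINE_INT_RANGE
    raise ValueError(f"unknown encoding: {encoding}")


# An immediate that starts out as the second operand:
print(can_fold_imm("e32", 1, 4))  # VOP2 form: needs a commute first
print(can_fold_imm("e64", 1, 4))  # VOP3 form: folds in place
```

Under these assumptions, selecting the `_e64` form first lets the fold succeed regardless of which operand holds the immediate, and the instruction can still be shrunk to the `_e32` form afterwards when the operands permit.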
Diffstat (limited to 'llvm/lib/Target/AMDGPU/VOP2Instructions.td')
| -rw-r--r-- | llvm/lib/Target/AMDGPU/VOP2Instructions.td | 20 |
1 files changed, 10 insertions, 10 deletions
diff --git a/llvm/lib/Target/AMDGPU/VOP2Instructions.td b/llvm/lib/Target/AMDGPU/VOP2Instructions.td
index 348ebe1e0d7..0e87f90b62b 100644
--- a/llvm/lib/Target/AMDGPU/VOP2Instructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP2Instructions.td
@@ -416,27 +416,27 @@ class ZExt_i16_i1_Pat <SDNode ext> : Pat <
 
 let Predicates = [isVI] in {
 
-defm : Arithmetic_i16_Pats<add, V_ADD_U16_e32>;
-defm : Arithmetic_i16_Pats<mul, V_MUL_LO_U16_e32>;
-defm : Arithmetic_i16_Pats<sub, V_SUB_U16_e32>;
-defm : Arithmetic_i16_Pats<smin, V_MIN_I16_e32>;
-defm : Arithmetic_i16_Pats<smax, V_MAX_I16_e32>;
-defm : Arithmetic_i16_Pats<umin, V_MIN_U16_e32>;
-defm : Arithmetic_i16_Pats<umax, V_MAX_U16_e32>;
+defm : Arithmetic_i16_Pats<add, V_ADD_U16_e64>;
+defm : Arithmetic_i16_Pats<mul, V_MUL_LO_U16_e64>;
+defm : Arithmetic_i16_Pats<sub, V_SUB_U16_e64>;
+defm : Arithmetic_i16_Pats<smin, V_MIN_I16_e64>;
+defm : Arithmetic_i16_Pats<smax, V_MAX_I16_e64>;
+defm : Arithmetic_i16_Pats<umin, V_MIN_U16_e64>;
+defm : Arithmetic_i16_Pats<umax, V_MAX_U16_e64>;
 
 def : Pat <
   (and i16:$src0, i16:$src1),
-  (V_AND_B32_e32 $src0, $src1)
+  (V_AND_B32_e64 $src0, $src1)
 >;
 
 def : Pat <
   (or i16:$src0, i16:$src1),
-  (V_OR_B32_e32 $src0, $src1)
+  (V_OR_B32_e64 $src0, $src1)
 >;
 
 def : Pat <
   (xor i16:$src0, i16:$src1),
-  (V_XOR_B32_e32 $src0, $src1)
+  (V_XOR_B32_e64 $src0, $src1)
 >;
 
 defm : Bits_OpsRev_i16_Pats<shl, V_LSHLREV_B16_e32>;

