| author | Michael Bedy <mjbedy@gmail.com> | 2018-03-30 05:03:36 +0000 |
|---|---|---|
| committer | Michael Bedy <mjbedy@gmail.com> | 2018-03-30 05:03:36 +0000 |
| commit | 59e5ef793c587cee64b8b10c4be48ec0aef0a126 (patch) | |
| tree | 3b312935127b05e89fe73ec4c1785f487cdc54f3 /llvm/test | |
| parent | a7d614c3e58400e9b053b3eaba969a7e77b26945 (diff) | |
[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.
Summary:
The SDWA peephole phase tries to fold an operation that extracts a portion
of a value into an SDWA src operand when the extracted value is used only
once. It was not prepared for that single use to be the preserved portion
of a value under dst_unused:UNUSED_PRESERVE, which resulted in a crash or
assertion failure.
This change either rejects the illegal SDWA attempt or, when the destination
is dst_sel:WORD_1 and the src_sel would be WORD_0, removes the unneeded
extract instruction.
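As a rough illustration of why the extract is unneeded in one case but not the other, the sketch below models the bit-level effect of the `v_mov_b32_sdwa ... dst_sel:WORD_1 dst_unused:UNUSED_PRESERVE src0_sel:WORD_0` instruction that the tests check for. The function name and the sample values are hypothetical and are not taken from the LLVM sources; this is a plain arithmetic model, not the pass itself.

```python
def mov_sdwa_word1_preserve(src0: int, preserved: int) -> int:
    """Hypothetical model of v_mov_b32_sdwa with dst_sel:WORD_1,
    dst_unused:UNUSED_PRESERVE and src0_sel:WORD_0.

    WORD_0 of src0 is written to the upper 16 bits of the result;
    the lower 16 bits are taken unchanged from the preserved (tied) value.
    """
    high = (src0 & 0xFFFF) << 16   # selected source word lands in WORD_1
    low = preserved & 0xFFFF       # WORD_0 of the tied operand is preserved
    return high | low


first = 0x12345678    # arbitrary sample for the first loaded dword
second = 0xABCDEF01   # arbitrary sample for the second loaded dword

# Mask 0xffff: the AND only clears bits the instruction overwrites anyway,
# so feeding the unmasked value gives the same result.
with_and = mov_sdwa_word1_preserve(second, first & 0xFFFF)
without_and = mov_sdwa_word1_preserve(second, first)
print(hex(with_and), hex(without_and))

# Mask 0xff: the AND changes bits inside the preserved word,
# so it cannot be dropped.
narrow = mov_sdwa_word1_preserve(second, first & 0xFF)
print(hex(narrow))
```

Under this model the 0xffff mask corresponds to the `sdwa_preserve_remove` test, where the AND disappears from the expected output, and the 0xff mask corresponds to `sdwa_preserve_keep`, where `v_and_b32_e32` must remain.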
Reviewers: arsenm, #amdgpu
Reviewed By: arsenm, #amdgpu
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D44364
llvm-svn: 328856
Diffstat (limited to 'llvm/test')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir | 88 |
1 files changed, 88 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir b/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
index 6c480d0dd69..7263baade02 100644
--- a/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
+++ b/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
@@ -54,3 +54,91 @@ body: |
     FLAT_STORE_DWORD %0, %13, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
     $sgpr30_sgpr31 = COPY %2
     S_SETPC_B64_return $sgpr30_sgpr31
+
+---
+# SDWA-LABEL: sdwa_preserve_keep
+# SDWA: flat_load_dword [[FIRST:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+# SDWA: flat_load_dword [[SECOND:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+
+# SDWA: v_and_b32_e32 [[AND:v[0-9]+]], 0xff, [[FIRST]]
+# SDWA: v_mov_b32_sdwa [[AND]], [[SECOND]] dst_sel:WORD_1 dst_unused:UNUSED_PRESERVE src0_sel:WORD_0
+
+# SDWA: flat_store_dword v[{{[0-9]+}}:{{[0-9]+}}], [[AND]]
+
+name: sdwa_preserve_keep
+tracksRegLiveness: true
+registers:
+  - { id: 0, class: vreg_64 }
+  - { id: 1, class: vreg_64 }
+  - { id: 2, class: sreg_64 }
+  - { id: 3, class: vgpr_32 }
+  - { id: 4, class: vgpr_32 }
+  - { id: 5, class: sreg_32_xm0_xexec }
+  - { id: 6, class: vgpr_32 }
+  - { id: 7, class: vgpr_32 }
+  - { id: 8, class: sreg_32_xm0 }
+  - { id: 9, class: vgpr_32 }
+  - { id: 10, class: sreg_32_xm0 }
+  - { id: 11, class: vgpr_32 }
+  - { id: 17, class: vgpr_32 }
+body: |
+  bb.0:
+    liveins: $vgpr0_vgpr1, $vgpr2_vgpr3, $sgpr30_sgpr31
+
+    %2 = COPY $sgpr30_sgpr31
+    %1 = COPY $vgpr2_vgpr3
+    %0 = COPY $vgpr0_vgpr1
+    %3 = FLAT_LOAD_DWORD %0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+    %4 = FLAT_LOAD_DWORD %1, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+
+    %9:vgpr_32 = V_LSHRREV_B16_e64 8, %3, implicit $exec
+    %10:sreg_32_xm0 = S_MOV_B32 255
+    %11:vgpr_32 = V_AND_B32_e64 %3, killed %10, implicit $exec
+    %17:vgpr_32 = V_MOV_B32_sdwa 0, %4, 0, 5, 2, 4, implicit $exec, implicit %11(tied-def 0)
+    FLAT_STORE_DWORD %0, %17, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
+    S_ENDPGM
+
+...
+---
+# SDWA-LABEL: sdwa_preserve_remove
+# SDWA: flat_load_dword [[FIRST:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+# SDWA: flat_load_dword [[SECOND:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+
+# SDWA: v_mov_b32_sdwa [[FIRST]], [[SECOND]] dst_sel:WORD_1 dst_unused:UNUSED_PRESERVE src0_sel:WORD_0
+
+# SDWA: flat_store_dword v[{{[0-9]+}}:{{[0-9]+}}], [[FIRST]]
+
+name: sdwa_preserve_remove
+tracksRegLiveness: true
+registers:
+  - { id: 0, class: vreg_64 }
+  - { id: 1, class: vreg_64 }
+  - { id: 2, class: sreg_64 }
+  - { id: 3, class: vgpr_32 }
+  - { id: 4, class: vgpr_32 }
+  - { id: 5, class: sreg_32_xm0_xexec }
+  - { id: 6, class: vgpr_32 }
+  - { id: 7, class: vgpr_32 }
+  - { id: 8, class: sreg_32_xm0 }
+  - { id: 9, class: vgpr_32 }
+  - { id: 10, class: sreg_32_xm0 }
+  - { id: 11, class: vgpr_32 }
+  - { id: 17, class: vgpr_32 }
+body: |
+  bb.0:
+    liveins: $vgpr0_vgpr1, $vgpr2_vgpr3, $sgpr30_sgpr31
+
+    %2 = COPY $sgpr30_sgpr31
+    %1 = COPY $vgpr2_vgpr3
+    %0 = COPY $vgpr0_vgpr1
+    %3 = FLAT_LOAD_DWORD %0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+    %4 = FLAT_LOAD_DWORD %1, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+
+    %9:vgpr_32 = V_LSHRREV_B16_e64 8, %3, implicit $exec
+    %10:sreg_32_xm0 = S_MOV_B32 65535
+    %11:vgpr_32 = V_AND_B32_e64 %3, killed %10, implicit $exec
+    %17:vgpr_32 = V_MOV_B32_sdwa 0, %4, 0, 5, 2, 4, implicit $exec, implicit %11(tied-def 0)
+    FLAT_STORE_DWORD %0, %17, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
+    S_ENDPGM
+
+...

