author: Michael Bedy <mjbedy@gmail.com> 2018-03-30 05:03:36 +0000
committer: Michael Bedy <mjbedy@gmail.com> 2018-03-30 05:03:36 +0000
commit: 59e5ef793c587cee64b8b10c4be48ec0aef0a126
tree: 3b312935127b05e89fe73ec4c1785f487cdc54f3 /llvm/test
parent: a7d614c3e58400e9b053b3eaba969a7e77b26945
[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.
Summary:
The phase attempts to transform operations that extract a portion of a value into an SDWA src operand in cases where that value is used only once. It was not prepared for this use to be the preserved portion of a value for dst:UNUSED_PRESERVE, resulting in a crash or assert.

This change either rejects the illegal SDWA attempt, or in the case where dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded extract instruction.

Reviewers: arsenm, #amdgpu

Reviewed By: arsenm, #amdgpu

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44364

llvm-svn: 328856
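
To illustrate why the extract is redundant in one case and required in the other, here is a minimal bit-level sketch of the two test scenarios. It is not code from the pass. The helper names (`sdwa_mov_word1_preserve`, `with_extract`, `without_extract`) are hypothetical, and the model assumes that `dst_sel:WORD_1` with `dst_unused:UNUSED_PRESERVE` writes the upper 16 bits of the destination while keeping the lower 16 bits of the tied operand.

```python
MASK32 = 0xFFFFFFFF


def sdwa_mov_word1_preserve(preserved: int, src: int) -> int:
    """Assumed model of v_mov_b32_sdwa with dst_sel:WORD_1,
    dst_unused:UNUSED_PRESERVE and src0_sel:WORD_0."""
    low = preserved & 0x0000FFFF   # WORD_0 of the tied operand is preserved
    high = (src & 0xFFFF) << 16    # WORD_0 of src is written into WORD_1 of dst
    return (high | low) & MASK32


def with_extract(first: int, second: int, mask: int) -> int:
    # Mirrors the input MIR: the preserved operand is (first & mask).
    return sdwa_mov_word1_preserve(first & mask, second)


def without_extract(first: int, second: int) -> int:
    # Mirrors the output when the AND is dropped and `first` is tied directly.
    return sdwa_mov_word1_preserve(first, second)


first, second = 0x12345678, 0xAABBCCDD

# Mask 0xFFFF (sdwa_preserve_remove): the AND selects exactly the preserved
# word, so dropping it does not change the result.
same = with_extract(first, second, 0xFFFF) == without_extract(first, second)

# Mask 0xFF (sdwa_preserve_keep): the AND clears byte 1 of the preserved
# word, so dropping it would change the result.
differs = with_extract(first, second, 0xFF) != without_extract(first, second)

print(same, differs)
```

Under this model the 65535 mask is a no-op on the preserved half, which matches the `sdwa_preserve_remove` check lines in the test below, where `[[FIRST]]` is tied directly. The 255 mask is not a no-op, which matches `sdwa_preserve_keep`, where `v_and_b32_e32` is retained.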
Diffstat (limited to 'llvm/test')
-rw-r--r-- llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir | 88
1 files changed, 88 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir b/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
index 6c480d0dd69..7263baade02 100644
--- a/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
+++ b/llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
@@ -54,3 +54,91 @@ body: |
FLAT_STORE_DWORD %0, %13, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
$sgpr30_sgpr31 = COPY %2
S_SETPC_B64_return $sgpr30_sgpr31
+
+---
+# SDWA-LABEL: sdwa_preserve_keep
+# SDWA: flat_load_dword [[FIRST:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+# SDWA: flat_load_dword [[SECOND:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+
+# SDWA: v_and_b32_e32 [[AND:v[0-9]+]], 0xff, [[FIRST]]
+# SDWA: v_mov_b32_sdwa [[AND]], [[SECOND]] dst_sel:WORD_1 dst_unused:UNUSED_PRESERVE src0_sel:WORD_0
+
+# SDWA: flat_store_dword v[{{[0-9]+}}:{{[0-9]+}}], [[AND]]
+
+name: sdwa_preserve_keep
+tracksRegLiveness: true
+registers:
+ - { id: 0, class: vreg_64 }
+ - { id: 1, class: vreg_64 }
+ - { id: 2, class: sreg_64 }
+ - { id: 3, class: vgpr_32 }
+ - { id: 4, class: vgpr_32 }
+ - { id: 5, class: sreg_32_xm0_xexec }
+ - { id: 6, class: vgpr_32 }
+ - { id: 7, class: vgpr_32 }
+ - { id: 8, class: sreg_32_xm0 }
+ - { id: 9, class: vgpr_32 }
+ - { id: 10, class: sreg_32_xm0 }
+ - { id: 11, class: vgpr_32 }
+ - { id: 17, class: vgpr_32 }
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1, $vgpr2_vgpr3, $sgpr30_sgpr31
+
+ %2 = COPY $sgpr30_sgpr31
+ %1 = COPY $vgpr2_vgpr3
+ %0 = COPY $vgpr0_vgpr1
+ %3 = FLAT_LOAD_DWORD %0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+ %4 = FLAT_LOAD_DWORD %1, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+
+ %9:vgpr_32 = V_LSHRREV_B16_e64 8, %3, implicit $exec
+ %10:sreg_32_xm0 = S_MOV_B32 255
+ %11:vgpr_32 = V_AND_B32_e64 %3, killed %10, implicit $exec
+ %17:vgpr_32 = V_MOV_B32_sdwa 0, %4, 0, 5, 2, 4, implicit $exec, implicit %11(tied-def 0)
+ FLAT_STORE_DWORD %0, %17, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
+ S_ENDPGM
+
+...
+---
+# SDWA-LABEL: sdwa_preserve_remove
+# SDWA: flat_load_dword [[FIRST:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+# SDWA: flat_load_dword [[SECOND:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}]
+
+# SDWA: v_mov_b32_sdwa [[FIRST]], [[SECOND]] dst_sel:WORD_1 dst_unused:UNUSED_PRESERVE src0_sel:WORD_0
+
+# SDWA: flat_store_dword v[{{[0-9]+}}:{{[0-9]+}}], [[FIRST]]
+
+name: sdwa_preserve_remove
+tracksRegLiveness: true
+registers:
+ - { id: 0, class: vreg_64 }
+ - { id: 1, class: vreg_64 }
+ - { id: 2, class: sreg_64 }
+ - { id: 3, class: vgpr_32 }
+ - { id: 4, class: vgpr_32 }
+ - { id: 5, class: sreg_32_xm0_xexec }
+ - { id: 6, class: vgpr_32 }
+ - { id: 7, class: vgpr_32 }
+ - { id: 8, class: sreg_32_xm0 }
+ - { id: 9, class: vgpr_32 }
+ - { id: 10, class: sreg_32_xm0 }
+ - { id: 11, class: vgpr_32 }
+ - { id: 17, class: vgpr_32 }
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1, $vgpr2_vgpr3, $sgpr30_sgpr31
+
+ %2 = COPY $sgpr30_sgpr31
+ %1 = COPY $vgpr2_vgpr3
+ %0 = COPY $vgpr0_vgpr1
+ %3 = FLAT_LOAD_DWORD %0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+ %4 = FLAT_LOAD_DWORD %1, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4)
+
+ %9:vgpr_32 = V_LSHRREV_B16_e64 8, %3, implicit $exec
+ %10:sreg_32_xm0 = S_MOV_B32 65535
+ %11:vgpr_32 = V_AND_B32_e64 %3, killed %10, implicit $exec
+ %17:vgpr_32 = V_MOV_B32_sdwa 0, %4, 0, 5, 2, 4, implicit $exec, implicit %11(tied-def 0)
+ FLAT_STORE_DWORD %0, %17, 0, 0, 0, implicit $exec, implicit $flat_scr :: (store 4)
+ S_ENDPGM
+
+...