author Matt Arsenault <Matthew.Arsenault@amd.com> 2017-01-24 22:18:39 +0000
committer Matt Arsenault <Matthew.Arsenault@amd.com> 2017-01-24 22:18:39 +0000
commit bf67cf7e4b42207e9e48b1de16d11c49a47279cc
tree 6ebc099c1dcf5e95cf206a45b955b1040cdab276 /llvm/test/CodeGen
parent f1cf0278e8a90628eea80ace88f54f9035f3730d
AMDGPU: Remove spurious out branches after a kill

A sequence like this:

  v_cmpx_le_f32_e32 vcc, 0, v0
  s_branch BB0_30
  s_cbranch_execnz BB0_30
; BB#29:
  exp null off, off, off, off done vm
  s_endpgm
BB0_30:                                 ; %endif110

is likely wrong. The s_branch instruction will unconditionally
jump to BB0_30, and the skip block (exp done + endpgm) inserted
to perform the kill will never be executed. This results in a
GPU hang with Star Ruler 2.

The s_branch instruction is added during the "Control Flow
Optimizer" pass, which seems to reorganize the basic blocks, and
we assume that SI_KILL_TERMINATOR is always the last instruction
inside a basic block. Thus, after inserting a skip block we just
go to the next BB without looking at the instructions that follow
the kill, and the s_branch op is never removed.

Instead, we should remove the unconditional out branch and let
s_cbranch_execnz skip the two instructions of the skip block
(exp done + endpgm) if the exec mask is non-zero.

This patch fixes the GPU hang and doesn't introduce any
regressions with "make check".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019

Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>

llvm-svn: 292985
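
The behavior described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual pass: the function name `lower_kill`, the block name `skip`, and the representation of blocks as lists of strings are all hypothetical. It lowers a kill terminator into a compare plus `S_CBRANCH_EXECNZ`, appends a skip block, and drops any `S_BRANCH` that follows the kill in the same block.

```python
def lower_kill(blocks):
    """Toy model of the fix (not the real si-insert-skips pass).

    blocks: list of (name, [instruction strings]) in layout order.
    Returns a new list in which each SI_KILL_TERMINATOR is lowered and any
    S_BRANCH that follows it in the same block is removed.
    """
    result = []
    for i, (name, insts) in enumerate(blocks):
        new_insts = []
        skip_block = None
        for inst in insts:
            opcode = inst.split()[0]
            if opcode == "SI_KILL_TERMINATOR" and i + 1 < len(blocks):
                next_name = blocks[i + 1][0]
                # The kill becomes a compare that updates exec, followed by
                # a branch over the skip block while any lane is still live.
                new_insts.append("V_CMPX_LE_F32_e32")
                new_insts.append(f"S_CBRANCH_EXECNZ {next_name}")
                # Hypothetical name for the inserted skip block.
                skip_block = ("skip", ["EXP_DONE", "S_ENDPGM"])
            elif opcode == "S_BRANCH" and skip_block is not None:
                # An unconditional branch after the kill would jump over the
                # skip block in every case, so it is dropped.
                continue
            else:
                new_insts.append(inst)
        result.append((name, new_insts))
        if skip_block is not None:
            result.append(skip_block)
    return result


# Input shaped like the MIR test added by this commit.
blocks = [
    ("bb.0", ["S_CBRANCH_VCCNZ bb.1"]),
    ("bb.1", ["V_MOV_B32_e32 0", "SI_KILL_TERMINATOR vgpr0", "S_BRANCH bb.2"]),
    ("bb.2", ["S_ENDPGM"]),
]
lowered = lower_kill(blocks)
```

In this model, `bb.1` ends with `S_CBRANCH_EXECNZ bb.2` and is followed directly by the skip block, which mirrors the block order that the CHECK lines in the test expect.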
Diffstat (limited to 'llvm/test/CodeGen')
-rw-r--r-- llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir 40
1 files changed, 40 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir b/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir
new file mode 100644
index 00000000000..bd5f296affb
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir
@@ -0,0 +1,40 @@
+# RUN: llc -march=amdgcn -mcpu=polaris10 -run-pass si-insert-skips -amdgpu-skip-threshold=1 %s -o - | FileCheck %s
+# https://bugs.freedesktop.org/show_bug.cgi?id=99019
+--- |
+  define amdgpu_ps void @kill_uncond_branch() {
+    ret void
+  }
+...
+---
+
+# CHECK-LABEL: name: kill_uncond_branch
+
+# CHECK: bb.0:
+# CHECK: S_CBRANCH_VCCNZ %bb.1, implicit %vcc
+
+# CHECK: bb.1:
+# CHECK: V_CMPX_LE_F32_e32
+# CHECK-NEXT: S_CBRANCH_EXECNZ %bb.2, implicit %exec
+
+# CHECK: bb.3:
+# CHECK-NEXT: EXP_DONE
+# CHECK: S_ENDPGM
+
+# CHECK: bb.2:
+# CHECK: S_ENDPGM
+
+name: kill_uncond_branch
+
+body: |
+  bb.0:
+    successors: %bb.1
+    S_CBRANCH_VCCNZ %bb.1, implicit %vcc
+
+  bb.1:
+    successors: %bb.2
+    %vgpr0 = V_MOV_B32_e32 0, implicit %exec
+    SI_KILL_TERMINATOR %vgpr0, implicit-def %exec, implicit-def %vcc, implicit %exec
+    S_BRANCH %bb.2
+
+  bb.2:
+    S_ENDPGM