| author | Matt Arsenault <Matthew.Arsenault@amd.com> | 2017-01-24 22:18:39 +0000 |
|---|---|---|
| committer | Matt Arsenault <Matthew.Arsenault@amd.com> | 2017-01-24 22:18:39 +0000 |
| commit | bf67cf7e4b42207e9e48b1de16d11c49a47279cc (patch) | |
| tree | 6ebc099c1dcf5e95cf206a45b955b1040cdab276 /llvm/test/CodeGen | |
| parent | f1cf0278e8a90628eea80ace88f54f9035f3730d (diff) | |
AMDGPU: Remove spurious out branches after a kill
A sequence like this:
v_cmpx_le_f32_e32 vcc, 0, v0
s_branch BB0_30
s_cbranch_execnz BB0_30
; BB#29:
exp null off, off, off, off done vm
s_endpgm
BB0_30:
; %endif110
is likely wrong. The s_branch instruction will unconditionally jump
to BB0_30 and the skip block (exp done + endpgm) inserted for
performing the kill instruction will never be executed. This results
in a GPU hang with Star Ruler 2.
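To make the control-flow problem concrete, here is a minimal sketch in Python that steps through the sequence above. It is a toy model, not real ISA semantics: opcodes are plain strings, `exec_mask` is the value of the exec mask after the compare has run, and the helper `run` and the list `BROKEN` are names invented for this illustration.

```python
# Toy model only: opcodes are plain strings and nothing here is real
# AMDGPU ISA semantics. `run` and `BROKEN` are hypothetical names.

def run(program, exec_mask):
    """Return the opcodes executed, in order.

    `exec_mask` is assumed to be the exec mask value *after* the
    v_cmpx compare has updated it.
    """
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    trace = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "label":
            continue
        trace.append(op)
        if op == "s_branch":
            pc = labels[arg]  # unconditional jump
        elif op == "s_cbranch_execnz" and exec_mask != 0:
            pc = labels[arg]  # taken only while some lanes are alive
        elif op == "s_endpgm":
            break
    return trace


# The sequence from the commit message, in the same order.
BROKEN = [
    ("v_cmpx_le_f32_e32", None),
    ("s_branch", "BB0_30"),
    ("s_cbranch_execnz", "BB0_30"),
    ("exp_done", None),
    ("s_endpgm", None),
    ("label", "BB0_30"),
    ("endif110", None),
]

# Even with every lane killed (exec_mask == 0) the skip block is bypassed.
print(run(BROKEN, exec_mask=0))
```

In this model the trace for `exec_mask=0` never contains `exp_done` or `s_endpgm`, which mirrors the point made above: the unconditional branch makes the skip block unreachable.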
The s_branch instruction is added during the "Control Flow Optimizer"
pass, which seems to reorganize the basic blocks, whereas we assume
that SI_KILL_TERMINATOR is always the last instruction inside a
basic block. Thus, after inserting a skip block we just go to the
next BB without looking at the instructions that follow the
kill, and the s_branch op is never removed.
Instead, we should remove the unconditional out branches and let
s_cbranch_execnz skip the two instructions (exp done + endpgm) if the
exec mask is non-zero.
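The rewrite described above can be sketched on a toy block representation. This is an illustration in Python, not the LLVM C++ implementation: blocks are lists of `(opcode, operand)` tuples, and `lower_kill` is a hypothetical helper. The opcode names and the expected result are taken from the MIR test added by this commit.

```python
# Toy sketch, not LLVM code: `lower_kill` is a hypothetical helper and
# blocks are modelled as lists of (opcode, operand) tuples.

def lower_kill(block, next_block):
    """Lower a kill and drop any unconditional branch that follows it.

    Returns (rewritten_block, skip_block). `skip_block` is empty when the
    block contains no kill.
    """
    rewritten = []
    skip_block = []
    seen_kill = False
    for op, arg in block:
        if op == "SI_KILL_TERMINATOR":
            # The kill becomes a compare that updates exec, followed by a
            # branch over the skip block while any lane is still alive.
            rewritten.append(("V_CMPX_LE_F32_e32", arg))
            rewritten.append(("S_CBRANCH_EXECNZ", next_block))
            skip_block = [("EXP_DONE", None), ("S_ENDPGM", None)]
            seen_kill = True
        elif op == "S_BRANCH" and seen_kill:
            # An unconditional branch here would bypass the skip block,
            # so it is removed; control falls through instead.
            continue
        else:
            rewritten.append((op, arg))
    return rewritten, skip_block


# bb.1 from the test input in this commit.
bb1 = [
    ("V_MOV_B32_e32", "0"),
    ("SI_KILL_TERMINATOR", "%vgpr0"),
    ("S_BRANCH", "%bb.2"),
]

new_bb1, bb3 = lower_kill(bb1, "%bb.2")
print(new_bb1)
print(bb3)
```

With the branch removed, the conditional branch decides whether the skip block runs: it is taken to %bb.2 when the exec mask is non-zero, and otherwise execution falls through into the export and the program end.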
This patch fixes the GPU hang and doesn't introduce any regressions
with "make check".
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019
Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 292985
Diffstat (limited to 'llvm/test/CodeGen')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir | 40 |
1 files changed, 40 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir b/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir
new file mode 100644
index 00000000000..bd5f296affb
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert-skips-kill-uncond.mir
@@ -0,0 +1,40 @@
+# RUN: llc -march=amdgcn -mcpu=polaris10 -run-pass si-insert-skips -amdgpu-skip-threshold=1 %s -o - | FileCheck %s
+# https://bugs.freedesktop.org/show_bug.cgi?id=99019
+--- |
+  define amdgpu_ps void @kill_uncond_branch() {
+    ret void
+  }
+...
+---
+
+# CHECK-LABEL: name: kill_uncond_branch
+
+# CHECK: bb.0:
+# CHECK: S_CBRANCH_VCCNZ %bb.1, implicit %vcc
+
+# CHECK: bb.1:
+# CHECK: V_CMPX_LE_F32_e32
+# CHECK-NEXT: S_CBRANCH_EXECNZ %bb.2, implicit %exec
+
+# CHECK: bb.3:
+# CHECK-NEXT: EXP_DONE
+# CHECK: S_ENDPGM
+
+# CHECK: bb.2:
+# CHECK: S_ENDPGM
+
+name: kill_uncond_branch
+
+body: |
+  bb.0:
+    successors: %bb.1
+    S_CBRANCH_VCCNZ %bb.1, implicit %vcc
+
+  bb.1:
+    successors: %bb.2
+    %vgpr0 = V_MOV_B32_e32 0, implicit %exec
+    SI_KILL_TERMINATOR %vgpr0, implicit-def %exec, implicit-def %vcc, implicit %exec
+    S_BRANCH %bb.2
+
+  bb.2:
+    S_ENDPGM

