summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Rename SI_RETURNMatt Arsenault2017-03-211-2/+2
| | | | | | | | This is used for a specific type of return to a shader part's epilog code. Rename to try avoiding confusion from a true call's return. llvm-svn: 298452
* AMDGPU: Remove spurious out branches after a killMatt Arsenault2017-01-241-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sequence like this: v_cmpx_le_f32_e32 vcc, 0, v0 s_branch BB0_30 s_cbranch_execnz BB0_30 ; BB#29: exp null off, off, off, off done vm s_endpgm BB0_30: ; %endif110 is likely wrong. The s_branch instruction will unconditionally jump to BB0_30 and the skip block (exp done + endpgm) inserted for performing the kill instruction will never be executed. This results in a GPU hang with Star Ruler 2. The s_branch instruction is added during the "Control Flow Optimizer" pass which seems to re-organize the basic blocks, and we assume that SI_KILL_TERMINATOR is always the last instruction inside a basic block. Thus, after inserting a skip block we just go to the next BB without looking at the subsequent instructions after the kill, and the s_branch op is never removed. Instead, we should remove the unconditional out branches and let skip the two instructions if the exec mask is non-zero. This patch fixes the GPU hang and doesn't introduce any regressions with "make check". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019 Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com> llvm-svn: 292985
* [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-01-211-18/+31
| | | | | | other minor fixes (NFC). llvm-svn: 292688
* [CodeGen] Rename MachineInstrBuilder::addOperand. NFCDiana Picus2017-01-131-2/+2
| | | | | | | | | | | Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
* AMDGPU: Refactor exp instructionsMatt Arsenault2016-12-051-6/+5
| | | | | | | | | | | | | | | Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in she shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283004
* AMDGPU: Split SILowerControlFlow into two piecesMatt Arsenault2016-08-221-0/+330
Do most of the lowering in a pre-RA pass. Keep the skip jump insertion late, plus a few other things that require more work to move out. One concern I have is now there may be COPY instructions which do not have the necessary implicit exec uses if they will be lowered to v_mov_b32. This has a positive effect on SGPR usage in shader-db. llvm-svn: 279464
OpenPOWER on IntegriCloud