summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Fix trying to skip from a block with no successorsMatt Arsenault2016-07-151-2/+3
| | | | | | Found while reducing bug 28550 llvm-svn: 275509
* AMDGPU: Follow up to r275203Matt Arsenault2016-07-121-24/+27
| | | | | | I meant to squash this into it. llvm-svn: 275220
* AMDGPU: Fix verifier error with kill intrinsicMatt Arsenault2016-07-121-65/+122
| | | | | | | Don't create a terminator in the middle of the block. We should probably get rid of this intrinsic. llvm-svn: 275203
* Revert "AMDGPU: Remove unused control flow intrinsic"Matt Arsenault2016-07-091-0/+19
| | | | llvm-svn: 274978
* AMDGPU: Improve offset folding for register indexingMatt Arsenault2016-07-091-22/+40
| | | | llvm-svn: 274954
* AMDGPU: Remove unused control flow intrinsicMatt Arsenault2016-07-081-19/+0
| | | | llvm-svn: 274939
* AMDGPU: Minor adjustment to r274817Matt Arsenault2016-07-081-1/+1
| | | | | | | | | | The commit message is inaccurate, modifiesRegister will check for partial defs of exec. We currently don't ever emit partial defs of exec, so it doesn't really matter. llvm-svn: 274886
* AMDGPU: Move si_mask_branch register operand to be a useMatt Arsenault2016-07-081-4/+6
| | | | llvm-svn: 274818
* AMDGPU: Cleanup. Use definesRegister instead of manual loopMatt Arsenault2016-07-081-6/+2
| | | | | | | Also this will be more precise since it will check exec_lo/exec_hi writes. llvm-svn: 274817
* AMDGPU: Fix return of non-void-returning shadersNicolai Haehnle2016-07-061-6/+4
| | | | | | | | | | | | | | | | | Summary: Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that ensures that a non-void-returning shader falls off the end of the last basic block was effectively disabled, since SI_RETURN is now used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21975 llvm-svn: 274612
* AMDGPU: Add m0 vgpr load loop block as successorMatt Arsenault2016-06-301-0/+1
| | | | | | | This shows up as a verifier error when I move this earlier, not sure why it didn't before. llvm-svn: 274275
* AMDGPU: Fix out of bounds indirect indexing errorsMatt Arsenault2016-06-281-8/+19
| | | | | | | This was producing acceses to registers beyond the super register's limits, resulting in verifier failures. llvm-svn: 273977
* AMDGPU: Fix verifier errors with undef vector indicesMatt Arsenault2016-06-271-27/+37
| | | | | | Also fix pointlessly adding exec to liveins. llvm-svn: 273916
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-3/+4
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* AMDGPU: Fix liveness when expanding m0 loopMatt Arsenault2016-06-221-17/+60
| | | | llvm-svn: 273514
* AMDGPU: Fix verifier errors in SILowerControlFlowMatt Arsenault2016-06-221-66/+127
| | | | | | | | | | | | | The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. llvm-svn: 273467
* AMDGPU: Also look for s_cbranch_vcczMatt Arsenault2016-05-191-1/+2
| | | | llvm-svn: 270091
* AMDGPU: Fix crash with unreachable terminators.Matt Arsenault2016-04-291-12/+27
| | | | | | | | | | If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-061-6/+4
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* AMDGPU: Add SIWholeQuadMode passNicolai Haehnle2016-03-211-12/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982
* AMDGPU/SI: Fix threshold calculation for branching when exec is zeroTom Stellard2016-03-211-3/+5
| | | | | | | | | | | | | | | | | | | Summary: When control flow is implemented using the exec mask, the compiler will insert branch instructions to skip over the masked section when exec is zero if the section contains more than a certain number of instructions. The previous code would only count instructions in successor blocks, and this patch modifies the code to start counting instructions in all blocks between the start and end of the branch. Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18282 llvm-svn: 263969
* AMDGPU: add missing braces around multi-line if blockNicolai Haehnle2016-03-181-1/+2
| | | | | | This fixes an issue with rL263658 pointed out by Tom Stellard. llvm-svn: 263823
* AMDGPU: Prevent uniform loops from becoming infiniteNicolai Haehnle2016-03-161-0/+6
| | | | | | | | | | | | | | Summary: Uniform loops where the branch leaving the loop is predicated on VCCNZ must be skipped if EXEC = 0, otherwise they will be infinite. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18137 llvm-svn: 263658
* AMDGPU/SI: Incomplete shader binaries need to finish execution at the endMarek Olsak2016-03-141-0/+24
| | | | | | | | | | Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 llvm-svn: 263441
* AMDGPU: Set flat_scratch from flat_scratch_init regMatt Arsenault2016-02-121-35/+3
| | | | | | | | | | | | | | This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. llvm-svn: 260658
* AMDGPU: Initialize SILowerControlFlowMatt Arsenault2016-02-121-28/+36
| | | | llvm-svn: 260645
* AMDGPU: Remove trailing whitespaceMatt Arsenault2016-02-121-4/+4
| | | | llvm-svn: 260644
* AMDGPU: Fix adding redundant m0 usesMatt Arsenault2015-10-211-2/+0
| | | | | | BuildMI already adds these since they are defined correctly now. llvm-svn: 250961
* AMDGPU: Add MachineInstr overloads for instruction format testsMatt Arsenault2015-10-201-2/+2
| | | | llvm-svn: 250797
* AMDGPU: Use explicit register size indirect pseudosMatt Arsenault2015-10-071-1/+5
| | | | | | | | | | | | | | | | | This stops using an unknown reg class operand. Currently build_vector selection has a broken looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. With the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494
* AMDGPU: Fix recomputing dominator tree unnecessarilyMatt Arsenault2015-09-251-0/+4
| | | | | | | SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587
* AMDGPU/SI: Remove VCCRegMatt Arsenault2015-08-081-4/+4
| | | | llvm-svn: 244380
* AMDGPU/SI: Remove EXECRegMatt Arsenault2015-08-051-8/+4
| | | | | | For the same reasons as the other physical registers. llvm-svn: 244062
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+605
llvm-svn: 239657
OpenPOWER on IntegriCloud