path: root/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
Commit message | Author | Age | Files | Lines
* AMDGPU: Add a shader calling convention | Nicolai Haehnle | 2016-04-06 | 1 | -6/+4
  This makes it possible to distinguish between mesa shaders and other kernels
  even in the presence of compute shaders.

  Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Differential Revision: http://reviews.llvm.org/D18559
  llvm-svn: 265589
* AMDGPU: Add SIWholeQuadMode pass | Nicolai Haehnle | 2016-03-21 | 1 | -12/+21
  Summary:
  Whole quad mode is already enabled for pixel shaders that compute
  derivatives, but it must be suspended for instructions that cause a shader
  to have side effects (i.e. stores and atomics).

  This pass addresses the issue by storing the real (initial) live mask in a
  register, masking EXEC before instructions that require exact execution and
  (re-)enabling WQM where required.

  This pass is run before register coalescing so that we can use machine SSA
  for analysis.

  The changes in this patch expose a problem with the second machine
  scheduling pass: target independent instructions like COPY implicitly use
  EXEC when they operate on VGPRs, but this fact is not encoded in the MIR.
  This can lead to miscompilation because instructions are moved past changes
  to EXEC.

  This patch fixes the problem by adding use-implicit operands to target
  independent instructions. Some general codegen passes are relaxed to work
  with such implicit use operands.

  Reviewers: arsenm, tstellarAMD, mareko
  Subscribers: MatzeB, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18162
  llvm-svn: 263982
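  The mask handling described in the SIWholeQuadMode summary can be sketched
  on plain integers. This is only an illustration, assuming a 64-lane
  wavefront with quads formed by 4 adjacent lanes; the function names
  wholeQuadMask and exactMask are hypothetical and are not taken from the
  pass itself.

```cpp
#include <cstdint>

// Hypothetical helper (not from the LLVM pass): expand a 64-lane mask to
// whole quad mode. A quad is 4 adjacent lanes; if any lane of a quad is
// live, all 4 lanes of that quad become active.
std::uint64_t wholeQuadMask(std::uint64_t live) {
  std::uint64_t result = 0;
  for (unsigned quad = 0; quad < 16; ++quad) {
    const std::uint64_t quadBits = std::uint64_t{0xF} << (quad * 4);
    if (live & quadBits)
      result |= quadBits;
  }
  return result;
}

// Hypothetical helper: the mask to apply before an instruction that needs
// exact execution (a store or an atomic). Helper lanes that were enabled
// only for WQM are dropped by intersecting with the saved live mask.
std::uint64_t exactMask(std::uint64_t exec, std::uint64_t savedLive) {
  return exec & savedLive;
}
```

  With a saved live mask of 0x21 (lanes 0 and 5), wholeQuadMask yields 0xFF
  for derivative computations, and exactMask(0xFF, 0x21) restores 0x21
  before a store.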
* AMDGPU/SI: Fix threshold calculation for branching when exec is zero | Tom Stellard | 2016-03-21 | 1 | -3/+5
  Summary:
  When control flow is implemented using the exec mask, the compiler will
  insert branch instructions to skip over the masked section when exec is
  zero if the section contains more than a certain number of instructions.

  The previous code would only count instructions in successor blocks, and
  this patch modifies the code to start counting instructions in all blocks
  between the start and end of the branch.

  Reviewers: nhaehnle, arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18282
  llvm-svn: 263969
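  The counting rule in the summary above can be illustrated with a minimal
  sketch. It assumes a simplified model in which each basic block is reduced
  to its instruction count, stored in layout order; the type BlockSizes, the
  function shouldSkipWhenExecZero, and the threshold parameter are
  hypothetical and do not reproduce the code in SILowerControlFlow.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: each entry is the instruction count of one basic
// block, in layout order. This is not the LLVM MachineFunction API.
using BlockSizes = std::vector<unsigned>;

// Returns true when the masked region [from, to) holds at least
// `threshold` instructions, i.e. when a branch over it is worthwhile.
// Every block in the range is counted, not only successor blocks.
bool shouldSkipWhenExecZero(const BlockSizes &blocks, std::size_t from,
                            std::size_t to, unsigned threshold) {
  assert(from <= to && to <= blocks.size() && "range must lie inside layout");
  unsigned count = 0;
  for (std::size_t i = from; i < to; ++i) {
    count += blocks[i];
    if (count >= threshold)
      return true; // stop early once the region is known to be large
  }
  return false;
}
```

  Counting only part of the region can leave the total under the threshold
  even though the full region between the start and end of the branch is
  large enough to justify a skip, which is the case the commit addresses.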
* AMDGPU: add missing braces around multi-line if block | Nicolai Haehnle | 2016-03-18 | 1 | -1/+2
  This fixes an issue with rL263658 pointed out by Tom Stellard.

  llvm-svn: 263823
* AMDGPU: Prevent uniform loops from becoming infinite | Nicolai Haehnle | 2016-03-16 | 1 | -0/+6
  Summary:
  Uniform loops where the branch leaving the loop is predicated on VCCNZ
  must be skipped if EXEC = 0, otherwise they will be infinite.

  Reviewers: tstellarAMD, arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18137
  llvm-svn: 263658
* AMDGPU/SI: Incomplete shader binaries need to finish execution at the end | Marek Olsak | 2016-03-14 | 1 | -0/+24
  Reviewers: tstellarAMD, arsenm
  Subscribers: arsenm
  Differential Revision: http://reviews.llvm.org/D18058
  llvm-svn: 263441
* AMDGPU: Set flat_scratch from flat_scratch_init reg | Matt Arsenault | 2016-02-12 | 1 | -35/+3
  This was hardcoded to the static private size, but this would be missing
  the offset and additional size for someday when we have dynamic sizing.

  Also stops always initializing flat_scratch even when unused.

  In the future we should stop emitting this unless flat instructions are
  used to access private memory. For example this will initialize it almost
  always on VI because flat is used for global access.

  llvm-svn: 260658
* AMDGPU: Initialize SILowerControlFlow | Matt Arsenault | 2016-02-12 | 1 | -28/+36
  llvm-svn: 260645
* AMDGPU: Remove trailing whitespace | Matt Arsenault | 2016-02-12 | 1 | -4/+4
  llvm-svn: 260644
* AMDGPU: Fix adding redundant m0 uses | Matt Arsenault | 2015-10-21 | 1 | -2/+0
  BuildMI already adds these since they are defined correctly now.

  llvm-svn: 250961
* AMDGPU: Add MachineInstr overloads for instruction format tests | Matt Arsenault | 2015-10-20 | 1 | -2/+2
  llvm-svn: 250797
* AMDGPU: Use explicit register size indirect pseudos | Matt Arsenault | 2015-10-07 | 1 | -1/+5
  This stops using an unknown reg class operand.

  Currently build_vector selection has a broken-looking check where it tries
  to use a VGPR reg class and an SGPR one if it sees an SGPR use. When the
  source operand has an explicit VGPR class, illegal copies will be inserted
  that SIFixSGPRCopies will take care of normally later, which will allow
  removing the weird check of build_vector users.

  Without this, when removed v_movrels_b32 would still be emitted even
  though all of the values were only stored in SGPRs.

  llvm-svn: 249494
* AMDGPU: Fix recomputing dominator tree unnecessarily | Matt Arsenault | 2015-09-25 | 1 | -0/+4
  SIFixSGPRCopies does not modify the CFG, but this was being recomputed
  before running SIFoldOperands.

  llvm-svn: 248587
* AMDGPU/SI: Remove VCCReg | Matt Arsenault | 2015-08-08 | 1 | -4/+4
  llvm-svn: 244380
* AMDGPU/SI: Remove EXECReg | Matt Arsenault | 2015-08-05 | 1 | -8/+4
  For the same reasons as the other physical registers.

  llvm-svn: 244062
* R600 -> AMDGPU rename | Tom Stellard | 2015-06-13 | 1 | -0/+605
  llvm-svn: 239657