path: root/llvm/lib/Target/AMDGPU/AMDGPU.h
Commit message | Author | Age | Files | Lines
* AMDGPU: Properly initialize SIShrinkInstructions
  Author: Matt Arsenault | Age: 2016-06-09 | Files: 1 | Lines: -0/+3

  llvm-svn: 272336
* AMDGPU: Remove unused address space
  Author: Matt Arsenault | Age: 2016-05-31 | Files: 1 | Lines: -2/+0

  Also return a single StringRef instead of building a string.

  llvm-svn: 271296
* AMDGPU/EG,CM: Add instruction to read from constant AS (VTX2)
  Author: Jan Vesely | Age: 2016-05-13 | Files: 1 | Lines: -1/+1

  Reviewers: tstellard
  Subscribers: arsenm
  Differential Revision: http://reviews.llvm.org/D19785

  llvm-svn: 269473
* [AMDGPU][NFC] Rename SIInsertNops -> SIDebuggerInsertNops
  Author: Konstantin Zhuravlyov | Age: 2016-05-10 | Files: 1 | Lines: -3/+3

  Differential Revision: http://reviews.llvm.org/D20117

  llvm-svn: 269098
* AMDGPU: Remove SIFixSGPRLiveRanges pass
  Author: Nicolai Haehnle | Age: 2016-04-14 | Files: 1 | Lines: -4/+0

  Summary:
  This pass is unnecessary and overly conservative. It was motivated by situations like

      def %vreg0:SGPR_32
      ...
    if-block:
      ..
      def %vreg1:SGPR_32
      ...
    else-block:
      ...
      use %vreg0:SGPR_32
      ...

  and similar situations with uses after the non-uniform control flow, where we are not allowed to assign %vreg0 and %vreg1 to the same physical register, even though in the original, thread/workitem-based CFG, it looks like the live ranges of these registers do not overlap.

  However, by the time register allocation runs, we have moved to a wave-based CFG that accurately represents the fact that the wave may run through both the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already overlap even without the SIFixSGPRLiveRanges pass.

  In addition to proving this change correct, I have tested it with Piglit and a small number of other tests.

  Reviewers: arsenm, tstellarAMD
  Subscribers: MatzeB, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19041

  llvm-svn: 266345
* AMDGPU: Add a shader calling convention
  Author: Nicolai Haehnle | Age: 2016-04-06 | Files: 1 | Lines: -9/+0

  This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders.

  Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Differential Revision: http://reviews.llvm.org/D18559

  llvm-svn: 265589
* AMDGPU: Add SIWholeQuadMode pass
  Author: Nicolai Haehnle | Age: 2016-03-21 | Files: 1 | Lines: -0/+4

  Summary:
  Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics).

  This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required.

  This pass is run before register coalescing so that we can use machine SSA for analysis.

  The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC.

  This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands.

  Reviewers: arsenm, tstellarAMD, mareko
  Subscribers: MatzeB, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18162

  llvm-svn: 263982
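
  The EXEC masking described in the SIWholeQuadMode entry above can be illustrated with a small toy model. This is only a sketch, not code from the pass: it assumes a 64-lane wave whose EXEC mask fits in a 64-bit integer, with lanes 4*q through 4*q+3 forming quad q, and the names `wholeQuadMask` and `execFor` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: a wave's EXEC mask as a 64-bit integer, one bit per lane.
// Lanes 4*q .. 4*q+3 are assumed to form quad q. These helpers are
// hypothetical and are not part of LLVM.

// Expand a live mask to whole quad mode: every quad that has at least one
// live lane gets all four of its lanes enabled.
inline std::uint64_t wholeQuadMask(std::uint64_t liveMask) {
  std::uint64_t wqm = 0;
  for (unsigned quad = 0; quad < 16; ++quad) {
    const std::uint64_t quadBits = std::uint64_t{0xF} << (4 * quad);
    if (liveMask & quadBits)
      wqm |= quadBits;
  }
  return wqm;
}

// Pick the EXEC value for one instruction: the saved live mask when the
// instruction needs exact execution (stores, atomics), the WQM mask otherwise.
inline std::uint64_t execFor(bool needsExact, std::uint64_t liveMask) {
  const std::uint64_t wqm = wholeQuadMask(liveMask);
  // The WQM mask is always a superset of the live mask.
  assert((wqm & liveMask) == liveMask);
  return needsExact ? liveMask : wqm;
}
```

  In this model a derivative computation would run under `execFor(false, live)` so that helper lanes in a partially covered quad still execute, while a store would run under `execFor(true, live)` so that those helper lanes cause no side effects.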
* AMDGPU: R600 code splitting cleanup
  Author: Matt Arsenault | Age: 2016-03-11 | Files: 1 | Lines: -2/+2

  Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final.

  llvm-svn: 263204
* AMDGPU: Insert two S_NOP instructions for every high level source statement.
  Author: Tom Stellard | Age: 2016-03-03 | Files: 1 | Lines: -0/+4

  Patch by: Konstantin Zhuravlyov

  Summary:
  Tools such as a debugger need to pause execution based on user input (i.e. a breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before the first ISA instruction of the statement, and one after the last ISA instruction of the statement. Further, the debugger may replace S_NOP instructions with S_TRAP instructions based on user input.

  Reviewers: tstellarAMD, arsenm
  Subscribers: echristo, dblaikie, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17454

  llvm-svn: 262579
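
  The placement rule in the S_NOP entry above (one S_NOP before the first instruction of a source statement, one after the last) can be sketched on a toy instruction list. This is an illustration under stated assumptions, not the pass itself: `ToyInst` and `insertDebuggerNops` are hypothetical names, each instruction is assumed to carry the ID of its source statement, and the instructions of one statement are assumed to be contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: an instruction is an opcode string plus the ID of the
// high level source statement it was generated for. This type is
// hypothetical and unrelated to LLVM's MachineInstr.
struct ToyInst {
  std::string opcode;
  int statement;
};

// Insert one S_NOP before the first instruction of each statement and one
// after its last instruction. Assumes that the instructions belonging to a
// statement are contiguous in the input.
inline std::vector<ToyInst> insertDebuggerNops(const std::vector<ToyInst> &in) {
  std::vector<ToyInst> out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    const bool first = (i == 0) || in[i - 1].statement != in[i].statement;
    const bool last =
        (i + 1 == in.size()) || in[i + 1].statement != in[i].statement;
    if (first)
      out.push_back({"S_NOP", in[i].statement});
    out.push_back(in[i]);
    if (last)
      out.push_back({"S_NOP", in[i].statement});
  }
  return out;
}
```

  With this rule, a statement that lowers to a single instruction still ends up bracketed by two S_NOPs, which matches the "two per statement" wording of the commit subject.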
* AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
  Author: Tom Stellard | Age: 2016-02-12 | Files: 1 | Lines: -0/+1

  Reviewers: arsenm
  Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16603

  llvm-svn: 260765
* AMDGPU: Initialize SILowerControlFlow
  Author: Matt Arsenault | Age: 2016-02-12 | Files: 1 | Lines: -1/+5

  llvm-svn: 260645
* AMDGPU/SI: Correctly initialize SIInsertWaits pass
  Author: Tom Stellard | Age: 2016-02-05 | Files: 1 | Lines: -1/+4

  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16724

  llvm-svn: 259894
* AMDGPU: Fix emitting invalid workitem intrinsics for HSA
  Author: Matt Arsenault | Age: 2016-01-30 | Files: 1 | Lines: -1/+4

  The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer.

  Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer.

  Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument.

  Start emitting errors for the intrinsics not handled with HSA.

  llvm-svn: 259297
* Correctly initialize SIAnnotateControlFlow
  Author: Tom Stellard | Age: 2016-01-20 | Files: 1 | Lines: -0/+3

  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16304

  llvm-svn: 258319
* Fix struct/class mismatch for MachineSchedContext
  Author: Hans Wennborg | Age: 2016-01-13 | Files: 1 | Lines: -1/+1

  llvm-svn: 257648
* AMDGPU/SI: Add SI Machine Scheduler
  Author: Nicolai Haehnle | Age: 2016-01-13 | Files: 1 | Lines: -0/+4

  Summary:
  It is off by default, but can be used with --misched=si

  Patch by: Axel Davy

  Reviewers: arsenm, tstellarAMD, nhaehnle
  Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D11885

  llvm-svn: 257609
* AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions
  Author: Tom Stellard | Age: 2015-12-15 | Files: 1 | Lines: -0/+3

  Summary:
  We were previously selecting all constant loads to SMRD instructions and legalizing the SMRDs with non-uniform addresses during the SIFixSGPRCopies pass. This new solution is simpler and also generates much better code, because the instruction selector is able to take advantage of all the MUBUF addressing modes that our legalization pass wasn't able to.

  We also no longer need to generate v_add_* instructions when we have a uniform pointer and a non-uniform offset, as this is now folded into the MUBUF instruction during instruction selection.

  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15425

  llvm-svn: 255672
* AMDGPU/SI: Emit constant arrays in the .text section
  Author: Tom Stellard | Age: 2015-12-10 | Files: 1 | Lines: -2/+0

  Summary:
  This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays.

  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15257

  llvm-svn: 255204
* AMDGPU: Remove SIPrepareScratchRegs
  Author: Matt Arsenault | Age: 2015-11-30 | Files: 1 | Lines: -1/+0

  It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed.

  The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex.

  Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions.

  The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs.

  llvm-svn: 254329
* AMDGPU: Add pass to detect used kernel features
  Author: Matt Arsenault | Age: 2015-11-06 | Files: 1 | Lines: -0/+4

  Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering.

  For now this only detects the workitem intrinsics.

  llvm-svn: 252323
* AMDGPU: Initialize SIFixSGPRCopies so -print-after works
  Author: Matt Arsenault | Age: 2015-11-03 | Files: 1 | Lines: -1/+4

  llvm-svn: 251995
* AMDGPU: Add pass to lower OpenCL image and sampler arguments.
  Author: Tom Stellard | Age: 2015-08-07 | Files: 1 | Lines: -0/+1

  The pass adds new kernel arguments for image attributes, and resolves calls to dummy attribute and resource id getter functions.

  Patch by: Zoltan Gilian

  llvm-svn: 244372
* R600 -> AMDGPU rename
  Author: Tom Stellard | Age: 2015-06-13 | Files: 1 | Lines: -0/+148

  llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"
  Author: Tom Stellard | Age: 2012-07-16 | Files: 1 | Lines: -35/+0

  This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.

  llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6
  Author: Tom Stellard | Age: 2012-07-16 | Files: 1 | Lines: -0/+35

  llvm-svn: 160270