summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/R600/AMDGPU.h
Commit message (Collapse)AuthorAgeFilesLines
* R600 -> AMDGPU renameTom Stellard2015-06-131-148/+0
| | | | llvm-svn: 239657
* Fix clang-cl self-host -Wc++11-narrowing bugReid Kleckner2015-06-081-1/+1
| | | | | | | Use unsigned as the underlying storage type of the AMDGPU address space enum. llvm-svn: 239355
* R600/SI: Reimplement isLegalAddressingModeMatt Arsenault2015-06-041-1/+4
| | | | | | | | | | | Now that we sometimes know the address space, this can theoretically do a better job. This needs better test coverage, but this mostly depends on first updating the loop optimizatiosn to provide the address space. llvm-svn: 239053
* R600/SI: add pass to mark CF live ranges as non-spillableTom Stellard2015-05-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | Spilling can insert instructions almost anywhere, and this can mess up control flow lowering in a multitude of ways, due to instruction reordering. Let's sort this out the easy way: never spill registers involved with control flow, i.e. saved EXEC masks. Unfortunately, this does not work at all with optimizations disabled, as the register allocator ignores spill weights. This should be addressed in a future commit. The test was reduced from the "stacks" shader of [1]. Some issues trigger the machine verifier while another one is checked manually. [1] http://madebyevan.com/webgl-path-tracing/ v2: only insert pass with optimizations enabled, merge test runs. Patch by: Grigori Goronzy llvm-svn: 237152
* [PM] Remove a bunch of stale TTI creation method declarations. I nukedChandler Carruth2015-02-011-4/+0
| | | | | | | their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698
* R600/SI: Use external symbols for scratch bufferTom Stellard2015-01-201-1/+5
| | | | | | | | We were passing the scratch buffer address to the shaders via user sgprs, but now we use external symbols and have the driver patch the shader using reloc information. llvm-svn: 226586
* R600/SI: Spill VGPRs to scratch space for compute shadersTom Stellard2015-01-141-0/+1
| | | | llvm-svn: 225988
* R600/SI: Add a stub GCNTargetMachineTom Stellard2015-01-061-0/+1
| | | | | | | | | | | | This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommened that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. llvm-svn: 225277
* R600/SI: Add SIFoldOperands passTom Stellard2014-11-211-0/+4
| | | | | | | This pass attempts to fold the source operands of mov and copy instructions into their uses. llvm-svn: 222581
* Reapply: R600: Make sure to inline all internal functionsTom Stellard2014-11-031-0/+1
| | | | | | | | Function calls aren't supported yet. This was reverted due to build breakages, which should be fixed now. llvm-svn: 221173
* Revert "R600: Make sure to inline all internal functions"Reid Kleckner2014-10-311-1/+0
| | | | | | | | | This reverts commit r220996. It introduced layering violations causing link errors in many configurations. llvm-svn: 221020
* R600: Make sure to inline all internal functionsTom Stellard2014-10-311-0/+1
| | | | | | Function calls aren't supported yet. llvm-svn: 220996
* R600/SI: Add load / store machine optimizer pass.Matt Arsenault2014-10-101-0/+4
| | | | | | | | | | | | | Currently this only functions to match simple cases where ds_read2_* / ds_write2_* instructions can be used. In the future it might match some of the other weird load patterns, such as direct to LDS loads. Currently enabled only with a subtarget feature to enable easier testing. llvm-svn: 219533
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-3/+3
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* R600/SI: Add instruction shrinking passTom Stellard2014-07-211-0/+1
| | | | | | This pass converts 64-bit instructions to 32-bit when possible. llvm-svn: 213561
* R600/SI: Store constant initializer data in constant memoryTom Stellard2014-07-211-0/+8
| | | | | | | | | | | | This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves use the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer. llvm-svn: 213530
* R600/SI: Adjsut SGPR live ranges before register allocationTom Stellard2014-07-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SGPRs are written by instructions that sometimes will ignore control flow, which means if you have code like: if (VGPR0) { SGPR0 = S_MOV_B32 0 } else { SGPR0 = S_MOV_B32 1 } The value of SGPR0 will 1 no matter what the condition is. In order to deal with this situation correctly, we need to view the program as if it were a single basic block when we calculate the live ranges for the SGPRs. They way we actually update the live range is by iterating over all of the segments in each LiveRange object and setting the end of each segment equal to the start of the next segment. So a live range like: [3888r,9312r:0)[10032B,10384B:0) 0@3888r will become: [3888r,10032B:0)[10032B,10384B:0) 0@3888r This change will allow us to use SALU instructions within branches. llvm-svn: 212215
* R600: Use LDS and vectors for private memoryTom Stellard2014-06-171-0/+2
| | | | llvm-svn: 211110
* R600: Remove AMDIL instruction and register definitionsTom Stellard2014-06-131-1/+0
| | | | | | Most of these are no longer used any more. llvm-svn: 210915
* R600: Add definition for flat address space ID.Matt Arsenault2014-05-221-3/+4
| | | | | | | | Use 4 since that's probably what it will be for spir. Move ADDRESS_NONE to the end to keep the constant_buffer_* values unchanged, since apparently a bunch of r600 tests use those directly. llvm-svn: 209463
* R600/SI: Use VALU instructions for copying i1 valuesTom Stellard2014-04-301-0/+4
| | | | | | | | | We can't use SALU instructions for this since they ignore the EXEC mask and are always executed. This fixes several OpenCV tests. llvm-svn: 207661
* Fix known typosAlp Toker2014-01-241-1/+1
| | | | | | | Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
* R600: Register AMDGPUCFGStructurizer passTom Stellard2013-12-111-1/+1
| | | | | | | This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197057
* R600: Register R600EmitClauseMarkers passTom Stellard2013-12-111-1/+1
| | | | | | | This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197056
* R600: Simplify handling of private address spaceTom Stellard2013-10-221-1/+0
| | | | | | | | | | | | | | | | | | The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation. For the private address space, we now: 1. Lower private loads/store to Register(Load|Store) instructions 2. Reserve part of the register file as 'private memory' 3. After regalloc lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing. llvm-svn: 193179
* R600: add a pass that merges clauses.Vincent Lejeune2013-10-011-0/+1
| | | | llvm-svn: 191790
* R600/SI: Convert v16i8 resource descriptors to i128Tom Stellard2013-08-141-0/+1
| | | | | | | | | | | | | Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429
* R600/SI: Use VSrc_* register classes as the default classes for typesTom Stellard2013-08-061-0/+1
| | | | | | | | | | | | | | | | | Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used be emitted by isel like this: SGPR = COPY VGPR Will now be emitted like this: VSrC = COPY VGPR This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵Tom Stellard2013-07-271-0/+4
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* R600: Use KCache for kernel argumentsTom Stellard2013-07-231-0/+6
| | | | | Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186918
* R600: Simplify AMDILCFGStructurize by removing templates and assuming single ↵Vincent Lejeune2013-07-191-1/+0
| | | | | | exit llvm-svn: 186724
* R600: Rework subtarget info and remove AMDILDevice classesTom Stellard2013-06-071-5/+50
| | | | | | | | This should simplify the subtarget definitions and make it easier to add new ones. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183566
* R600: Remove unnecessary includeTom Stellard2013-06-071-1/+1
| | | | | Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183558
* R600: Add a pass that merge Vector RegisterVincent Lejeune2013-06-051-0/+1
| | | | | | | Previously commited @183279 but tests were failing, reverted @183286 It was broken because @183336 was missing, now it's there. llvm-svn: 183343
* Revert "R600: Add a pass that merge Vector Register"Rafael Espindola2013-06-051-1/+0
| | | | | | This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. llvm-svn: 183286
* R600: Add a pass that merge Vector RegisterVincent Lejeune2013-06-041-0/+1
| | | | llvm-svn: 183279
* R600: Improve texture handlingVincent Lejeune2013-05-171-0/+1
| | | | llvm-svn: 182125
* R600: Packetize instructionsVincent Lejeune2013-04-301-0/+1
| | | | llvm-svn: 180760
* R600: Add support for native control flowVincent Lejeune2013-04-011-0/+1
| | | | llvm-svn: 178505
* R600: Emit CF_ALU and use true kcache register.Vincent Lejeune2013-04-011-0/+1
| | | | llvm-svn: 178503
* R600/SI: rework input interpolation v2Christian Konig2013-03-071-1/+0
| | | | | | | | v2: update CMakeLists.txt as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 176626
* R600: Remove LowerConstCopyPass and lower CONST_COPY right after ISel.Vincent Lejeune2013-03-051-1/+0
| | | | | | | Maintaining CONST_COPY Instructions until Pre Emit may prevent some ifcvt case and taking them in account for scheduling is difficult for no real benefit. llvm-svn: 176488
* R600/SI: cleanup literal handling v3Christian Konig2013-02-161-1/+0
| | | | | | | | | | | | | | | | Seems to be allot simpler, and also paves the way for further improvements. v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW, use VGPR0 in dummy EXP, avoid compiler warning, break after encoding the first literal. v3: correctly use V_ADD_F32_e64 This is a candidate for the stable branch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 175354
* R600: Support for indirect addressing v4Tom Stellard2013-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Only implemented for R600 so far. SI is missing implementations of a few callbacks used by the Indirect Addressing pass and needs code to handle frame indices. At the moment R600 only supports array sizes of 16 dwords or less. Register packing of vector types is currently disabled, which means that a vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order to correctly pack registers in all cases, we will need to implement an analysis pass for R600 that determines the correct vector width for each array. v2: - Add support for i8 zext load from stack. - Coding style fixes v3: - Don't reserve registers for indirect addressing when it isn't being used. - Fix bug caused by LLVM limiting the number of SubRegIndex declarations. v4: - Fix 64-bit defines llvm-svn: 174525
* R600: rework handling of the constantsTom Stellard2013-01-231-0/+1
| | | | | | | | | | | | | | | | | | | | Remove Cxxx registers, add new special register - "ALU_CONST" and new operand for each alu src - "sel". ALU_CONST is used to designate that the new operand contains the value to override src.sel, src.kc_bank, src.chan for constants in the driver. Patch by: Vadim Girlin Vincent Lejeune: - Use pointers for constants - Fold CONST_ADDRESS when possible Tom Stellard: - Give CONSTANT_BUFFER_0 its own address space - Use integer types for constant loads Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173222
* R600: Proper insert S_WAITCNT instructionsTom Stellard2013-01-181-0/+1
| | | | | | | | | | | | | | | | Some instructions like memory reads/writes are executed asynchronously, so we need to insert S_WAITCNT instructions to block before accessing their results. Previously we have just inserted S_WAITCNT instructions after each async instruction, this patch fixes this and adds a prober insertion pass. Patch by: Christian König Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 172846
* R600: New control flow for SI v2Tom Stellard2012-12-191-1/+2
| | | | | | | | | | | | | | | | | | | | This patch replaces the control flow handling with a new pass which structurize the graph before transforming it to machine instruction. This has a couple of different advantages and currently fixes 20 piglit tests without a single regression. It is now a general purpose transformation that could be not only be used for SI/R6xx, but also for other hardware implementations that use a form of structurized control flow. v2: further cleanup, fixes and documentation Patch by: Christian König Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170591
* Add R600 backendTom Stellard2012-12-111-0/+48
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915
OpenPOWER on IntegriCloud