path: root/llvm/lib/Target/R600/AMDGPUTargetMachine.cpp
Commit message | Author | Age | Files | Lines
* Use unique_ptr to manage objects owned by the ScheduleDAGMI. | David Blaikie | 2014-04-21 | 1 | -1/+1
  llvm-svn: 206784
* R600/SI: Handle MUBUF instructions in SIInstrInfo::moveToVALU() | Tom Stellard | 2014-03-21 | 1 | -0/+3
  llvm-svn: 204476
* R600: Make check clearer. | Matt Arsenault | 2014-02-24 | 1 | -1/+1
  The check reads more clearly as "Southern Islands or later" than as "later than Northern Islands".
  llvm-svn: 202076
* [cleanup] Move the Dominators.h and Verifier.h headers into the IR directory. | Chandler Carruth | 2014-01-13 | 1 | -1/+1
  These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis.

  Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that even Clang's CFGBlocks can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which caches, reconstructs, and supports a nice update API.

  But those are very long term, and so I don't want to leave the really confusing structure until that day arrives.
  llvm-svn: 199082
* Factor MI-Sched in preparation for post-ra scheduling support. | Andrew Trick | 2013-12-28 | 1 | -1/+1
  Factor the MachineFunctionPass into MachineSchedulerBase. Split the DAG class into ScheduleDAGMI and ScheduleDAGMILive.
  llvm-svn: 198119
* Small simplification, p0 is the same as p. | Rafael Espindola | 2013-12-19 | 1 | -1/+1
  llvm-svn: 197699
* R600/SI: Make private pointers be 32-bit. | Matt Arsenault | 2013-12-19 | 1 | -6/+5
  Different sized address spaces should theoretically work most of the time now, and since 64-bit add is currently disabled, using more 32-bit pointers fixes some cases.
  llvm-svn: 197659
* One last cleanup of LLVM's DataLayout strings. | Rafael Espindola | 2013-12-16 | 1 | -2/+4
  Produce them in the same order on every target. The order is that of getStringRepresentation: e|E-i*-f*-v*-a*-s*-n*-S*.
  llvm-svn: 197411
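The ordering quoted in this commit message can be illustrated with a small standalone sketch. It is not LLVM code: `canonicalizeLayoutOrder` and `specRank` are hypothetical helpers that only look at the first letter of each `-`-separated spec and sort by the order `e|E, i, f, v, a, s, n, S`. Specs with any other leading letter (for example pointer specs, whose position the message does not state) are simply kept last, which is an arbitrary choice made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Rank of a spec by its leading letter, following the order quoted in the
// commit message: e|E, i, f, v, a, s, n, S. Any other letter sorts last
// (an assumption of this sketch, not something the message specifies).
static std::size_t specRank(const std::string &Spec) {
  static const std::string Order = "eifvasnS";
  assert(!Spec.empty() && "empty specs are filtered out by the caller");
  char Key = (Spec[0] == 'E') ? 'e' : Spec[0];
  std::size_t Pos = Order.find(Key);
  return Pos == std::string::npos ? Order.size() : Pos;
}

// Hypothetical helper: reorder the '-'-separated specs of a layout string.
// The sort is stable, so specs of the same kind keep their relative order.
std::string canonicalizeLayoutOrder(const std::string &Layout) {
  std::vector<std::string> Specs;
  std::stringstream In(Layout);
  for (std::string Spec; std::getline(In, Spec, '-');)
    if (!Spec.empty())
      Specs.push_back(Spec);

  std::stable_sort(Specs.begin(), Specs.end(),
                   [](const std::string &A, const std::string &B) {
                     return specRank(A) < specRank(B);
                   });

  std::string Out;
  for (const std::string &Spec : Specs) {
    if (!Out.empty())
      Out += '-';
    Out += Spec;
  }
  return Out;
}
```

Under these assumptions, a string such as `n32-v64:64-i64:64-e` would be rewritten with the endianness spec first and the native-width spec last.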
* Structure R600's computeDataLayout more like every other target. | Rafael Espindola | 2013-12-16 | 1 | -8/+5
  While there, simplify "p3:32:32:32" to "p3:32:32".
  llvm-svn: 197407
* The preferred alignment defaults to the abi alignment. Omit if it is the same. | Rafael Espindola | 2013-12-16 | 1 | -3/+3
  llvm-svn: 197400
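As a rough illustration of the rule in this commit (and of the "p3:32:32:32" to "p3:32:32" simplification in r197407), the sketch below drops a trailing preferred-alignment field when it equals the ABI alignment. `dropRedundantPrefAlign` is a hypothetical helper, not an LLVM function, and it assumes the field shapes `p[n]:size:abi:pref` for pointer specs and `<type><size>:abi:pref` for everything else.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: if a single DataLayout spec spells out a preferred
// alignment that equals its ABI alignment, return the spec without it.
// Specs that are already short, or whose alignments differ, are unchanged.
std::string dropRedundantPrefAlign(const std::string &Spec) {
  std::vector<std::string> Fields;
  std::stringstream In(Spec);
  for (std::string Field; std::getline(In, Field, ':');)
    Fields.push_back(Field);
  if (Fields.empty() || Fields[0].empty())
    return Spec;

  // Assumed shapes: pointer specs carry four fields (p[n], size, abi, pref);
  // other specs carry three (<type><size>, abi, pref).
  const std::size_t Full = (Fields[0][0] == 'p') ? 4 : 3;
  if (Fields.size() != Full || Fields[Full - 1] != Fields[Full - 2])
    return Spec;

  std::string Out = Fields[0];
  for (std::size_t I = 1; I + 1 < Full; ++I)
    Out += ":" + Fields[I];
  assert(Out.size() < Spec.size() && "a field must have been dropped");
  return Out;
}
```

A spec such as `i64:32:64`, where the two alignments differ, passes through untouched, since the preferred alignment still carries information there.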
* Don't duplicate the DataLayout defaults for integer, floats and vectors. | Rafael Espindola | 2013-12-16 | 1 | -6/+1
  llvm-svn: 197398
* On DataLayout, omit the default of p:64:64:64. | Rafael Espindola | 2013-12-16 | 1 | -3/+1
  llvm-svn: 197397
* Turn AMDGPUSubtarget::getDataLayout into a static function. | Rafael Espindola | 2013-12-14 | 1 | -1/+24
  No functionality change.
  llvm-svn: 197310
* R600: Register AMDGPUCFGStructurizer pass | Tom Stellard | 2013-12-11 | 1 | -1/+1
  This enables -print-before-all to dump MachineInstrs after it is run.
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 197057
* R600: Register R600EmitClauseMarkers pass | Tom Stellard | 2013-12-11 | 1 | -1/+1
  This enables -print-before-all to dump MachineInstrs after it is run.
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 197056
* Add a RequireStructuredCFG Field to TargetMachine. | Vincent Lejeune | 2013-12-07 | 1 | -0/+1
  llvm-svn: 196634
* R600: Enable the IR structurizer by default | Tom Stellard | 2013-11-18 | 1 | -2/+1
  llvm-svn: 195031
* R600: Add a SubtargetFeature for disabling the ifcvt pass. | Tom Stellard | 2013-11-18 | 1 | -1/+2
  This is useful when writing test cases for the AMDIL structurizer.
  llvm-svn: 195029
* R600: Fix handling of vector kernel arguments | Tom Stellard | 2013-10-23 | 1 | -2/+3
  The SelectionDAGBuilder was promoting vector kernel arguments to legal types, but this won't work for R600 and SI since kernel arguments are stored in memory and can't be promoted. In order to handle vector arguments correctly we need to look at the original types from the LLVM IR function.
  llvm-svn: 193215
* R600: Simplify handling of private address space | Tom Stellard | 2013-10-22 | 1 | -6/+0
  The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation.

  For the private address space, we now:
  1. Lower private loads/stores to Register(Load|Store) instructions
  2. Reserve part of the register file as 'private memory'
  3. After regalloc, lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing.
  llvm-svn: 193179
* R600/SI: Add SinkingPass before ISel | Vincent Lejeune | 2013-10-13 | 1 | -0/+1
  llvm-svn: 192556
* R600: Use StructurizeCFGPass for non SI targets | Tom Stellard | 2013-10-10 | 1 | -1/+4
  The StructurizeCFG pass makes complex CFGs reducible. This lets many shaders from shadertoy (which exhibit complex control flow constructs) work correctly with respect to CFG handling, and helps us detect potential bugs in other parts of the backend.

  We provide a command-line argument to disable the pass for debugging purposes.

  Patch by: Vincent Lejeune
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 192363
* R600: add a pass that merges clauses. | Vincent Lejeune | 2013-10-01 | 1 | -2/+3
  llvm-svn: 191790
* Allow subtarget selection of the default MachineScheduler and document the interface. | Andrew Trick | 2013-09-20 | 1 | -8/+10
  The global registry is used to allow command line override of the scheduler selection, but does not work well as the normal selection API. For example, the same LLVM process should be able to target multiple targets or subtargets.
  llvm-svn: 191071
* R600/SI: Convert v16i8 resource descriptors to i128 | Tom Stellard | 2013-08-14 | 1 | -0/+1
  Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL.

  This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix:
  https://bugs.freedesktop.org/show_bug.cgi?id=66805
  llvm-svn: 188429
* R600/SI: Use VSrc_* register classes as the default classes for types | Tom Stellard | 2013-08-06 | 1 | -0/+2
  Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used to be emitted by isel like this:

    SGPR = COPY VGPR

  will now be emitted like this:

    VSrc = COPY VGPR

  This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies.
  llvm-svn: 187831
* Factor FlattenCFG out from SimplifyCFG | Tom Stellard | 2013-08-06 | 1 | -1/+1
  Patch by: Mei Ye
  llvm-svn: 187764
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions | Tom Stellard | 2013-07-27 | 1 | -0/+12
  Merge consecutive if-regions if they contain identical statements. Both transformations reduce the number of branches. The transformation is guarded by a target hook, and is currently enabled only for +R600, but the correctness has been tested on the X86 target using a variety of CPU benchmarks.

  Patch by: Mei Ye
  llvm-svn: 187278
* R600: Simplify AMDILCFGStructurize by removing templates and assuming single exit | Vincent Lejeune | 2013-07-19 | 1 | -1/+0
  llvm-svn: 186724
* R600: Do not predicate basic blocks with multiple ALU clauses | Vincent Lejeune | 2013-07-09 | 1 | -1/+4
  No test is included because it would be several thousand lines long. To exercise this functionality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long.

  NOTE: This is a candidate for the stable branch.
  llvm-svn: 185943
* Move StructurizeCFG out of R600 to generic Transforms. | Matt Arsenault | 2013-06-19 | 1 | -1/+1
  Register it with PassManager
  llvm-svn: 184343
* R600: Rework subtarget info and remove AMDILDevice classes | Tom Stellard | 2013-06-07 | 1 | -10/+10
  This should simplify the subtarget definitions and make it easier to add new ones.
  Reviewed-by: Vincent Lejeune <vljn@ovi.com>
  llvm-svn: 183566
* R600: Add a pass that merge Vector Register | Vincent Lejeune | 2013-06-05 | 1 | -0/+5
  Previously committed as r183279, but tests were failing, so it was reverted in r183286. It was broken because r183336 was missing; that change is now in.
  llvm-svn: 183343
* Revert "R600: Add a pass that merge Vector Register" | Rafael Espindola | 2013-06-05 | 1 | -5/+0
  This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing.
  llvm-svn: 183286
* R600: Add a pass that merge Vector Register | Vincent Lejeune | 2013-06-04 | 1 | -0/+5
  llvm-svn: 183279
* Fix a leak on the r600 backend. | Rafael Espindola | 2013-05-23 | 1 | -4/+4
  This should bring the valgrind bot back to life.
  llvm-svn: 182561
* R600: Improve texture handling | Vincent Lejeune | 2013-05-17 | 1 | -0/+2
  llvm-svn: 182125
* Remove the MachineMove class. | Rafael Espindola | 2013-05-13 | 1 | -0/+1
  It was just a less powerful and more confusing version of MCCFIInstruction. A side effect is that, since MCCFIInstruction uses dwarf register numbers, calls to getDwarfRegNum are pushed out, which should allow further simplifications.

  I left the MachineModuleInfo::addFrameMove interface unchanged since this patch was already fairly big.
  llvm-svn: 181680
* R600: Remove AMDILPeepholeOptimizer and replace optimizations with tablegen patterns | Tom Stellard | 2013-05-10 | 1 | -1/+0
  The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support.

  https://bugs.freedesktop.org/show_bug.cgi?id=64201

  Reviewed-by: Christian König <christian.koenig@amd.com>
  NOTE: This is a candidate for the 3.3 branch.
  llvm-svn: 181580
* R600: Packetize instructions | Vincent Lejeune | 2013-04-30 | 1 | -1/+2
  llvm-svn: 180760
* R600: Add support for native control flow | Vincent Lejeune | 2013-04-01 | 1 | -0/+1
  llvm-svn: 178505
* R600: Emit CF_ALU and use true kcache register. | Vincent Lejeune | 2013-04-01 | 1 | -0/+1
  llvm-svn: 178503
* R600/SI: rework input interpolation v2 | Christian Konig | 2013-03-07 | 1 | -5/+0
  v2: update CMakeLists.txt as well
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 176626
* R600: initial scheduler code | Vincent Lejeune | 2013-03-05 | 1 | -1/+16
  This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently it only tries to expose more parallelism for ALU instructions (this also makes the distribution of GPR channels more uniform and increases the chances of ALU instructions being packed together in a single VLIW group). It also tries to reduce clause switching by grouping instructions of the same kind (ALU/FETCH/CF) together.

  Vincent Lejeune:
   - Support for VLIW4 slot assignment
   - Recomputation of ScheduleDAG to get more parallelism opportunities

  Tom Stellard:
   - Fix assertion failure when trying to determine an instruction's slot based on its destination register's class
   - Fix some compiler warnings

  Vincent Lejeune: [v2]
   - Remove recomputation of ScheduleDAG (will be provided in a later patch)
   - Improve estimation of an ALU clause size so that the heuristic does not emit CF instructions at the wrong position.
   - Make schedule heuristic smarter using SUnit Depth
   - Take constant read limitations into account

  Vincent Lejeune: [v3]
   - Fix some uninitialized values in ConstPair
   - Add asserts to ensure an ALU slot is always populated
  llvm-svn: 176498
* R600: Remove LowerConstCopyPass and lower CONST_COPY right after ISel. | Vincent Lejeune | 2013-03-05 | 1 | -1/+0
  Keeping CONST_COPY instructions until pre-emit may prevent some ifcvt cases, and taking them into account for scheduling is difficult for no real benefit.
  llvm-svn: 176488
* R600/SI: cleanup literal handling v3 | Christian Konig | 2013-02-16 | 1 | -1/+0
  Seems to be a lot simpler, and also paves the way for further improvements.

  v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW, use VGPR0 in dummy EXP, avoid compiler warning, break after encoding the first literal.
  v3: correctly use V_ADD_F32_e64

  This is a candidate for the stable branch.

  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 175354
* R600: Support for indirect addressing v4 | Tom Stellard | 2013-02-06 | 1 | -0/+6
  Only implemented for R600 so far. SI is missing implementations of a few callbacks used by the Indirect Addressing pass and needs code to handle frame indices.

  At the moment R600 only supports array sizes of 16 dwords or less. Register packing of vector types is currently disabled, which means that a vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order to correctly pack registers in all cases, we will need to implement an analysis pass for R600 that determines the correct vector width for each array.

  v2:
   - Add support for i8 zext load from stack.
   - Coding style fixes
  v3:
   - Don't reserve registers for indirect addressing when it isn't being used.
   - Fix bug caused by LLVM limiting the number of SubRegIndex declarations.
  v4:
   - Fix 64-bit defines
  llvm-svn: 174525
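The difference between the unpacked layout described in this message (a vec4 in T0_X, T1_X, T2_X, T3_X) and a packed one (T0_XYZW) can be sketched as a mapping from dword index to register slot. This is only an illustration: the helper names are hypothetical, the array is assumed to start at T0, and the packed mapping shows what packing could look like, since the commit states that packing is disabled.

```cpp
#include <cassert>
#include <string>

// Unpacked layout from the commit message: dword i of an array lives in the
// X channel of its own register, i.e. T<i>_X. Starting at T0 is an assumption.
std::string unpackedSlot(unsigned Dword) {
  return "T" + std::to_string(Dword) + "_X";
}

// Hypothetical packed layout: four consecutive dwords share one register and
// use its X, Y, Z and W channels in turn.
std::string packedSlot(unsigned Dword) {
  static const char Channel[] = {'X', 'Y', 'Z', 'W'};
  return "T" + std::to_string(Dword / 4) + "_" + Channel[Dword % 4];
}

// Number of registers an array of the given size occupies in each layout.
unsigned registersNeeded(unsigned Dwords, bool Packed) {
  assert(Dwords <= 16 && "the commit limits arrays to 16 dwords");
  return Packed ? (Dwords + 3) / 4 : Dwords;
}
```

Under these assumptions a 16-dword array occupies 16 registers unpacked but would need only 4 if packed, which suggests why the message calls for an analysis pass to determine the correct vector width per array.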
* R600: Fold remaining CONST_COPY after expand pseudo inst | Tom Stellard | 2013-02-05 | 1 | -1/+1
  Patch by: Vincent Lejeune
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 174395
* R600: rework handling of the constants | Tom Stellard | 2013-01-23 | 1 | -0/+1
  Remove Cxxx registers, add new special register - "ALU_CONST" and new operand for each alu src - "sel". ALU_CONST is used to designate that the new operand contains the value to override src.sel, src.kc_bank, src.chan for constants in the driver.

  Patch by: Vadim Girlin

  Vincent Lejeune:
   - Use pointers for constants
   - Fold CONST_ADDRESS when possible

  Tom Stellard:
   - Give CONSTANT_BUFFER_0 its own address space
   - Use integer types for constant loads

  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 173222
* R600: Properly insert S_WAITCNT instructions | Tom Stellard | 2013-01-18 | 1 | -0/+5
  Some instructions, like memory reads/writes, are executed asynchronously, so we need to insert S_WAITCNT instructions to block before accessing their results. Previously we simply inserted an S_WAITCNT instruction after each async instruction; this patch fixes that and adds a proper insertion pass.

  Patch by: Christian König
  Tested-by: Michel Dänzer <michel.daenzer@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  Signed-off-by: Christian König <deathsimple@vodafone.de>
  llvm-svn: 172846
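The idea of waiting only where an asynchronous result is actually needed, rather than after every asynchronous instruction, can be sketched on a toy instruction list. This is not the pass from the commit: `Instr`, `insertWaits`, and the `WAIT` pseudo-instruction are hypothetical, registers are plain integers, and a single wait is assumed to drain every outstanding operation (the real S_WAITCNT takes counter operands, which this sketch ignores).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction: one optional destination register (-1 for none), a list
// of source registers, and a flag marking asynchronous execution.
struct Instr {
  std::string Name;
  int Def;
  std::vector<int> Uses;
  bool Async;
};

// Hypothetical pass: emit a WAIT only before an instruction that reads or
// overwrites a register whose asynchronous producer may still be in flight.
std::vector<Instr> insertWaits(const std::vector<Instr> &Program) {
  std::set<int> Pending; // registers written by not-yet-awaited async instrs
  std::vector<Instr> Out;
  for (const Instr &I : Program) {
    assert(I.Name != "WAIT" && "input is expected to contain no waits yet");
    bool NeedsWait = I.Def >= 0 && Pending.count(I.Def) != 0;
    for (int Reg : I.Uses)
      if (Pending.count(Reg) != 0)
        NeedsWait = true;
    if (NeedsWait) {
      Out.push_back({"WAIT", -1, {}, false});
      Pending.clear(); // assumption: one wait drains all outstanding work
    }
    if (I.Async && I.Def >= 0)
      Pending.insert(I.Def);
    Out.push_back(I);
  }
  return Out;
}
```

For two loads followed by an unrelated add and then an add that consumes both loaded values, this sketch emits a single wait before the final add, where the wait-after-every-async approach described in the message would emit two.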