path: root/llvm/lib/Target/R600/R600MachineScheduler.cpp
Commit message | Author | Age | Files | Lines
* Remove a few more calls to TargetMachine::getSubtarget from the R600 port. | Eric Christopher | 2015-02-19 | 1 | -1/+1
  llvm-svn: 229804
* [PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. | Chandler Carruth | 2015-02-13 | 1 | -1/+1
  This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general.
  The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager".
  llvm-svn: 229094
* Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument. | Eric Christopher | 2015-01-30 | 1 | -3/+2
  llvm-svn: 227638
* Fix float division-by-zero in R600 scheduler. | Alexey Samsonov | 2014-09-17 | 1 | -14/+18
  This bug was reported by UBSan.
  llvm-svn: 217967
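The commit above only states that UBSan flagged a float division by zero in the scheduler; the diff itself is not shown in this log. As a rough illustration of the general kind of fix, the sketch below checks the denominator before dividing. The name `safeRatio` and its `Fallback` parameter are hypothetical and are not taken from R600MachineScheduler.cpp.

```cpp
// Hypothetical helper, not from the LLVM source: compute Num / Den as a
// float, returning Fallback when Den is zero so that no floating-point
// division by zero is ever evaluated. That operation is what UBSan's
// -fsanitize=float-divide-by-zero check reports.
float safeRatio(unsigned Num, unsigned Den, float Fallback) {
  if (Den == 0)
    return Fallback;
  return static_cast<float>(Num) / static_cast<float>(Den);
}
```

A caller would pick a `Fallback` that makes sense for its heuristic, for example a value that makes the "empty" case lose every comparison.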
* R600: Remove unused include | Matt Arsenault | 2014-08-04 | 1 | -1/+0
  llvm-svn: 214728
* R600: Move AMDGPUInstrInfo from AMDGPUTargetMachine into AMDGPUSubtarget | Tom Stellard | 2014-06-13 | 1 | -0/+1
  llvm-svn: 210869
* [C++] Use 'nullptr'. Target edition. | Craig Topper | 2014-04-25 | 1 | -5/+5
  llvm-svn: 207197
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Target/... edition. | Chandler Carruth | 2014-04-22 | 1 | -2/+2
  llvm-svn: 206842
* Factor MI-Sched in preparation for post-ra scheduling support. | Andrew Trick | 2013-12-28 | 1 | -4/+3
  Factor the MachineFunctionPass into MachineSchedulerBase. Split the DAG class into ScheduleDAGMI and ScheduleDAGMILive.
  llvm-svn: 198119
* R600: Fix scheduling of instructions that use the LDS output queue | Tom Stellard | 2013-11-15 | 1 | -32/+0
  The LDS output queue is accessed via the OQAP register. The OQAP register cannot be live across clauses, so if a value is written to the output queue, it must be retrieved before the end of the clause. With the machine scheduler, we cannot satisfy this constraint, because it lacks proper alias analysis and it will mark some LDS accesses as having a chain dependency on vertex fetches. Since vertex fetches require a new clause, the dependency may end up splitting OQAP uses and defs so they end up in different clauses. See the lds-output-queue.ll test for a more detailed explanation.
  To work around this issue, we now combine the LDS read and the OQAP copy into one instruction and expand it after register allocation.
  This patch also adds some checks to the EmitClauseMarker pass, so that it doesn't end a clause with a value still in the output queue, and removes AR.X and OQAP handling from the scheduler (AR.X uses and defs were already being expanded post-RA, so the scheduler will never see them).
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 194755
* R600: Don't use trans slot for instructions that read LDS source registers | Tom Stellard | 2013-09-12 | 1 | -0/+4
  This fixes some regressions in the piglit local memory store tests introduced by recent commits which made the scheduler aware of the trans slot.
  It's not possible to test this using lit, because there is no way to determine from the assembly dumps whether or not an instruction is in the trans slot. Even if this were possible, the test would be highly sensitive to changes in the scheduler and might generate confusing false negatives.
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 190574
* R600: Non vector only instruction can be scheduled on trans unit | Vincent Lejeune | 2013-09-04 | 1 | -12/+21
  llvm-svn: 189980
* Revert "R600: Non vector only instruction can be scheduled on trans unit" | Tom Stellard | 2013-07-31 | 1 | -21/+12
  This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6.
  llvm-svn: 187526
* R600: Non vector only instruction can be scheduled on trans unit | Vincent Lejeune | 2013-07-31 | 1 | -12/+21
  llvm-svn: 187514
* R600: Support schedule and packetization of trans-only inst | Vincent Lejeune | 2013-06-29 | 1 | -7/+18
  llvm-svn: 185268
* R600: Add local memory support via LDS | Tom Stellard | 2013-06-28 | 1 | -2/+10
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 185162
* R600: Add support for GROUP_BARRIER instruction | Tom Stellard | 2013-06-28 | 1 | -1/+5
  Reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 185161
* R600: Use a refined heuristic to choose when switching clause | Vincent Lejeune | 2013-06-07 | 1 | -9/+43
  This is using a hint from the AMD APP OpenCL Programming Guide with empirically tweaked parameters. I used Unigine Heaven 3.0 to determine the best parameters on my system (i7 2600/Radeon 6950/Kernel 3.9.4). The benchmark went from 38.8 average fps to 39.6, which is a ~2% gain. (The Lightmark 2008.2 gain is much more marginal: from 537 to 539.)
  There is no lit test provided, as the parameters were determined empirically and it would be nearly impossible to find a test program that checks for optimal behavior.
  llvm-svn: 183593
* R600: Rework subtarget info and remove AMDILDevice classes | Tom Stellard | 2013-06-07 | 1 | -0/+1
  This should simplify the subtarget definitions and make it easier to add new ones.
  Reviewed-by: Vincent Lejeune <vljn@ovi.com>
  llvm-svn: 183566
* R600: Remove leftover code in R600MachineScheduler.cpp | Vincent Lejeune | 2013-06-06 | 1 | -16/+0
  Spotted by Benjamin Kramer.
  llvm-svn: 183413
* R600: Schedule copy from phys register at beginning of block | Vincent Lejeune | 2013-06-05 | 1 | -1/+31
  It allows the regalloc pass to remove them by trivially assigning the associated reg.
  llvm-svn: 183336
* R600: Make sure to schedule AR register uses and defs in the same clause | Tom Stellard | 2013-06-05 | 1 | -2/+34
  Reviewed-by: vljn at ovi.com
  llvm-svn: 183294
* Move passes from namespace llvm into anonymous namespaces. Sort includes while there. | Benjamin Kramer | 2013-05-23 | 1 | -1/+1
  llvm-svn: 182594
* R600: Use bottom up scheduling algorithm | Vincent Lejeune | 2013-05-17 | 1 | -23/+28
  llvm-svn: 182129
* R600: Use depth first scheduling algorithm | Vincent Lejeune | 2013-05-17 | 1 | -52/+26
  It should increase PV substitution opportunities and lower GPR usage (pending computation paths are "flushed" sooner).
  llvm-svn: 182128
* R600: Replace big texture opcode switch in scheduler by usesTC/usesVC | Vincent Lejeune | 2013-05-17 | 1 | -23/+3
  llvm-svn: 182127
* R600: Relax some vector constraints on Dot4. | Vincent Lejeune | 2013-05-17 | 1 | -2/+2
  Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the register coalescer to remove some unneeded COPYs. This patch also defines some structures/functions that can be used to handle every vector instruction (CUBE, Cayman special instructions...) in a similar fashion.
  llvm-svn: 182126
* R600: Factorize Fetch size limit inside AMDGPUSubTarget | Vincent Lejeune | 2013-05-17 | 1 | -8/+4
  llvm-svn: 182122
* R600: Factorize maximum alu per clause in a single location | Vincent Lejeune | 2013-04-03 | 1 | -1/+1
  llvm-svn: 178667
* R600: Factorize code handling Const Read Port limitation | Vincent Lejeune | 2013-03-14 | 1 | -68/+7
  llvm-svn: 177078
* R600MachineScheduler.cpp: Fix use cases of dbgs(). Don't include <iostream> here. | NAKAMURA Takumi | 2013-03-11 | 1 | -1/+2
  llvm-svn: 176797
* R600: initial scheduler code | Vincent Lejeune | 2013-03-05 | 1 | -0/+487
  This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently it only tries to expose more parallelism for ALU instructions (this also makes the distribution of GPR channels more uniform and increases the chances of ALU instructions being packed together in a single VLIW group). It also tries to reduce clause switching by grouping instructions of the same kind (ALU/FETCH/CF) together.
  Vincent Lejeune:
  - Support for VLIW4 Slot assignment
  - Recomputation of ScheduleDAG to get more parallelism opportunities
  Tom Stellard:
  - Fix assertion failure when trying to determine an instruction's slot based on its destination register's class
  - Fix some compiler warnings
  Vincent Lejeune: [v2]
  - Remove recomputation of ScheduleDAG (will be provided in a later patch)
  - Improve estimation of an ALU clause size so that the heuristic does not emit CF instructions at the wrong position
  - Make schedule heuristic smarter using SUnit Depth
  - Take constant read limitations into account
  Vincent Lejeune: [v3]
  - Fix some uninitialized values in ConstPair
  - Add asserts to ensure an ALU slot is always populated
  llvm-svn: 176498
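The grouping idea in the initial scheduler commit (keep picking instructions of the same kind so that fewer clause switches are needed) can be sketched in a few lines. This is a toy model under stated assumptions, not the strategy's actual code: the `InstKind` enum, the `pickNextKind` function, and the "largest ready queue wins" fallback are all hypothetical.

```cpp
#include <array>
#include <cstddef>

// Hypothetical instruction kinds, mirroring the ALU/FETCH/CF split named in
// the commit message. NumKinds doubles as a "nothing selected" marker.
enum InstKind { ALU = 0, FETCH = 1, CF = 2, NumKinds = 3 };

// Toy selection rule: stay in the current clause kind while instructions of
// that kind are ready; otherwise switch to the kind with the most ready
// instructions. Returns NumKinds when nothing is ready.
InstKind pickNextKind(InstKind Current,
                      const std::array<std::size_t, NumKinds> &Ready) {
  if (Current != NumKinds && Ready[Current] > 0)
    return Current; // no clause switch needed

  InstKind Best = NumKinds;
  std::size_t BestCount = 0;
  for (int K = 0; K < NumKinds; ++K) {
    if (Ready[K] > BestCount) {
      BestCount = Ready[K];
      Best = static_cast<InstKind>(K);
    }
  }
  return Best;
}
```

The later commits in this log replace such a naive rule with a tuned heuristic for when to switch clauses; the sketch only shows the basic "stay in the same kind" preference.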