path: root/llvm/lib/Target/R600/R600MachineScheduler.h
Commit message | Author | Age | Files | Lines
* R600 -> AMDGPU rename | Tom Stellard | 2015-06-13 | 1 | -103/+0
  llvm-svn: 239657
* Canonicalize header guards into a common format. | Benjamin Kramer | 2014-08-13 | 1 | -2/+2
  Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful).

  Changes made by clang-tidy with minor tweaks.

  llvm-svn: 215558
* [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. | Craig Topper | 2014-04-29 | 1 | -7/+6
  R600 edition

  llvm-svn: 207503
* [C++] Use 'nullptr'. | Craig Topper | 2014-04-28 | 1 | -1/+1
  llvm-svn: 207394
* Factor MI-Sched in preparation for post-ra scheduling support. | Andrew Trick | 2013-12-28 | 1 | -1/+1
  Factor the MachineFunctionPass into MachineSchedulerBase. Split the DAG class into ScheduleDAGMI and ScheduleDAGMILive.

  llvm-svn: 198119
* R600: Fix scheduling of instructions that use the LDS output queue | Tom Stellard | 2013-11-15 | 1 | -2/+0
  The LDS output queue is accessed via the OQAP register. The OQAP register cannot be live across clauses, so if a value is written to the output queue, it must be retrieved before the end of the clause. With the machine scheduler, we cannot satisfy this constraint, because it lacks proper alias analysis and it will mark some LDS accesses as having a chain dependency on vertex fetches. Since vertex fetches require a new clause, the dependency may end up splitting OQAP uses and defs so they end up in different clauses. See the lds-output-queue.ll test for a more detailed explanation.

  To work around this issue, we now combine the LDS read and the OQAP copy into one instruction and expand it after register allocation.

  This patch also adds some checks to the EmitClauseMarker pass, so that it doesn't end a clause with a value still in the output queue, and removes AR.X and OQAP handling from the scheduler (AR.X uses and defs were already being expanded post-RA, so the scheduler will never see them).

  Reviewed-by: Vincent Lejeune <vljn at ovi.com>

  llvm-svn: 194755
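The clause constraint described in the commit above can be illustrated with a toy check. This is only a sketch under stated assumptions: the `Op` enum and `queueDrainedAtClauseEnds` are hypothetical names invented for illustration, not code from the EmitClauseMarker pass, and the model treats the end of the sequence as the end of a clause.

```cpp
#include <vector>

// Hypothetical operation kinds for a toy instruction stream (not LLVM types).
enum class Op { QueueWrite, QueueRead, Other, ClauseEnd };

// Returns true if no clause ends while a value is still waiting in the
// output queue. The end of the sequence is treated as a clause end.
bool queueDrainedAtClauseEnds(const std::vector<Op> &Ops) {
  int Pending = 0; // values written to the queue but not yet read back
  for (Op O : Ops) {
    switch (O) {
    case Op::QueueWrite:
      ++Pending;
      break;
    case Op::QueueRead:
      if (Pending > 0)
        --Pending;
      break;
    case Op::ClauseEnd:
      if (Pending != 0)
        return false; // a value would have to stay live across clauses
      break;
    case Op::Other:
      break;
    }
  }
  return Pending == 0;
}
```

In this model, a write followed by a clause end and then the read is rejected, which is the situation the combined LDS-read-plus-copy instruction is meant to rule out.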
* R600: Non vector only instruction can be scheduled on trans unit | Vincent Lejeune | 2013-09-04 | 1 | -2/+3
  llvm-svn: 189980
* Revert "R600: Non vector only instruction can be scheduled on trans unit" | Tom Stellard | 2013-07-31 | 1 | -3/+2
  This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6.

  llvm-svn: 187526
* R600: Non vector only instruction can be scheduled on trans unit | Vincent Lejeune | 2013-07-31 | 1 | -2/+3
  llvm-svn: 187514
* R600: Support schedule and packetization of trans-only inst | Vincent Lejeune | 2013-06-29 | 1 | -0/+1
  llvm-svn: 185268
* R600: Use a refined heuristic to choose when switching clause | Vincent Lejeune | 2013-06-07 | 1 | -1/+4
  This is using a hint from the AMD APP OpenCL Programming Guide with empirically tweaked parameters. I used Unigine Heaven 3.0 to determine the best parameters on my system (i7 2600/Radeon 6950/Kernel 3.9.4): the benchmark went from 38.8 average fps to 39.6, which is roughly a 2% gain. (The Lightmark 2008.2 gain is much more marginal: from 537 to 539.)

  There is no lit test provided, as the parameters were determined empirically and it would be nearly impossible to find a test program that checks for optimal behavior.

  llvm-svn: 183593
* R600: Schedule copy from phys register at beginning of block | Vincent Lejeune | 2013-06-05 | 1 | -0/+1
  It allows the regalloc pass to remove them by trivially assigning the associated reg.

  llvm-svn: 183336
* R600: Make sure to schedule AR register uses and defs in the same clause | Tom Stellard | 2013-06-05 | 1 | -0/+2
  Reviewed-by: vljn at ovi.com

  llvm-svn: 183294
* Move passes from namespace llvm into anonymous namespaces. Sort includes while there. | Benjamin Kramer | 2013-05-23 | 1 | -1/+1
  llvm-svn: 182594
* R600: Use bottom up scheduling algorithm | Vincent Lejeune | 2013-05-17 | 1 | -1/+1
  llvm-svn: 182129
* R600: Use depth first scheduling algorithm | Vincent Lejeune | 2013-05-17 | 1 | -27/+5
  It should increase PV substitution opportunities and lower GPR usage (pending computation paths are "flushed" sooner).

  llvm-svn: 182128
* R600: Factorize code handling Const Read Port limitation | Vincent Lejeune | 2013-03-14 | 1 | -2/+1
  llvm-svn: 177078
* R600: initial scheduler code | Vincent Lejeune | 2013-03-05 | 1 | -0/+121
  This is a skeleton for a pre-RA MachineInstr scheduler strategy.

  Currently it only tries to expose more parallelism for ALU instructions (this also makes the distribution of GPR channels more uniform and increases the chances of ALU instructions being packed together in a single VLIW group). It also tries to reduce clause switching by grouping instructions of the same kind (ALU/FETCH/CF) together.

  Vincent Lejeune:
  - Support for VLIW4 slot assignment
  - Recomputation of ScheduleDAG to get more parallelism opportunities

  Tom Stellard:
  - Fix assertion failure when trying to determine an instruction's slot based on its destination register's class
  - Fix some compiler warnings

  Vincent Lejeune: [v2]
  - Remove recomputation of ScheduleDAG (will be provided in a later patch)
  - Improve estimation of an ALU clause size so that the heuristic does not emit CF instructions at the wrong position
  - Make the schedule heuristic smarter using SUnit Depth
  - Take constant read limitations into account

  Vincent Lejeune: [v3]
  - Fix some uninitialized values in ConstPair
  - Add asserts to ensure an ALU slot is always populated

  llvm-svn: 176498
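The clause-grouping idea from the initial scheduler commit can be illustrated with a small standalone sketch. It is not the R600 scheduler: `Kind`, `countClauseSwitches`, and `pickNext` are hypothetical names made up for illustration, and the model assumes that every change of instruction kind between neighbours starts a new clause.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds, mirroring the ALU/FETCH/CF grouping
// named in the commit message (not LLVM types).
enum class Kind { ALU, FETCH, CF };

// Count how often consecutive instructions change kind. In this toy model
// each change corresponds to one clause switch.
std::size_t countClauseSwitches(const std::vector<Kind> &Seq) {
  std::size_t Switches = 0;
  for (std::size_t I = 1; I < Seq.size(); ++I)
    if (Seq[I] != Seq[I - 1])
      ++Switches;
  return Switches;
}

// Greedy choice among ready instructions: prefer one whose kind matches the
// clause currently being built, otherwise fall back to the first candidate.
// Returns an index into Ready; Ready must not be empty.
std::size_t pickNext(const std::vector<Kind> &Ready, Kind Current) {
  for (std::size_t I = 0; I < Ready.size(); ++I)
    if (Ready[I] == Current)
      return I;
  return 0;
}
```

Under this model, an interleaved order ALU, FETCH, ALU, FETCH has three switches, while the grouped order ALU, ALU, FETCH, FETCH has one. A real strategy also has to respect data dependencies and clause size limits, which this sketch ignores.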