bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Revert scheduling to reduce spilling	Stanislav Mekhanoshin	2020-01-03	1	-2/+11
	We can revert region schedule if new schedule decreases occupancy. However, if we already have only one wave we would accept any new schedule even if it blows up register pressure. Such schedule may result in quite heavy spilling which can be avoided if we reject this new schedule. Differential Revision: https://reviews.llvm.org/D72181
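The acceptance rule described in this message can be sketched as a small decision function. This is a toy model under stated assumptions, not the code from the patch: `RegionStats`, `shouldRevertSchedule`, and the `MaxVGPRs`/`MaxSGPRs` budgets are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical summary of one scheduling region; not an LLVM type.
struct RegionStats {
  unsigned Occupancy; // waves that can run concurrently with this schedule
  unsigned VGPRs;     // vector registers the schedule needs
  unsigned SGPRs;     // scalar registers the schedule needs
};

// Returns true if the new schedule should be discarded in favor of the
// original instruction order. MaxVGPRs and MaxSGPRs are assumed register
// budgets beyond which values would have to spill.
bool shouldRevertSchedule(const RegionStats &Before, const RegionStats &After,
                          unsigned MaxVGPRs, unsigned MaxSGPRs) {
  assert(Before.Occupancy >= 1 && After.Occupancy >= 1 &&
         "a schedule always runs at least one wave");

  // Existing rule: never accept a schedule that lowers occupancy.
  if (After.Occupancy < Before.Occupancy)
    return true;

  // With more than one wave left, the occupancy check already bounds pressure.
  if (After.Occupancy > 1)
    return false;

  // At one wave occupancy cannot drop further, so look at pressure directly:
  // revert when the new schedule exceeds the budget and is worse than before.
  bool OverBudget = After.VGPRs > MaxVGPRs || After.SGPRs > MaxSGPRs;
  bool Worse = After.VGPRs > Before.VGPRs || After.SGPRs > Before.SGPRs;
  return OverBudget && Worse;
}
```

The one-wave branch is the case the message singles out: occupancy alone can no longer signal a bad schedule there, so the sketch falls back to comparing register counts.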
*	[AMDGPU] Add VerifyScheduling support.	Jay Foad	2019-10-01	1	-0/+18
	Summary: This is cut and pasted from the corresponding GenericScheduler functions. Reviewers: arsenm, atrick, tstellar, vpykhtin Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68264 llvm-svn: 373346
*	Add tracing in pickNodeFromQueue.	Jay Foad	2019-09-25	1	-0/+1
	This matches GenericScheduler::pickNodeFromQueue, from which this function was mostly cut and pasted. llvm-svn: 372829
*	AMDGPU: Fix typo	Matt Arsenault	2019-09-06	1	-4/+4
	llvm-svn: 371249
*	AMDGPU: Avoid constructing new std::vector in initCandidate	Matt Arsenault	2019-09-05	1	-2/+2
	Approximately 30% of the time was spent in the std::vector constructor. In one testcase this pushes the scheduler to being the second slowest pass. I'm not sure I understand why these vectors are necessary. The default scheduler initCandidate seems to use some pre-existing vectors for the pressure. llvm-svn: 371136
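The message does not say how the construction was avoided. One common approach, shown here purely as an assumption, is to keep a scratch vector alive across calls and refill it instead of building a new one each time. `PressureScratch` and `reset` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder for a reusable pressure buffer. The vector lives as
// long as the holder, so a hot function can refill it on every call instead
// of constructing and destroying a fresh std::vector each time.
class PressureScratch {
  std::vector<unsigned> Pressure;

public:
  // Zero the buffer for NumPressureSets entries and hand it back.
  // Storage from earlier calls is reused whenever it is large enough.
  std::vector<unsigned> &reset(std::size_t NumPressureSets) {
    assert(NumPressureSets > 0 && "expected at least one pressure set");
    Pressure.assign(NumPressureSets, 0u);
    return Pressure;
  }
};
```

A caller would hold one `PressureScratch` next to the scheduling strategy and call `reset` at the top of the hot function.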
*	[AMDGPU] Speed up live-in virtual register set computation in GCNScheduleDAGMILive.	Valery Pykhtin	2019-06-18	1	-2/+26
	Differential revision: https://reviews.llvm.org/D62401 llvm-svn: 363661
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	1	-2/+2
	Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	[AMDGPU] Small refactoring in the scheduler	Stanislav Mekhanoshin	2018-06-04	1	-18/+3
	After last changes some code can be simplified. Differential Revision: https://reviews.llvm.org/D47661 llvm-svn: 333934
*	[AMDGPU] Track occupancy in MFI	Stanislav Mekhanoshin	2018-05-31	1	-7/+4
	Keep track of achieved occupancy in SIMachineFunctionInfo. At the moment we have a lot of duplicated or even missed code to query and maintain occupancy info. Record it in the MFI and query in a single call. Interfaces: - getOccupancy() - returns current recorded achieved occupancy. - getMinAllowedOccupancy() - returns lesser of the achieved occupancy and the lowest occupancy we are ready to tolerate. For example if a kernel is memory bound we are ready to tolerate 4 waves. - limitOccupancy() - record occupancy level if we have to lower it. - increaseOccupancy() - record occupancy if scheduler managed to increase the occupancy. MFI takes care of integrating different checks affecting occupancy, including LDS use and waves-per-eu attribute. Note that scheduler starts with not yet known register pressure, so has to record either limit or increase in occupancy after it is done. Later passes can just query a resulting value. New interface is used in the active scheduler and NFC wrt its work. Changes are also made to experimental schedulers to use it and record an occupancy after they are done. Before the change waves-per-eu was ignored by experimental schedulers and tolerance window for memory bound kernels was not used. Differential Revision: https://reviews.llvm.org/D47509 llvm-svn: 333629
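The four interfaces listed in this message can be modelled with a small standalone class. This is a simplified sketch, not SIMachineFunctionInfo: the class name `OccupancyInfo`, its constructor parameters, and the `MemoryBoundFloor` constant are assumptions, and the real implementation also folds in LDS use and the waves-per-eu attribute.

```cpp
#include <algorithm>
#include <cassert>

// Tolerated floor for memory-bound kernels; 4 waves is the example value
// given in the commit message.
constexpr unsigned MemoryBoundFloor = 4;

// Hypothetical stand-in for the occupancy bookkeeping described above.
class OccupancyInfo {
  unsigned Occupancy;    // currently recorded achieved occupancy, in waves
  unsigned MaxOccupancy; // assumed upper bound from hardware or attributes
  bool MemoryBound;      // perf hint: kernel is suspected to be memory bound

public:
  OccupancyInfo(unsigned MaxWaves, bool IsMemoryBound)
      : Occupancy(MaxWaves), MaxOccupancy(MaxWaves),
        MemoryBound(IsMemoryBound) {
    assert(MaxWaves >= 1 && "need at least one wave");
  }

  // Current recorded achieved occupancy.
  unsigned getOccupancy() const { return Occupancy; }

  // Lesser of the achieved occupancy and the lowest occupancy tolerated.
  unsigned getMinAllowedOccupancy() const {
    return MemoryBound ? std::min(Occupancy, MemoryBoundFloor) : Occupancy;
  }

  // Record a lower occupancy; never raises the recorded value.
  void limitOccupancy(unsigned Limit) {
    Occupancy = std::min(Occupancy, Limit);
  }

  // Record a higher occupancy, clamped to the assumed upper bound.
  void increaseOccupancy(unsigned Achieved) {
    Occupancy = std::min(std::max(Occupancy, Achieved), MaxOccupancy);
  }
};
```

In this model a scheduler would call `limitOccupancy` or `increaseOccupancy` once it knows the final register pressure, and later passes would only call the two getters.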
*	[AMDGPU] Add perf hints to functions	Stanislav Mekhanoshin	2018-05-25	1	-1/+11
	This is adoption of HSAIL perfhint pass. Two types of hints are produced: 1. Function is memory bound. 2. Kernel can use wave limiter. Currently these hints are used in the scheduler. If a function is suspected to be memory bound we allow occupancy to decrease to 4 waves in the course of scheduling. Differential Revision: https://reviews.llvm.org/D46992 llvm-svn: 333289
*	Rename DEBUG macro to LLVM_DEBUG.	Nicola Zaghen	2018-05-14	1	-40/+37
	The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
*	[AMDGPU] Fix amdgpu-waves-per-eu accounting in scheduler	Stanislav Mekhanoshin	2018-05-12	1	-2/+5
	We cannot query this attribute from a subtarget given a machine function. At this point attribute itself is already unavailable and can only be obtained through MFI. Differential Revision: https://reviews.llvm.org/D46781 llvm-svn: 332166
*	[DebugInfo] Examine all uses of isDebugValue() for debug instructions.	Shiva Chen	2018-05-09	1	-3/+3
	Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844
*	[NFC] fix trivial typos in comments	Hiroshi Inoue	2018-01-22	1	-1/+1
	"the the" -> "the" llvm-svn: 323074
*	MachineFunction: Return reference from getFunction(); NFC	Matthias Braun	2017-12-15	1	-2/+2
	The Function can never be nullptr so we can return a reference. llvm-svn: 320884
*	Recommit CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value	Yaxun Liu	2017-12-15	1	-8/+11
	The regression on ppc64 was not due to this commit. llvm-svn: 320788
*	Revert CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value	Yaxun Liu	2017-12-14	1	-11/+8
	This commit might have caused regression on ppc64. Revert it to verify that. llvm-svn: 320712
*	CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value	Yaxun Liu	2017-12-13	1	-8/+11
	Two issues were found about machine inst scheduler when compiling ProRender with -g for amdgcn target: GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it should not since DBG_VALUE is not mapped in LiveIntervals. when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D41132 llvm-svn: 320650
*	AMDGPU: Fix crash when scheduling DBG_VALUE	Matt Arsenault	2017-12-05	1	-1/+5
	This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732
*	[CodeGen] Unify MBB reference format in both MIR and debug output	Francis Visoiu Mistrih	2017-12-04	1	-3/+2
	As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(\1)/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
*	[CodeGen] Rename DEBUG_TYPE to match passnames	Evandro Menezes	2017-07-11	1	-1/+1
	Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 llvm-svn: 307719
*	[AMDGPU] Use GCNRPTracker dumper methods in scheduler	Stanislav Mekhanoshin	2017-05-16	1	-18/+7
	Differential Revision: https://reviews.llvm.org/D33244 llvm-svn: 303186
*	[AMDGPU] Cache live-ins and register pressure in scheduler	Stanislav Mekhanoshin	2017-05-16	1	-71/+140
	Using LIS can be quite expensive, so caching of calculated region live-ins and pressure is implemented. It does two things: 1. Caches the info for the second stage when we schedule with decreased target occupancy. 2. Tracks the basic block from top to bottom thus eliminating the need to scan whole register file liveness at every region split in the middle of the block. The scheduling is now done in 3 stages instead of two, with the first one being really a no-op and only used to collect scheduling regions as sent by the scheduler driver. There is no functional change to the current behavior, only compilation speed is affected. In general computeBlockPressure() could be simplified if we switch to backward RP tracker, because scheduler sends regions within a block starting from the last upward. We could use a natural order of upward tracker to seamlessly change between regions of the same block, since live reg set of a previous tracked region would become a live-out of the next region. That however requires fixing upward tracker to properly account defs and uses of the same instruction as both are contributing to the current pressure. When we converge on the produced pressure we should be able to switch between them back and forth. In addition, backward tracker is less expensive as it uses LIS in recede less often than forward uses it in advance. At the moment the worst known case compilation time has improved from 26 minutes to 8.5. Differential Revision: https://reviews.llvm.org/D33117 llvm-svn: 303184
*	[AMDGPU] Turn register pressure estimation into forward tracker	Stanislav Mekhanoshin	2017-05-16	1	-112/+39
	This factors register pressure estimation mechanism from the GCNSchedStrategy into the forward tracker to unify interface with other strategies and expose it to other interested phases. Differential Revision: https://reviews.llvm.org/D33105 llvm-svn: 303179
*	[AMDGPU] Fix incorrect register pressure calculation	Stanislav Mekhanoshin	2017-05-11	1	-2/+3
	Earlier fix D32572 introduced a bug where live-ins were calculated for basic block instead of scheduling region. This change fixes it. Differential Revision: https://reviews.llvm.org/D33086 llvm-svn: 302812
*	AMDGPU: Fix assert in scheduler	Konstantin Zhuravlyov	2017-04-27	1	-1/+2
	Assert is triggered if DBG_VALUE is first instruction in BB Differential Revision: https://reviews.llvm.org/D32572 llvm-svn: 301511
*	[AMDGPU] Fix recorded region boundaries in max-occupancy scheduler	Stanislav Mekhanoshin	2017-03-28	1	-10/+5
	This is incorrect to record region boundaries before scheduling, it may change after scheduling. As a result second pass may see less instructions to schedule than it should. Differential Revision: https://reviews.llvm.org/D31434 llvm-svn: 298945
*	[AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler	Valery Pykhtin	2017-03-21	1	-3/+1
	Differential revision: https://reviews.llvm.org/D31046 llvm-svn: 298368
*	[AMDGPU] Remove getBidirectionalReasonRank	Stanislav Mekhanoshin	2017-03-11	1	-13/+1
	This method inverts the Reason field of a scheduling candidate. It does right comparison between RegCritical and RegExcess, but everything else is broken. In fact it can prefer less strong reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536
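The ranking bug described here can be reproduced with a toy enum. This is a sketch under assumptions: the enum below is a hypothetical subset ordered from strongest to weakest reason, and `invertedRank`, `brokenPrefers`, and `sortedPrefers` are illustrative names, not the removed LLVM function.

```cpp
#include <cassert>

// Hypothetical subset of candidate reasons, strongest first, mirroring the
// statement that the real CandReason enum is "properly sorted".
enum CandReason : int {
  NoCand,
  Only1,
  PhysReg,
  RegExcess,
  RegCritical,
  Stall,
  Cluster,
  Weak
};

// Assumed shape of the removed ranking: negate the two pressure reasons and
// leave everything else unchanged. A larger rank is treated as preferred.
int invertedRank(CandReason R) {
  switch (R) {
  case RegExcess:
  case RegCritical:
    return -static_cast<int>(R);
  default:
    return static_cast<int>(R);
  }
}

// Broken comparison: correct for RegExcess vs RegCritical only.
bool brokenPrefers(CandReason A, CandReason B) {
  return invertedRank(A) > invertedRank(B);
}

// Comparison on the sorted enum itself: the smaller value is the stronger
// reason, so no artificial ranking is needed.
bool sortedPrefers(CandReason A, CandReason B) {
  assert(A != NoCand && B != NoCand && "both candidates need a reason");
  return A < B;
}
```

With these definitions `brokenPrefers(Weak, RegCritical)` is true because 7 > -4, which is the inversion the message points out, while `sortedPrefers(RegCritical, Weak)` gives the intended order.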
*	[AMDGPU] Add second pass of the scheduler	Stanislav Mekhanoshin	2017-02-28	1	-5/+97
	If during scheduling we have identified that we cannot keep optimistic occupancy, increase critical register pressure limit and try scheduling of the whole function again. In this case blocks with smaller pressure will have a chance for better scheduling. Differential Revision: https://reviews.llvm.org/D30442 llvm-svn: 296506
*	[AMDGPU] New method to estimate register pressure	Stanislav Mekhanoshin	2017-02-28	1	-21/+135
	This change introduces new method to estimate register pressure in GCNScheduler. Standard RPTracker gives huge error due to the following reasons: 1. It does not account for live-ins or live-outs if value is not used in the region itself. That creates a huge error in a very common case if there are a lot of live-thu registers. 2. It does not properly count subregs. 3. It assumes a register used as an input operand can be reused as an output. This is not always possible by itself, this is not what RA will finally do in many cases for various reasons not limited to RA's inability to do so, and this is not so if the value is actually a live-thu. In addition we can now see clear separation between live-in pressure which we cannot change with the scheduling and tentative pressure which we can change. Differential Revision: https://reviews.llvm.org/D30439 llvm-svn: 296491
*	[AMDGPU] Fix read-undef flags when schedule is reverted	Stanislav Mekhanoshin	2017-02-28	1	-12/+15
	If two subregs of the same register are defined and we need to revert schedule changing def order, we will end up with both instructions having def,read-undef flags because adjustLaneLiveness() will only set this flag but will not remove it. Fix this by removing read-undef flags before calling adjustLaneLiveness. Differential Revision: https://reviews.llvm.org/D30428 llvm-svn: 296484
*	[AMDGPU] Revert failed scheduling	Stanislav Mekhanoshin	2017-02-15	1	-29/+88
	This patch reverts region's scheduling to the original untouched state in case if we have decreased occupancy. In addition it switches to use TargetRegisterInfo occupancy callback for pressure limits instead of gradually increasing limits which were just passed by. We are going to stay with the best schedule so we do not need to tolerate worsened scheduling anymore. Differential Revision: https://reviews.llvm.org/D29971 llvm-svn: 295206
*	[AMDGPU] Move register related queries to subtarget class	Konstantin Zhuravlyov	2017-02-08	1	-2/+2
	Differential Revision: https://reviews.llvm.org/D29318 llvm-svn: 294440
*	[AMDGPU] Fix GCNSchedStrategy.cpp debug output	Stanislav Mekhanoshin	2017-02-06	1	-2/+2
	There is typo in the debug output: top and bottom candidates are switched. Differential Revision: https://reviews.llvm.org/D29608 llvm-svn: 294257
*	[AMDGPU] Account workgroup size in LDS occupancy limits	Stanislav Mekhanoshin	2017-02-01	1	-1/+2
	Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers have to be adjusted for bigger workgroups. For example a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems are still limited by the same number, but 4 subgroups of 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837
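The workgroup-size adjustment can be illustrated with a little arithmetic. This is a rough sketch, not the subtarget code: `occupancyWithLDS`, `wavesPerWorkGroup`, and the LDS size and wave limit passed in by the caller are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Work-items per wave, as in the "workgroup of 64 workitems" baseline.
constexpr unsigned WaveSize = 64;

// Number of waves a single workgroup occupies (256 work-items -> 4 waves).
unsigned wavesPerWorkGroup(unsigned WorkGroupSize) {
  assert(WorkGroupSize > 0 && "workgroup must contain work-items");
  return (WorkGroupSize + WaveSize - 1) / WaveSize;
}

// Hypothetical LDS-limited occupancy. LDS use is counted per workgroup, so
// the number of workgroups that fit is scaled by the waves in each one.
unsigned occupancyWithLDS(unsigned LDSBytesPerGroup, unsigned WorkGroupSize,
                          unsigned TotalLDSBytes, unsigned MaxWaves) {
  if (LDSBytesPerGroup == 0)
    return MaxWaves; // no LDS use, so LDS does not limit occupancy
  unsigned GroupsThatFit = TotalLDSBytes / LDSBytesPerGroup;
  unsigned Waves = GroupsThatFit * wavesPerWorkGroup(WorkGroupSize);
  return std::min(Waves, MaxWaves);
}
```

Assuming 64 KiB of LDS and a limit of 10 waves, a workgroup using 32 KiB yields an occupancy of 2 at 64 work-items but 8 at 256 work-items, the factor of 4 from the message.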
*	[AMDGPU] Fix typo in GCNSchedStrategy	Valery Pykhtin	2017-01-26	1	-1/+1
	Differential revision: https://reviews.llvm.org/D28980 llvm-svn: 293171
*	AMDGPU/SI: Allow using SGPRs 96-101 on VI	Marek Olsak	2016-12-09	1	-1/+1
	Summary: There is no point in setting SGPRS=104, because VI allocates SGPRs in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs for general purposes. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27149 llvm-svn: 289260
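The rounding argument in this summary comes down to aligning a request up to the allocation granule. A minimal sketch, assuming the granule of 16 stated above; `allocatedSGPRs` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>

// Round a requested SGPR count up to the next multiple of the allocation
// granule. The default of 16 is the VI granule quoted in the summary.
unsigned allocatedSGPRs(unsigned Requested, unsigned Granule = 16) {
  assert(Granule != 0 && "granule must be positive");
  return (Requested + Granule - 1) / Granule * Granule;
}
```

Both `allocatedSGPRs(102)` and `allocatedSGPRs(104)` land in the same 112-register allocation, which is why capping the request at 104 buys nothing.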
*	AMDGPU: Whitespace fixes	Matt Arsenault	2016-11-01	1	-2/+2
	llvm-svn: 285659
*	[AMDGPU] Wave and register controls	Konstantin Zhuravlyov	2016-09-06	1	-2/+2
	- Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747
*	AMDGPU/SI: Implement a custom MachineSchedStrategy	Tom Stellard	2016-08-29	1	-0/+312
	Summary: GCNSchedStrategy re-uses most of GenericScheduler; it just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default; to enable it you need to pass -misched=gcn to llc.
	Shader DB stats: 32464 shaders in 17874 tests
	Totals:
	SGPRS: 1542846 -> 1643125 (6.50 %)
	VGPRS: 1005595 -> 904653 (-10.04 %)
	Spilled SGPRs: 29929 -> 27745 (-7.30 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 36688188 -> 37034900 (0.95 %) bytes
	LDS: 1913 -> 1913 (0.00 %) blocks
	Max Waves: 254101 -> 265125 (4.34 %)
	Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	SGPRS: 1338220 -> 1438499 (7.49 %)
	VGPRS: 886221 -> 785279 (-11.39 %)
	Spilled SGPRs: 29869 -> 27685 (-7.31 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 34315716 -> 34662428 (1.01 %) bytes
	LDS: 1551 -> 1551 (0.00 %) blocks
	Max Waves: 188127 -> 199151 (5.86 %)
	Wait states: 0 -> 0 (0.00 %)
	Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23688 llvm-svn: 279995