bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Revert "[AMDGPU] Invert the handling of skip insertion."	Nicolai Hähnle	2020-02-03	1	-19/+30
	This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa. (cherry picked from commit a80291ce10ba9667352adcc895f9668144f5f616)
*	[AMDGPU] Invert the handling of skip insertion.	cdevadas	2020-01-15	1	-30/+19
	The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a side effect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip pass. Differential revision: https://reviews.llvm.org/D68092
*	Revert [MBP] Disable aggressive loop rotate in plain mode	Jordan Rupprecht	2019-08-29	1	-4/+5
	This reverts r369664 (git commit 51f48295cbe8fa3a44db263b528dd9f7bae7bf9a) It causes many benchmark regressions, internally and in llvm's benchmark suite. llvm-svn: 370398
*	[MBP] Disable aggressive loop rotate in plain mode	Guozhi Wei	2019-08-22	1	-5/+4
	Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. If the user program doesn't behave as BranchProbabilityInfo.cpp expects, the layout may be worse. To be conservative, this patch restores the original layout algorithm in plain mode. But the user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664
*	Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode"	Hans Wennborg	2019-08-12	1	-4/+5
	It caused assertions to fire when building Chromium: lib/CodeGen/LiveDebugValues.cpp:331: bool {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed. See https://crbug.com/992871#c3 for how to reproduce. > Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. > > To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. > > Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368579
*	[MBP] Disable aggressive loop rotate in plain mode	Guozhi Wei	2019-08-08	1	-5/+4
	Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. If the user program doesn't behave as BranchProbabilityInfo.cpp expects, the layout may be worse. To be conservative, this patch restores the original layout algorithm in plain mode. But the user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368339
*	[MBP] Move a latch block with conditional exit and multi predecessors to top of loop	Guozhi Wei	2019-06-14	1	-4/+5
	Current findBestLoopTop can find and move one kind of block to the top: a latch block that has one successor. Another common case is a latch block that has two successors (one is the loop header, the other is an exit) and more than one predecessor. If it is below one of its predecessors P, only P can fall through to it; all other predecessors need a jump to it, and another conditional jump to the loop header. If it is moved before the loop header, all its predecessors jump to it, then fall through to the loop header. So all its predecessors except P can reduce one taken branch. Differential Revision: https://reviews.llvm.org/D43256 llvm-svn: 363471
*	AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructions	Rhys Perry	2019-04-17	1	-0/+6
	Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches. Reviewers: arsen, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60824 llvm-svn: 358592
*	AMDGPU: Add additional MIR tests for exec mask optimizations	Matt Arsenault	2019-03-27	1	-7/+12
	Also includes one example of how this transform is unsound. This isn't verifying the copies are used in the control flow intrinsic patterns. Also add option to disable exec mask opt pass. Since this pass is unsound, it may be useful to turn it off until it is fixed. llvm-svn: 357091
*	AMDGPU: Make collapse-endcf test more useful	Matt Arsenault	2019-03-25	1	-6/+20
	Without a VALU instruction in the return block, these were mostly testing the path to delete exec mask code before s_endpgm rather than the end cf handling. llvm-svn: 356955
*	[AMDGPU] Enable LICM in the BE pipeline	Stanislav Mekhanoshin	2018-06-29	1	-1/+1
	This allows hoisting the code that computes the reciprocal of a loop-invariant denominator in integer division after codegen prepare expansion. Differential Revision: https://reviews.llvm.org/D48604 llvm-svn: 335988
*	[AMDGPU] Switch to the new addr space mapping by default	Yaxun Liu	2018-02-02	1	-2/+2
	This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101
*	AMDGPU: Recompute scc liveness	Matt Arsenault	2017-09-08	1	-0/+60
	The various scalar bit operations set SCC, so if one is erased or moved, it needs to be recomputed. Not sure why the existing tests don't fail on this. llvm-svn: 312819
*	[AMDGPU] Eliminate no effect instructions before s_endpgm	Stanislav Mekhanoshin	2017-08-16	1	-7/+28
	Differential Revision: https://reviews.llvm.org/D36585 llvm-svn: 310987
*	AMDGPU: Cleanup subtarget features	Matt Arsenault	2017-08-07	1	-1/+1
	Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are incompatible between feature strings. Most of the test changes are due to random scheduling changes from not having a default fullspeed model. llvm-svn: 310258
*	[AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unused	Stanislav Mekhanoshin	2017-08-01	1	-1/+1
	With SI_END_CF elimination for some nested control flow we can now eliminate saved exec register completely by turning a saveexec version of instruction into just a logical instruction. Differential Revision: https://reviews.llvm.org/D36007 llvm-svn: 309766
*	[AMDGPU] Collapse adjacent SI_END_CF	Stanislav Mekhanoshin	2017-08-01	1	-0/+188
	Add a pass to remove redundant S_OR_B64 instructions enabling lanes in the exec. If two SI_END_CF (lowered as S_OR_B64) come together without any vector instructions between them, we can keep only the outer SI_END_CF, given that the CFG is structured and the exec bits of the outer end statement are always not less than the exec bits of the inner one. This needs to be done before the RA to eliminate saved exec bits registers, but after the register coalescer so that there are no vector register copies between different end cf statements. Differential Revision: https://reviews.llvm.org/D35967 llvm-svn: 309762
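
The rule described in the last entry above (drop an inner SI_END_CF when an outer one follows with no vector instruction in between) can be sketched on a toy instruction list. This is only an illustration under stated assumptions, not the LLVM pass: `Inst`, `is_vector`, and `collapse_adjacent_end_cf` are hypothetical names, and the real pass works on machine instructions and has to check register dependencies that are ignored here.

```python
from typing import List, NamedTuple


class Inst(NamedTuple):
    """Toy instruction (hypothetical model, not an LLVM type)."""
    opcode: str
    is_vector: bool = False


END_CF = "SI_END_CF"


def collapse_adjacent_end_cf(block: List[Inst]) -> List[Inst]:
    """Drop every SI_END_CF that is followed by another SI_END_CF
    before any vector instruction is reached.

    The earlier SI_END_CF is the inner one; the later, outer one
    re-enables a superset of its lanes, so the inner one is redundant
    as long as no vector instruction runs between the two.
    """
    keep = [True] * len(block)
    for i, inst in enumerate(block):
        if inst.opcode != END_CF:
            continue
        for later in block[i + 1:]:
            if later.is_vector:
                break  # a vector instruction needs the inner exec restore
            if later.opcode == END_CF:
                keep[i] = False  # the outer end cf covers the inner one
                break
    return [inst for inst, k in zip(block, keep) if k]


if __name__ == "__main__":
    block = [
        Inst("v_add_f32", is_vector=True),
        Inst(END_CF),            # inner end cf
        Inst("s_mov_b32"),       # scalar only, does not depend on exec
        Inst(END_CF),            # outer end cf
        Inst("v_mul_f32", is_vector=True),
    ]
    for inst in collapse_adjacent_end_cf(block):
        print(inst.opcode)
```

In this toy model only the outer SI_END_CF survives in the example block; placing a vector instruction between the two end statements would keep both.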