bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Use bitset for assembler predicates	Stanislav Mekhanoshin	2019-03-11	3	-8/+11
	The AMDGPU target has run out of subtarget feature flags, hitting the limit of 64. AssemblerPredicates use at most uint64_t for their representation. At the same time, CodeGen exhausted this a long time ago and switched to a FeatureBitset with the current limit of 192 bits. This patch completes the transition to the bitset for feature bits, extending it to the asm matcher and MC code emitter. Differential Revision: https://reviews.llvm.org/D59002 llvm-svn: 355839
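	As a rough illustration of the limit described above, the following is a minimal sketch and not the LLVM implementation: it uses `std::bitset` as a stand-in for a feature bitset, and the names `kMaxFeatures`, `FeatureSet`, `hasAllFeatures`, and `fitsInUint64Mask` are hypothetical. It shows why a feature index of 64 or higher cannot be represented in a single `uint64_t` mask, while a wider bitset handles it directly.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical width for illustration; the commit message cites a 192-bit
// limit for FeatureBitset at the time of the patch.
constexpr std::size_t kMaxFeatures = 192;
using FeatureSet = std::bitset<kMaxFeatures>;

// Returns true when every required feature bit is also set in `available`.
inline bool hasAllFeatures(const FeatureSet &available,
                           const FeatureSet &required) {
  return (available & required) == required;
}

// A single uint64_t mask can only encode feature indices 0..63.
inline bool fitsInUint64Mask(std::size_t featureIndex) {
  return featureIndex < 64;
}
```

	With a bitset, a predicate that needs feature index 100 is checked the same way as one that needs index 3; with a 64-bit mask, index 100 has no bit to occupy.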
*	[AMDGPU] Mark enum types in SIDefines.h as unsigned	Stanislav Mekhanoshin	2019-03-11	4	-21/+21
	MSVC issues some warnings about signed/unsigned comparison. Differential Revision: https://reviews.llvm.org/D59171 llvm-svn: 355836
*	AMDGPU: Move d16 load matching to preprocess step	Matt Arsenault	2019-03-08	10	-195/+348
	When matching half of the build_vector to a load, there could still be a hidden dependency on the other half of the build_vector the pattern wouldn't detect. If there was an additional chain dependency on the other value, a cycle could be introduced. I don't think a tablegen pattern is capable of matching the necessary conditions, so move this into PreprocessISelDAG. Check isPredecessorOf for the other value to avoid a cycle. This has a warning that it's expensive, so this should probably be moved into an MI pass eventually that will have more freedom to reorder instructions to help match this. That is currently complicated by the lack of a computeKnownBits type mechanism for the selected function. llvm-svn: 355731
*	DAG: Don't try to cluster loads with tied inputs	Matt Arsenault	2019-03-08	1	-45/+0
	This avoids breaking possible value dependencies when sorting loads by offset. AMDGPU has some load instructions that write into the high or low bits of the destination register, and have a tied input for the other input bits. These can easily have the same base pointer, but be a swizzle so the high address load needs to come first. This was inserting glue forcing the opposite ordering, producing a cycle the InstrEmitter would assert on. It may be potentially expensive to look for the dependency between the other loads, so just skip any where this could happen. Fixes bug 40936 by reverting r351379, which added a hacky attempt to fix this by adding chains in this case, which I think was just working around broken glue before the InstrEmitter. The core of the patch is re-implementing the fix for that problem. llvm-svn: 355728
*	AMDGPU: Don't bother checking the chain in areLoadsFromSameBasePtr	Matt Arsenault	2019-03-08	1	-15/+0
	This is only called in contexts that are verifying the chain itself, and the query itself is only asking about the address. llvm-svn: 355723
*	AMDGPU: Correct DS implementation of areLoadsFromSameBasePtr	Matt Arsenault	2019-03-08	1	-4/+4
	This was checking the wrong operands for the base register and the offsets. The indexes are shifted by the number of output registers from the machine instruction definition, and the chain is moved to the end. llvm-svn: 355722
*	[AMDGPU] V_CVT_F32_UBYTE{0,1,2,3} are full rate instructions	Carl Ritson	2019-03-08	1	-3/+4
	Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions. Reviewers: arsenm, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59091 llvm-svn: 355671
*	AMDHSA: Code object v3 updates	Konstantin Zhuravlyov	2019-03-07	1	-4/+13
	- Copy kernel symbol attributes into kernel descriptor attributes - Make sure kernel symbol's visibility is not "higher" than protected Differential Revision: https://reviews.llvm.org/D59057 llvm-svn: 355630
*	AMDGPU: Handle "uniform-work-group-size" attribute (fix for RADV)	Aakanksha Patil	2019-03-07	2	-6/+66
	A previous patch for "uniform-work-group-size" attribute was found to break some RADV and possibly radeon SI tests and had to be retracted. This patch fixes that. Differential Revision: http://reviews.llvm.org/D58993 llvm-svn: 355574
*	[AMDGPU] Add support for 64 bit buffer atomic arithmetic instructions	Ryan Taylor	2019-03-06	2	-23/+31
	Summary: This adds support for 64 bit buffer atomic arithmetic instructions but does not include cmpswap as that depends on a fix to the way the register pairs are handled. Change-Id: Ib207ea65fb69487ccad5066ea647ae8ddfe2ce61 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58918 llvm-svn: 355520
*	AMDGPU: Preserve undef flag when expanding SI_IF	Matt Arsenault	2019-03-05	1	-2/+2
	Fixes undefined value verifier error. llvm-svn: 355426
*	[AMDGPU] Fix DPP operand order in atomic optimizer	Carl Ritson	2019-03-05	1	-1/+1
	Summary: Ensure order of operands in DPP atomic optimizer final WWM step is appropriate for sub instructions. Change-Id: I631d050e1c00a3b4bc7c11a90437064403c4cf30 Reviewers: sheredom, tpr Reviewed By: sheredom Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58900 llvm-svn: 355394
*	[AMDGPU] Omit KILL instructions from hazard recognizer	David Stuttard	2019-03-05	1	-3/+2
	Summary: In some cases the KILL was causing a hazard to be introduced as these were scheduled into hazard slots, but don't result in an instruction. KILL shouldn't be considered for hazard recognition. Change-Id: Ib6d2a2160f8c94cd0ce611ab198c7e4f46aeffcf Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58898 llvm-svn: 355384
*	[AMDGPU] Implement AMDGPUMCInstrAnalysis	Scott Linder	2019-03-05	2	-0/+33
	Implement MCInstrAnalysis for AMDGPU, with default implementations save for `evaluateBranch`. Differential Revision: https://reviews.llvm.org/D58400 llvm-svn: 355373
*	[AMDGPU][MC] Enable lds_direct operand for v_readfirstlane_b32, v_readlane_b32 and v_writelane_b32	Dmitry Preobrazhensky	2019-03-04	7	-47/+110
	See bug 40662: https://bugs.llvm.org/show_bug.cgi?id=40662 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58713 llvm-svn: 355312
*	[AMDGPU] Mark ds instructions as maybeAtomic	Stanislav Mekhanoshin	2019-03-01	1	-0/+1
	These were not recognized as potential atomics by the memory legalizer. The test was passing not because the legalizer did the right thing, but because it skipped all these instructions. When I fixed the DS description, the test started to fail because the region address changed from 4 to 2 a while ago. Differential Revision: https://reviews.llvm.org/D58802 llvm-svn: 355179
*	AMDGPU/GlobalISel: Implement select for G_INSERT	Tom Stellard	2019-03-01	2	-0/+31
	Re-commit r344310. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 355159
*	AMDGPU/GlobalISel: Implement select for G_EXTRACT	Tom Stellard	2019-02-28	3	-0/+32
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49714 llvm-svn: 355156
*	AMDGPU: Fix typo	Matt Arsenault	2019-02-28	1	-1/+1
	llvm-svn: 355056
*	AMDGPU: Enable function calls by default	Matt Arsenault	2019-02-28	1	-4/+9
	Fixes some crashes on illegal call situations which are unfortunately still valid IR. llvm-svn: 355051
*	AMDGPU: Fix crashes in invalid call cases	Matt Arsenault	2019-02-28	2	-6/+15
	We have to at least tolerate calls to kernels, possibly with a mismatched calling convention on the callsite. llvm-svn: 355049
*	GlobalISel: Implement fewerElementsVector for phi	Matt Arsenault	2019-02-28	1	-0/+1
	llvm-svn: 355048
*	GlobalISel: Implement moreElementsVector for phi	Matt Arsenault	2019-02-28	1	-0/+1
	llvm-svn: 355047
*	[AMDGPU][MC] Added register size check for VOP3/SDWA/DPP operands	Dmitry Preobrazhensky	2019-02-27	2	-13/+17
	See bug 37943: https://bugs.llvm.org/show_bug.cgi?id=37943 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58287 llvm-svn: 354974
*	[AMDGPU][MC][GFX8+] Added syntactic sugar for 'vgpr index' operand of instructions s_set_gpr_idx_on and s_set_gpr_idx_mode	Dmitry Preobrazhensky	2019-02-27	7	-28/+149
	See bug 39331: https://bugs.llvm.org/show_bug.cgi?id=39331 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D58288 llvm-svn: 354969
*	[AMDGPU] Fixed hang during DAG combine	Stanislav Mekhanoshin	2019-02-26	1	-1/+2
	SITargetLowering::reassociateScalarOps() does not touch constants so that DAGCombiner::ReassociateOps() does not revert the combine. However a global address is not a ConstantSDNode. Switched to the method used by DAGCombiner::ReassociateOps() itself to detect constants. Differential Revision: https://reviews.llvm.org/D58695 llvm-svn: 354926
*	RegBankSelect: Handle slightly more complex value mappings	Matt Arsenault	2019-02-25	2	-8/+47
	Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828
*	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes	Matt Arsenault	2019-02-25	1	-0/+2
	llvm-svn: 354825
*	AMDGPU/GlobalISel: Clamp max implicit_def elements	Matt Arsenault	2019-02-25	1	-1/+2
	llvm-svn: 354818
*	AMDGPU: Remove IntrReadMem from memtime/memrealtime intrinsics	Matt Arsenault	2019-02-25	1	-2/+10
	EarlyCSE with MemorySSA was able to use this to merge multiple calls with no intervening store. llvm-svn: 354814
*	AMDGPU: Correct definitions for bitset instructions	Matt Arsenault	2019-02-25	2	-13/+21
	These really read and write the result register, so these need a tied input. llvm-svn: 354809
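	To make the read-and-write behavior concrete, here is a minimal sketch in plain C++, not taken from the patch. It assumes the 32-bit scalar bitset operations set or clear the single destination bit selected by the low five bits of the source and leave the other bits unchanged; the function names `bitset1_b32` and `bitset0_b32` are hypothetical models, not LLVM or hardware APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a "set one bit" operation. The result depends on the
// previous destination value, which is why the old value must be an input.
inline std::uint32_t bitset1_b32(std::uint32_t oldDst, std::uint32_t src) {
  return oldDst | (std::uint32_t{1} << (src & 31u));
}

// Hypothetical model of a "clear one bit" operation, with the same dependence
// on the previous destination value.
inline std::uint32_t bitset0_b32(std::uint32_t oldDst, std::uint32_t src) {
  return oldDst & ~(std::uint32_t{1} << (src & 31u));
}
```

	Because `oldDst` appears on both sides, an instruction definition that lists the destination only as an output would hide a real data dependency from the compiler.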
*	Revert "AMDGPU/NFC: Cleanup subtarget predicates"	Konstantin Zhuravlyov	2019-02-22	14	-137/+138
	It breaks one of our downstream merges, so revert it temporarily while investigating failures downstream. llvm-svn: 354700
*	AMDGPU: Use removeAllRegUnitsForPhysReg	Matt Arsenault	2019-02-22	2	-4/+3
	llvm-svn: 354686
*	AMDGPU: Remove debugger related subtarget features	Matt Arsenault	2019-02-21	17	-334/+13
	As far as I know these aren't needed anymore. llvm-svn: 354634
*	AMDGPU/NFC: Cleanup subtarget predicates	Konstantin Zhuravlyov	2019-02-21	14	-138/+137
	Differential Revision: https://reviews.llvm.org/D58522 llvm-svn: 354620
*	[AMDGPU] remove unused AssemblerPredicates	Mark Searles	2019-02-21	1	-5/+1
	An internal build is hitting asserts complaining about too many subtarget features: llvm/utils/TableGen/Types.cpp:42: const char* llvm::getMinimalTypeForEnumBitfield(uint64_t): Assertion `MaxIndex <= 64 && "Too many bits"' failed. llvm/utils/TableGen/AsmMatcherEmitter.cpp:1476: void {anonymous}::AsmMatcherInfo::buildInfo(): Assertion `SubtargetFeatures.size() <= 64 && "Too many subtarget features!"' failed. The short-term solution is to remove a few unused AssemblerPredicates to get under the limit. The long-term solution seems to be to revisit these asserts. E.g., rather than hardcoded '64', use the standard sized std::bitset like the other places that track subtarget features. Differential Revision: https://reviews.llvm.org/D58516 llvm-svn: 354604
*	AMDGPU/GlobalISel: Make phis legal	Matt Arsenault	2019-02-21	1	-0/+13
	llvm-svn: 354592
*	AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types	Matt Arsenault	2019-02-21	1	-1/+3
	llvm-svn: 354587
*	[AMDGPU] fix commuted case of sub combine	Stanislav Mekhanoshin	2019-02-21	1	-5/+1
	Differential Revision: https://reviews.llvm.org/D58481 llvm-svn: 354543
*	AMDGPU/GlobalISel: Move SMRD selection logic to TableGen	Tom Stellard	2019-02-20	4	-128/+136
	Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52922 llvm-svn: 354516
*	GlobalISel: Fix fewerElementsVector for ctlz with different result type	Matt Arsenault	2019-02-20	1	-2/+2
	Also complete the set of related operations. llvm-svn: 354480
*	GlobalISel: Implement moreElementsVector for g_insert results	Matt Arsenault	2019-02-20	1	-14/+24
	llvm-svn: 354477
*	GlobalISel: Implement moreElementsVector for select	Matt Arsenault	2019-02-19	1	-18/+9
	llvm-svn: 354354
*	GlobalISel: Implement moreElementsVector for G_EXTRACT source	Matt Arsenault	2019-02-19	1	-0/+1
	llvm-svn: 354348
*	GlobalISel: Implement moreElementsVector for bit ops	Matt Arsenault	2019-02-19	1	-0/+20
	llvm-svn: 354345
*	AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass.	Changpeng Fang	2019-02-18	2	-21/+8
	Summary: This is to fix a memory dependence bug in LoadStoreOptimizer. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58295 llvm-svn: 354295
*	GlobalISel: Implement widenScalar for g_extract scalar results	Matt Arsenault	2019-02-18	1	-2/+3
	llvm-svn: 354293
*	AMDGPU: Set ABI version to 1 for code object v3	Konstantin Zhuravlyov	2019-02-14	3	-10/+20
	Differential Revision: https://reviews.llvm.org/D57811 llvm-svn: 354085
*	GlobalISel: Add alignment to LegalityQuery MMOs	Matt Arsenault	2019-02-14	1	-9/+10
	This allows targets to specify the minimum alignment required for the load/store. llvm-svn: 354071
*	AMDGPU/GlobalISel: Fix RegBankSelect for GEP.	Matt Arsenault	2019-02-14	2	-32/+15
	This is basically a pointer typed add, so shouldn't be any different. This was assuming everything was an SGPR, which is not true. Also clean up legality for GEP. I don't seem to be seeing the problem mentioned in the comment about the hack marking s64 as a legal pointer type. llvm-svn: 354067