bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU/GlobalISel: Split 64-bit vector extracts during RegBankSelect	Matt Arsenault	2019-10-03	1	-6/+114
\| \| \| \| \| \| \| \|	Register indexing 64-bit elements is possible on the SALU, but not the VALU. Handle splitting this into two 32-bit indexes. Extend waterfall loop handling to allow moving a range of instructions. llvm-svn: 373638
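The split described in this commit message can be pictured with a small model. This is only a sketch, not the RegBankSelect code: it assumes a vector of 64-bit elements is stored as consecutive 32-bit words with the low half first, and the names `pack_u64_vector` and `extract_u64_split` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def pack_u64_vector(elements):
    """Store each 64-bit element as two 32-bit words, low half first (assumed layout)."""
    words = []
    for element in elements:
        words.append(element & MASK32)
        words.append((element >> 32) & MASK32)
    return words

def extract_u64_split(words32, index):
    """Extract 64-bit element `index` using two 32-bit indexed reads."""
    lo = words32[2 * index]      # first 32-bit index
    hi = words32[2 * index + 1]  # second 32-bit index
    return (hi << 32) | lo

vec = [0x1122334455667788, 0xAABBCCDDEEFF0011]
print(hex(extract_u64_split(pack_u64_vector(vec), 1)))
```

One 64-bit dynamic index becomes the two 32-bit indexes `2*i` and `2*i + 1`, and the two halves are then recombined.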
*	AMDGPU/GlobalISel: Allow VGPR to index SGPR register	Matt Arsenault	2019-10-03	1	-3/+2
\| \| \| \| \| \| \| \|	We can still do a waterfall loop over the index if using a VGPR to index an SGPR. The result will still be a VGPR, but we can avoid the wide copy of the source register to a VGPR. llvm-svn: 373637
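A waterfall loop like the one mentioned here can be simulated roughly as follows. This is a sketch under assumptions, not the backend's implementation: lanes are modelled as list positions, the per-lane (VGPR) index as a list, and the uniform (SGPR) source as a plain list. The function name `waterfall_extract` is hypothetical.

```python
def waterfall_extract(sgpr_vector, lane_indices):
    """Read sgpr_vector[lane_indices[lane]] for every lane, one unique index per iteration."""
    result = [None] * len(lane_indices)
    active = set(range(len(lane_indices)))  # models the set of lanes still to be handled
    iterations = 0
    while active:
        # Take the index held by the first active lane and treat it as uniform.
        first_lane = min(active)
        uniform_index = lane_indices[first_lane]
        # Every active lane holding the same index is served in this iteration.
        matching = {lane for lane in active if lane_indices[lane] == uniform_index}
        value = sgpr_vector[uniform_index]
        for lane in matching:
            result[lane] = value  # the result is per-lane, i.e. it lives in a VGPR
        active -= matching
        iterations += 1
    return result, iterations

values, trips = waterfall_extract([10, 20, 30, 40], [2, 0, 2, 3])
print(values, trips)
```

The loop runs once per distinct index among the active lanes, so the source vector never has to be copied per lane.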
*	AMDGPU/GlobalISel: Add some more tests for G_INSERT legalization	Matt Arsenault	2019-10-03	1	-0/+168
\| \| \| \|	llvm-svn: 373636
*	AMDGPU/GlobalISel: Fix mutationIsSane assert v8s8 and	Matt Arsenault	2019-10-03	1	-0/+166
\| \| \| \| \| \|	This would try to do FewerElements to v9s8 llvm-svn: 373635
*	AMDGPU/GlobalISel: Expand G_BITCAST legality	Matt Arsenault	2019-10-03	1	-0/+102
\| \| \| \|	llvm-svn: 373567
*	[AMDGPU] Extend buffer intrinsics with swizzling	Piotr Sobczak	2019-10-02	5	-131/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend cachepolicy operand in the new VMEM buffer intrinsics to supply information whether the buffer data is swizzled. Also, propagate this information to MIR. Intrinsics updated: int_amdgcn_raw_buffer_load int_amdgcn_raw_buffer_load_format int_amdgcn_raw_buffer_store int_amdgcn_raw_buffer_store_format int_amdgcn_raw_tbuffer_load int_amdgcn_raw_tbuffer_store int_amdgcn_struct_buffer_load int_amdgcn_struct_buffer_load_format int_amdgcn_struct_buffer_store int_amdgcn_struct_buffer_store_format int_amdgcn_struct_tbuffer_load int_amdgcn_struct_tbuffer_store Furthermore, disable merging of VMEM buffer instructions in SI Load/Store optimizer, if the "swizzled" bit on the instruction is on. The default value of the bit is 0, meaning that data in buffer is linear and buffer instructions can be merged. There is no difference in the generated code with this commit. However, in the future it will be expected that front-ends use buffer intrinsics with correct "swizzled" bit set. Reviewers: arsenm, nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68200 llvm-svn: 373491
*	AMDGPU/GlobalISel: Assume VGPR for G_FRAME_INDEX	Matt Arsenault	2019-10-02	1	-1/+1
\| \| \| \| \| \| \| \| \|	In principle this should behave as any other constant. However eliminateFrameIndex currently assumes a VALU use and uses a vector shift. Work around this by selecting to VGPR for now until eliminateFrameIndex is fixed. llvm-svn: 373415
*	AMDGPU/GlobalISel: Private loads always use VGPRs	Matt Arsenault	2019-10-02	1	-0/+17
\| \| \| \|	llvm-svn: 373414
*	AMDGPU/GlobalISel: Legalize 1024-bit G_BUILD_VECTOR	Matt Arsenault	2019-10-02	2	-40/+155
\| \| \| \| \| \|	This will be needed to support AGPR operations. llvm-svn: 373413
*	AMDGPU/GlobalISel: Fix RegBankSelect for 1024-bit values	Matt Arsenault	2019-10-02	1	-0/+28
\| \| \| \|	llvm-svn: 373412
*	AMDGPU/GlobalISel: Increase max legal size to 1024	Matt Arsenault	2019-10-01	8	-84/+440
\| \| \| \| \| \| \| \|	There are 1024 bit register classes defined for AGPRs. Additionally OpenCL defines vectors up to 16 x i64, and this helps those tests legalize. llvm-svn: 373350
*	Revert "GlobalISel: Handle llvm.read_register"	Dmitri Gribenko	2019-10-01	1	-2/+0
\| \| \| \| \| \| \| \|	This reverts commit r373294. It broke Clang's CodeGen/arm64-microsoft-status-reg.cpp: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483 llvm-svn: 373310
*	AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFP	Matt Arsenault	2019-10-01	2	-44/+515
\| \| \| \|	llvm-svn: 373298
*	AMDGPU/GlobalISel: Add support for init.exec intrinsics	Matt Arsenault	2019-10-01	2	-0/+4
\| \| \| \| \| \| \|	The existing wave32 behavior seems broken and incomplete, but this reproduces it. llvm-svn: 373296
*	GlobalISel: Handle llvm.read_register	Matt Arsenault	2019-10-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	SelectionDAG has a bunch of machinery to defer this to selection time for some reason. Just directly emit a copy during IRTranslator. The x86 usage does somewhat questionably check hasFP, which could depend on the whole function being at minimum translated. This does lose the convergent bit if the callsite had it, which may be a problem. We also lose that in general for intrinsics, which may also be a problem. llvm-svn: 373294
*	AMDGPU/GlobalISel: Avoid creating shift of 0 in arg lowering	Matt Arsenault	2019-10-01	1	-1/+1
\| \| \| \| \| \| \| \|	This is sort of papering over the fact that we don't run a combiner anywhere, but avoiding creating 2 instructions in the first place is easy. llvm-svn: 373293
*	AMDGPU/GlobalISel: Select G_UADDO/G_USUBO	Matt Arsenault	2019-10-01	2	-0/+394
\| \| \| \|	llvm-svn: 373288
*	GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources	Matt Arsenault	2019-10-01	2	-46/+232
\| \| \| \| \| \|	Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287
*	AMDGPU/GlobalISel: Legalize G_GLOBAL_VALUE	Matt Arsenault	2019-10-01	1	-0/+156
\| \| \| \| \| \| \|	Handle other cases besides LDS. Mostly a straight port of the existing handling, without the intermediate custom nodes. llvm-svn: 373286
*	AMDGPU/GlobalISel: Fix select for v2s16 and/or/xor	Matt Arsenault	2019-09-30	3	-45/+45
\| \| \| \|	llvm-svn: 373180
*	AMDGPU/GlobalISel: Allow selection of scalar min/max	Matt Arsenault	2019-09-21	4	-40/+20
\| \| \| \| \| \| \| \| \|	I believe all of the uniform/divergent pattern predicates are redundant and can be removed. The uniformity bit already influences the register class, and nothing has broken when I've removed this and others. llvm-svn: 372450
*	Revert r372366 "Use getTargetConstant for BLENDI, and add a test to catch it."	Nico Weber	2019-09-20	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 52621307bcab2013e8833f3317cebd63a6db3885. Tests have been failing all night with [0/2] ACTION //llvm/test:check-llvm(//llvm/utils/gn/build/toolchain:unix) -- Testing: 33647 tests, 64 threads -- Testing: 0 .. 10.. UNRESOLVED: LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll (6943 of 33647) ****************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll' FAILED **************** Test has no run line! ****************** Since there were other concerns on https://reviews.llvm.org/D67785, I'm just reverting for now. llvm-svn: 372383
*	Use getTargetConstant for BLENDI, and add a test to catch it.	Sterling Augustine	2019-09-20	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a crasher introduced by r372338. Reviewers: echristo, arsenm Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67785 Tighten up the test case. llvm-svn: 372366
*	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Matt Arsenault	2019-09-19	19	-91/+2794
\| \| \| \| \| \| \| \| \|	This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
*	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Hans Wennborg	2019-09-19	19	-2794/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to clean up target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, similar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372314
*	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle	Matt Arsenault	2019-09-19	1	-0/+21
\| \| \| \|	llvm-svn: 372297
*	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format	Matt Arsenault	2019-09-19	2	-0/+833
\| \| \| \| \| \| \| \| \|	This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors Also reject some illegal types on other targets. llvm-svn: 372293
*	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store	Matt Arsenault	2019-09-19	2	-168/+791
\| \| \| \|	llvm-svn: 372292
*	AMDGPU/GlobalISel: RegBankSelect struct buffer load/store	Matt Arsenault	2019-09-19	2	-0/+353
\| \| \| \|	llvm-svn: 372291
*	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load\|store}	Matt Arsenault	2019-09-19	2	-0/+341
\| \| \| \|	llvm-svn: 372290
*	AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics	Matt Arsenault	2019-09-19	2	-0/+449
\| \| \| \| \| \|	Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289
*	AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9	Matt Arsenault	2019-09-19	2	-36/+88
\| \| \| \| \| \|	The scalar versions were only introduced in gfx9. llvm-svn: 372286
*	GlobalISel: Don't materialize immarg arguments to intrinsics	Matt Arsenault	2019-09-19	8	-55/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Encode them directly as an imm argument to G_INTRINSIC. Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372285
*	[GlobalISel] Partially revert r371901.	Amara Emerson	2019-09-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050
*	AMDGPU/GlobalISel: Fix some broken run lines	Matt Arsenault	2019-09-16	4	-8/+8
\| \| \| \|	llvm-svn: 371992
*	AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL	Matt Arsenault	2019-09-16	2	-0/+62
\| \| \| \|	llvm-svn: 371991
*	AMDGPU/GlobalISel: Remove another illegal select test	Matt Arsenault	2019-09-16	1	-37/+0
\| \| \| \|	llvm-svn: 371990
*	AMDGPU/GlobalISel: Remove illegal select tests	Matt Arsenault	2019-09-16	1	-74/+0
\| \| \| \| \| \|	These fail in a release build. llvm-svn: 371955
*	AMDGPU/GlobalISel: Select SMRD loads for more types	Matt Arsenault	2019-09-16	1	-0/+1007
\| \| \| \|	llvm-svn: 371954
*	AMDGPU/GlobalISel: RegBankSelect for kill	Matt Arsenault	2019-09-16	1	-0/+68
\| \| \| \|	llvm-svn: 371953
*	AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP	Matt Arsenault	2019-09-16	2	-0/+72
\| \| \| \|	llvm-svn: 371952
*	AMDGPU/GlobalISel: Set type on vgpr live in special arguments	Matt Arsenault	2019-09-16	1	-0/+27
\| \| \| \| \| \| \|	Fixes assertion with workitem ID intrinsics used in non-kernel functions. llvm-svn: 371951
*	AMDGPU/GlobalISel: Select S16->S32 fptoint	Matt Arsenault	2019-09-16	2	-5/+210
\| \| \| \|	llvm-svn: 371950
*	AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP	Matt Arsenault	2019-09-16	4	-9/+143
\| \| \| \|	llvm-svn: 371949
*	AMDGPU/GlobalISel: Fix VALU s16 fneg	Matt Arsenault	2019-09-16	1	-11/+9
\| \| \| \|	llvm-svn: 371948
*	[GlobalISel] Fix insertion point of new instructions to be after PHIs.	Amara Emerson	2019-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which mean that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901
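The insertion-point problem described in this message can be shown with a toy block. This is a sketch only: a basic block is modelled as a list of opcode strings, and `first_non_phi` and `insert_after_phis` are hypothetical helpers, not the legalizer's API.

```python
def first_non_phi(block):
    """Return the position of the first instruction that is not a PHI."""
    for position, opcode in enumerate(block):
        if opcode != "PHI":
            return position
    return len(block)

def insert_after_phis(block, new_opcode):
    """Insert new_opcode so that all PHIs stay grouped at the top of the block."""
    position = first_non_phi(block)
    return block[:position] + [new_opcode] + block[position:]

block = ["PHI", "PHI", "ADD", "RET"]
print(insert_after_phis(block, "COPY"))
# Inserting one position earlier would place COPY between the two PHIs.
print(block[:1] + ["COPY"] + block[1:])
```

The second print shows the failure mode from the message: a non-PHI instruction ends up ahead of a PHI.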
*	AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFP	Matt Arsenault	2019-09-13	2	-0/+34
\| \| \| \|	llvm-svn: 371811
*	AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.else	Matt Arsenault	2019-09-13	2	-0/+35
\| \| \| \|	llvm-svn: 371808
*	AMDGPU/GlobalISel: Select 16-bit VALU bit ops	Matt Arsenault	2019-09-13	3	-18/+15
\| \| \| \|	llvm-svn: 371807
*	AMDGPU/GlobalISel: Legalize G_FFLOOR	Matt Arsenault	2019-09-13	3	-0/+639
\| \| \| \|	llvm-svn: 371803