bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	AMDGPU/GlobalISel: Set insert point after waterfall loop	Matt Arsenault	2020-01-13	1	-2/+3
\| \| \| \| \| \| \| \| \|	The current users of the waterfall loop utility functions do not make use of the restored original insert point. The insertion is either done, or they set the insert point somewhere else. A future change will want to insert instructions after the waterfall loop, but figuring out the point after the loop is more difficult than ensuring the insert point is there after the loop.
*	AMDGPU/GlobalISel: Simplify assert	Matt Arsenault	2020-01-13	1	-11/+3
*	AMDGPU/GlobalISel: Copy type when inserting readfirstlane	Matt Arsenault	2020-01-12	1	-0/+2
\| \| \| \| \|	getDefIgnoringCopies will fail to find any def if no type is set if we try to use it on the use's operand, so propagate the type.
*	AMDGPU/GlobalISel: Fix G_EXTRACT_VECTOR_ELT mapping for s-v case	Matt Arsenault	2020-01-09	1	-15/+85
\| \| \| \| \| \|	If an SGPR vector is indexed with a VGPR, the actual indexing will be done on the SGPR and produce an SGPR. A copy needs to be inserted inside the waterfall loop to the VGPR result.
*	AMDGPU/GlobalISel: Fix unused variable warning in release	Matt Arsenault	2020-01-06	1	-2/+1
*	AMDGPU/GlobalISel: Legalize G_READCYCLECOUNTER	Matt Arsenault	2020-01-06	1	-1/+2
*	AMDGPU/GlobalISel: Replace handling of boolean values	Matt Arsenault	2020-01-06	1	-93/+293
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This solves selection failures with generated selection patterns, which would fail due to inferring the SGPR reg bank for virtual registers with a set register class instead of VCC bank. Use instruction selection would constrain the virtual register to a specific class, so when the def was selected later the bank no longer was set to VCC. Remove the SCC reg bank. SCC isn't directly addressable, so it requires copying from SCC to an allocatable 32-bit register during selection, so these might as well be treated as 32-bit SGPR values. Now any scalar boolean value that will produce an output in SCC should be widened during RegBankSelect to s32. Any s1 value should be a vector boolean during selection. This makes the vcc register bank unambiguous with a normal SGPR during selection. Summary of how this should now work: - G_TRUNC is always a no-op, and never should use a vcc bank result. - SALU boolean operations should be promoted to s32 in RegBankSelect apply mapping - An s1 value means vcc bank at selection. The exception is for legalization artifacts that use s1, which are never VCC. All other contexts should infer the VCC register classes for s1 typed registers. The LLT for the register is now needed to infer the correct register class. Extensions with vcc sources should be legalized to a select of constants during RegBankSelect. - Copy from non-vcc to vcc ensures high bits of the input value are cleared during selection. - SALU boolean inputs should ensure the inputs are 0/1. This includes select, conditional branches, and carry-ins. There are a few somewhat dirty details. One is that G_TRUNC/G_*EXT selection ignores the usual register-bank from register class functions, and can't handle truncates with VCC result banks. 
I think this is OK, since the artifacts are specially treated anyway. This does require some care to avoid producing cases with vcc. There will also be no 100% reliable way to verify this rule is followed in selection in case of register classes, and violations manifest themselves as invalid copy instructions much later. Standard phi handling also only considers the bank of the result register, and doesn't insert copies to make the source banks match. This doesn't work for vcc, so we have to manually correct phi inputs in this case. We should add a verifier check to make sure there are no phis with mixed vcc and non-vcc register bank inputs. There's also some duplication with the LegalizerHelper, and some code which should live in the helper. I don't see a good way to share special knowledge about what types to use for intermediate operations depending on the bank for example. Using the helper to replace extensions with selects also seems somewhat awkward to me. Another issue is there are some contexts calling getRegBankFromRegClass that apparently don't have the LLT type for the register, but I haven't yet run into a real issue from this. This also introduces new unnecessary instructions in most cases, since we don't yet try to optimize out the zext when the source is known to come from a compare.
*	GlobalISel: Implement lower for G_INTRINSIC_ROUND	Matt Arsenault	2020-01-06	1	-1/+0
\| \| \| \| \|	Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.
*	AMDGPU/GlobalISel: Refine SMRD selection rules	Matt Arsenault	2020-01-04	1	-4/+22
\| \| \| \| \|	Fix selecting these for volatile global loads, and ensure the loads are constant enough.
*	AMDGPU/GlobalISel: Assume vcc phis for any vcc input	Matt Arsenault	2020-01-04	1	-2/+3
\| \| \| \| \| \| \|	This produces more intelligible-looking results, more comparable to the DAG output in the simplest cases. This is probably wrong in complex control flow, but RegBankSelect doesn't yet attempt to analyze whether this is on a masked path when selecting the bank.
*	AMDGPU/GlobalISel: Implement applyMappingImpl less incorrectly	Matt Arsenault	2020-01-04	1	-13/+23
\| \| \| \| \| \| \| \| \| \| \|	We're checking the current register bank of the registers in the instruction, but the mapping may have inserted cross bank copies and is expecting to replace the registers. We mostly get away with this currently, because VGPR->SGPR copies are illegal, and we assume this won't happen. In a future change, we'll start relying on more cross register bank copies being inserted, and this starts to break down.
*	GlobalISel: Add type argument to getRegBankFromRegClass	Matt Arsenault	2020-01-03	1	-2/+3
\| \| \| \| \| \|	AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.
*	AMDGPU/GlobalISel: Account for G_PHI result bank	Matt Arsenault	2019-12-30	1	-13/+23
\| \| \| \| \| \| \| \| \|	Sometimes the result bank of the phi is already assigned to something, and should not be ignored. This is in preparation for additional boolean phi handling changes. Also refine the logic to fix some cases that were incorrectly deciding to use SGPRs.
*	AMDGPU/GlobalISel: Use SReg_32 for readfirstlane constraining	Matt Arsenault	2019-12-27	1	-1/+1
\| \| \| \| \|	This matches the DAG behavior where we don't use SReg_32_XM0 everywhere anymore, and fixes not coalescing the copies into m0.
*	AMDGPU/GlobalISel: Fix mapping and selection of llvm.amdgcn.div.fixup	Matt Arsenault	2019-12-24	1	-0/+1
*	AMDGPU/GlobalISel: Simplify code	Matt Arsenault	2019-12-21	1	-5/+5
\| \| \| \| \|	This can directly access the register bank, and doesn't need to get it through the ID.
*	AMDGPU/GlobalISel: Add AGPR bank and RegBankSelect mfma intrinsics	Austin Kerbow	2019-12-01	1	-6/+63
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D70871
*	[globalisel] Rename G_GEP to G_PTR_ADD	Daniel Sanders	2019-11-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734
*	AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG	Matt Arsenault	2019-10-25	1	-1/+2
\| \| \| \| \| \| \| \|	Custom lower this to a target instruction with the merge operands. I think it might be better to directly select this and emit a REG_SEQUENCE, but this would be more work since it would require splitting the tablegen patterns for these cases from the other atomics.
*	GlobalISel: Implement lower for G_SADDO/G_SSUBO	Matt Arsenault	2019-10-16	1	-2/+0
\| \| \| \| \| \| \|	Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042
*	AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer	Matt Arsenault	2019-10-09	1	-4/+14
\| \| \| \| \| \| \| \| \| \|	This was ignoring the register bank of the input pointer, and isUniformMMO seems overly aggressive. This will now conservatively assume a VGPR in cases where the incoming bank hasn't been determined yet (i.e. is from a loop phi). llvm-svn: 374255
*	GlobalISel: Add target pre-isel instructions	Matt Arsenault	2019-10-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allows targets to introduce regbankselectable pseudo-instructions. Currently the closest feature to this is an intrinsic. However, this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937
*	Second attempt to add iterator_range::empty()	Jordan Rose	2019-10-07	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935
*	AMDGPU/GlobalISel: RegBankSelect mul24 intrinsics	Matt Arsenault	2019-10-06	1	-0/+2
\| \| \| \|	llvm-svn: 373841
*	AMDGPU/GlobalISel: RegBankSelect DS GWS intrinsics	Matt Arsenault	2019-10-06	1	-0/+35
\| \| \| \|	llvm-svn: 373840
*	AMDGPU/GlobalISel: Fix RegBankSelect for sendmsg intrinsics	Matt Arsenault	2019-10-06	1	-6/+5
\| \| \| \| \| \|	This wasn't updated for the immarg handling change. llvm-svn: 373837
*	[NFC] Add { } to silence compiler warning [-Wmissing-braces].	Huihui Zhang	2019-10-04	1	-1/+1
\| \| \| \| \| \| \| \| \|	../llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:355:48: warning: suggest braces around initialization of subobject [-Wmissing-braces] return addMappingFromTable<1>(MI, MRI, { 0 }, Table); ^ {} llvm-svn: 373784
*	AMDGPU/GlobalISel: Support wave32 waterfall loops	Matt Arsenault	2019-10-04	1	-22/+30
\| \| \| \|	llvm-svn: 373714
*	[NFC] Fix unused variable in release builds	Jordan Rupprecht	2019-10-03	1	-1/+2
\| \| \| \|	llvm-svn: 373646
*	AMDGPU/GlobalISel: Handle RegBankSelect of G_INSERT_VECTOR_ELT	Matt Arsenault	2019-10-03	1	-5/+77
\| \| \| \|	llvm-svn: 373639
*	AMDGPU/GlobalISel: Split 64-bit vector extracts during RegBankSelect	Matt Arsenault	2019-10-03	1	-167/+257
\| \| \| \| \| \| \| \|	Register indexing 64-bit elements is possible on the SALU, but not the VALU. Handle splitting this into two 32-bit indexes. Extend waterfall loop handling to allow moving a range of instructions. llvm-svn: 373638
*	AMDGPU/GlobalISel: Allow VGPR to index SGPR register	Matt Arsenault	2019-10-03	1	-4/+6
\| \| \| \| \| \| \| \|	We can still do a waterfall loop over the index if using a VGPR to index an SGPR. The result will still be a VGPR, but we can avoid the wide copy of the source register to a VGPR. llvm-svn: 373637
*	AMDGPU/GlobalISel: Don't re-get subtarget	Matt Arsenault	2019-10-03	1	-6/+3
\| \| \| \| \| \|	It's already available in the class. llvm-svn: 373568
*	AMDGPU/GlobalISel: Use getIntrinsicID helper	Matt Arsenault	2019-10-02	1	-5/+5
\| \| \| \|	llvm-svn: 373417
*	AMDGPU/GlobalISel: Assume VGPR for G_FRAME_INDEX	Matt Arsenault	2019-10-02	1	-1/+7
\| \| \| \| \| \| \| \| \|	In principle this should behave as any other constant. However eliminateFrameIndex currently assumes a VALU use and uses a vector shift. Work around this by selecting to VGPR for now until eliminateFrameIndex is fixed. llvm-svn: 373415
*	AMDGPU/GlobalISel: Private loads always use VGPRs	Matt Arsenault	2019-10-02	1	-4/+6
\| \| \| \|	llvm-svn: 373414
*	AMDGPU/GlobalISel: Add support for init.exec intrinsics	Matt Arsenault	2019-10-01	1	-1/+8
\| \| \| \| \| \| \|	The existing wave32 behavior seems broken and incomplete, but this reproduces it. llvm-svn: 373296
*	AMDGPU/GlobalISel: Allow scc/vcc alternative mappings for s1 constants	Matt Arsenault	2019-10-01	1	-1/+15
\| \| \| \|	llvm-svn: 373295
*	[NFC] Add { } to silence compiler warning [-Wmissing-braces].	Huihui Zhang	2019-09-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	/local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/lib/Object/MachOObjectFile.cpp:2731:7: warning: suggest braces around initialization of subobject [-Wmissing-braces] "i386", "x86_64", "x86_64h", "armv4t", "arm", "armv5e", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ { 1 warning generated. /local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:355:46: warning: suggest braces around initialization of subobject [-Wmissing-braces] return addMappingFromTable<1>(MI, MRI, { 0 }, Table); ^ {} 1 warning generated. /local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/tools/llvm-objcopy/ELF/Object.cpp:400:57: warning: suggest braces around initialization of subobject [-Wmissing-braces] static constexpr std::array<uint8_t, 4> ZlibGnuMagic = {'Z', 'L', 'I', 'B'}; ^~~~~~~~~~~~~~~~~~ { } 1 warning generated. llvm-svn: 372811
*	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Matt Arsenault	2019-09-19	1	-21/+378
\| \| \| \| \| \| \| \| \|	This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
*	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Hans Wennborg	2019-09-19	1	-378/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. 
For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to clean up target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, similar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372314
*	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle	Matt Arsenault	2019-09-19	1	-0/+1
\| \| \| \|	llvm-svn: 372297
*	AMDGPU/GlobalISel: RegBankSelect tbuffer load/store	Matt Arsenault	2019-09-19	1	-6/+14
\| \| \| \| \| \|	These have the same operand structure as the non-t buffer operations. llvm-svn: 372296
*	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store	Matt Arsenault	2019-09-19	1	-8/+206
\| \| \| \|	llvm-svn: 372292
*	AMDGPU/GlobalISel: RegBankSelect struct buffer load/store	Matt Arsenault	2019-09-19	1	-0/+22
\| \| \| \|	llvm-svn: 372291
*	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load\|store}	Matt Arsenault	2019-09-19	1	-1/+42
\| \| \| \|	llvm-svn: 372290
*	AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics	Matt Arsenault	2019-09-19	1	-5/+95
\| \| \| \| \| \|	Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289
*	AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9	Matt Arsenault	2019-09-19	1	-3/+7
\| \| \| \| \| \|	The scalar versions were only introduced in gfx9. llvm-svn: 372286
*	GlobalISel: Don't materialize immarg arguments to intrinsics	Matt Arsenault	2019-09-19	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Encode them directly as an imm argument to G_INTRINSIC. Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. 
The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372285
*	AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL	Matt Arsenault	2019-09-16	1	-0/+2
\| \| \| \|	llvm-svn: 371991
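
Several entries above refer to a waterfall loop, which is used when a per-lane (VGPR) value has to feed an operation that needs a wave-uniform (SGPR) operand. As a rough illustration only, the host-side C++ sketch below simulates the idea for the "VGPR indexes an SGPR vector" case: read the index from the first remaining lane, serve every lane that shares that index, drop those lanes, and repeat until none remain. The names `waterfallExtract`, `firstActiveLane`, `WaterfallResult` and the fixed `WaveSize` of 32 are hypothetical and are not taken from AMDGPURegisterBankInfo.cpp; indices are assumed to be in range, and the real lowering emits machine instructions rather than running a loop like this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for illustration: a wave32 target, so the exec mask fits in 32 bits.
constexpr std::size_t WaveSize = 32;

// Lowest set bit of the mask, standing in for "the first active lane".
// Precondition: mask != 0.
inline unsigned firstActiveLane(std::uint32_t mask) {
  unsigned lane = 0;
  while (((mask >> lane) & 1u) == 0)
    ++lane;
  return lane;
}

struct WaterfallResult {
  std::array<std::uint32_t, WaveSize> values{}; // per-lane (VGPR-like) result
  unsigned iterations = 0;                      // number of loop trips taken
};

// Simulates extracting sgprVector[vgprIndex[lane]] for every active lane.
// Each trip handles all lanes that share one index value, so the trip count
// equals the number of distinct indices among the active lanes.
WaterfallResult waterfallExtract(
    const std::vector<std::uint32_t> &sgprVector,
    const std::array<std::uint32_t, WaveSize> &vgprIndex,
    std::uint32_t exec) {
  WaterfallResult result;
  std::uint32_t remaining = exec;
  while (remaining != 0) {
    // Make the index uniform by taking it from the first remaining lane.
    const std::uint32_t uniformIndex = vgprIndex[firstActiveLane(remaining)];

    // Find every remaining lane that wants the same index.
    std::uint32_t matching = 0;
    for (unsigned lane = 0; lane < WaveSize; ++lane) {
      const bool active = ((remaining >> lane) & 1u) != 0;
      if (active && vgprIndex[lane] == uniformIndex)
        matching |= (1u << lane);
    }

    // The indexing itself is a scalar operation; the result is then copied
    // into the per-lane result inside the loop for the matching lanes only.
    const std::uint32_t element = sgprVector[uniformIndex];
    for (unsigned lane = 0; lane < WaveSize; ++lane)
      if ((matching >> lane) & 1u)
        result.values[lane] = element;

    remaining &= ~matching; // retire the lanes that were just served
    ++result.iterations;
  }
  return result;
}
```

In this model a fully uniform index costs a single trip, while fully divergent indices cost one trip per active lane; inactive lanes are never written.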