bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	AMDGPU/GlobalISel: Select G_EXTRACT_VECTOR_ELT	Matt Arsenault	2020-01-09	1	-7/+2
\| \| \| \| \|	Doesn't try to do the fold into the base register of an add of a constant in the index like the DAG path does.
*	CodeGen: Use LLT instead of EVT in getRegisterByName	Matt Arsenault	2020-01-09	1	-1/+1
\| \| \| \| \| \|	Only PPC seems to be using it, and only checks some simple cases and doesn't distinguish between FP. Just switch to using LLT to simplify use from GlobalISel.
*	AMDGPU: Fix not using v_cvt_f16_[iu]16	Matt Arsenault	2020-01-07	1	-2/+4
\| \| \| \| \|	We weren't treating i16->f16 casts as legal on targets with these instructions, and always using a pair of casts through i32.
*	AMDGPU: Select llvm.amdgcn.interp.p2.f16 directly	Matt Arsenault	2020-01-06	1	-16/+0
\| \| \| \|	This will enable automatic GlobalISel support in a future commit.
*	AMDGPU: Fix legalizing f16 fpow	Matt Arsenault	2020-01-06	1	-0/+1
\| \| \| \| \| \|	The existing test only covered one case for r600. The use of mul_legacy also looks suspicious to me, but leave it for now. The patterns are also not making use of source modifiers.
*	[amdgpu] Skip non-instruction values in CF user tracing.	Michael Liao	2020-01-03	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - CF users won't be non-instruction values. Skip them to save the compilation time. It's especially true when there are multiple functions in that module, where, says, a constant may be used in most functions. The current CF user tracing adds significant overhead. Reviewers: alex-t, rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72174
*	Move tail call disabling code to target independent code	Reid Kleckner	2020-01-03	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in d9699bc7bdf0362173fcd256690f61a4d47429c2 (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118
*	Remove unneeded extra variable realArgIdx. NFC.	Jay Foad	2020-01-02	1	-4/+3
\|
*	[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues ↵	Craig Topper	2019-12-30	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.
*	[AMDGPU] Don't create MachinePointerInfos with an UndefValue pointer	Jay Foad	2019-12-23	1	-9/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The only useful information the UndefValue conveys is the address space, which MachinePointerInfo can represent directly without referring to an IR value. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71838
*	Revert "AMDGPU: Try to commute sub of boolean ext"	Tim Renouf	2019-12-13	1	-26/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 69fcfb7d3597e0cdb5554b4e672e9032b411b167. As shown in the test I attached to this commit, the change I reverted causes a problem with "zext(cc1) - zext(cc2)". It commuted the operands to the sub and used different logic to select the addc/subc instruction: sub zext (setcc), x => addcarry 0, x, setcc sub sext (setcc), x => subcarry 0, x, setcc ... but that is bogus. I believe it is not possible to fold those commuted patterns into any form of addcarry or subcarry. It may have worked as intended before "AMDGPU: Change boolean content type to 0 or 1" because the setcc was considered to be -1 rather than 1. Differential Revision: https://reviews.llvm.org/D70978 Change-Id: If2139421aa6c935cbd1d925af58fe4a4aa9e8f43
*	[AMDGPU] Keep consistent check of legal addressing mode.	Michael Liao	2019-11-20	1	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Add test cases for GFX10, which has narrower offset range compared to GFX9. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70473
*	AMDGPU: Refactor treatment of denormal mode	Matt Arsenault	2019-11-19	1	-26/+41
\| \| \| \| \| \| \| \| \| \| \|	Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and be moved into a separate function attribute. This patch is still NFC. The denormal mode remains as a subtarget feature for now, but make the necessary changes to switch to using an attribute.
*	DAG: Add function context to isFMAFasterThanFMulAndFAdd	Matt Arsenault	2019-11-19	1	-2/+3
\| \| \| \| \| \| \| \|	AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
*	[AMDGPU] Lower llvm.amdgcn.s.buffer.load.v3[i\|f]32	Piotr Sobczak	2019-11-15	1	-6/+24
\| \| \| \| \| \| \| \| \| \|	Summary: Add lowering support for 32-bit vec3 variant of s.buffer.load intrinsic. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70118
*	AMDGPU: Change boolean content type to 0 or 1	Matt Arsenault	2019-11-15	1	-0/+7
\| \| \| \| \| \| \| \|	The usage of target boolean checks is overly inflexible, since sext and zext of a compare are equally cheap. The choice is arbitrary, but using 0/1 to some degree is the choice of lower resistance since that's what most targets use. This enables a few combines that don't bother to support ZeroOrNegativeOneBooleanContent.
*	AMDGPU: Try to commute sub of boolean ext	Matt Arsenault	2019-11-15	1	-3/+26
\| \| \| \|	Avoids another regression in a future patch.
*	AMDGPU: Extend add x, (ext setcc) combine to sub	Matt Arsenault	2019-11-13	1	-0/+22
\| \| \| \| \| \|	This is the same as the add case, but inverts the operation type. This avoids regressions in a future patch.
*	AMDGPU: Select global atomicrmw fadd	Matt Arsenault	2019-11-06	1	-5/+14
\| \| \| \|	This only works if there is no use of the return value.
*	DAG: Add DAG argument to isFPExtFoldable	Matt Arsenault	2019-10-31	1	-2/+2
\| \| \| \| \|	For AMDGPU this is dependent on the FP mode, which should eventually not be a property of the subtarget.
*	DAG: Add new control for ISD::FMAD formation	Matt Arsenault	2019-10-31	1	-0/+13
\| \| \| \| \| \| \| \| \|	For AMDGPU this depends on whether denormals are enabled in the default FP mode for the function. Currently this is treated as a subtarget feature, so FMAD is selectively legal based on that. I want to move this out of the subtarget features so this can be controlled with a denormal mode attribute. Additionally, this will allow folding based on a future ftz fast math flag.
*	AMDGPU: Simplify getAddressSpace calls	Matt Arsenault	2019-10-31	1	-5/+5
\| \| \| \| \|	These can be directly taken from the GlobalValue instead of going through the type.
*	AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG	Matt Arsenault	2019-10-25	1	-7/+0
\| \| \| \| \| \| \| \|	Custom lower this to a target instruction with the merge operands. I think it might be better to directly select this and emit a REG_SEQUENCE, but this would be more work since it would require splitting the tablegen patterns for these cases from the other atomics.
*	AMDGPU: Select basic interp directly from intrinsics	Matt Arsenault	2019-10-21	1	-31/+10
\| \| \| \|	llvm-svn: 375457
*	AMDGPU: Use CopyToReg for interp intrinsic lowering	Matt Arsenault	2019-10-21	1	-16/+17
\| \| \| \| \| \| \|	This doesn't use the default value, so doesn't benefit from the hack to help optimize it. llvm-svn: 375450
*	AMDGPU: Don't error on calls to null or undef	Matt Arsenault	2019-10-20	1	-0/+9
\| \| \| \| \| \|	Calls to constants should probably be generally handled. llvm-svn: 375356
*	Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each	Reid Kleckner	2019-10-19	1	-1/+3
\| \| \| \| \| \|	Now X86ISelLowering doesn't depend on many IR analyses. llvm-svn: 375320
*	AMDGPU: Relax 32-bit SGPR register class	Matt Arsenault	2019-10-18	1	-16/+16
\| \| \| \| \| \| \| \| \| \| \|	Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This will allow the register coalescer to do a better job eliminating copies to m0. For GlobalISel, as a terrible hack, use SGPR_32 for things that should use SCC until booleans are solved. llvm-svn: 375267
*	[Alignment][NFC] Use Align for TargetFrameLowering/Subtarget	Guillaume Chatelet	2019-10-17	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68993 llvm-svn: 375084
*	[AMDGPU] Come back patch for the 'Assign register class for cross block ↵	Alexander Timofeev	2019-10-14	1	-0/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	values according to the divergence.' Detailed description: After https://reviews.llvm.org/D59990 submit several issues were discovered. Changes in common code were preserved but AMDGPU specific part was reverted to keep the backend working correctly. Discovered issues were addressed in the following commits: https://reviews.llvm.org/D67662 https://reviews.llvm.org/D67101 https://reviews.llvm.org/D63953 https://reviews.llvm.org/D63731 This change brings back AMDGPU specific changes. Reviewed by: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D68635 llvm-svn: 374767
*	AMDGPU: Use SGPR_128 instead of SReg_128 for vregs	Matt Arsenault	2019-10-10	1	-6/+6
\| \| \| \| \| \| \| \| \|	SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds the additional non-allocatable TTMP registers. There's no point in allocating SReg_128 vregs. This shrinks the size of the classes regalloc needs to consider, which is usually good. llvm-svn: 374284
*	AMDGPU: Add offsets to MMO when lowering buffer intrinsics	Tom Stellard	2019-10-08	1	-9/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Without offsets on the MachineMemOperands (MMOs), MachineInstr::mayAlias() will return true for all reads and writes to the same resource descriptor. This leads to O(N^2) complexity in the MachineScheduler when analyzing dependencies of buffer loads and stores. It also limits the SILoadStoreOptimizer from merging more instructions. This patch reduces the compile time of one pathological compute shader from 12 seconds to 1 second. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65097 llvm-svn: 374087
*	[AMDGPU] Extend buffer intrinsics with swizzling	Piotr Sobczak	2019-10-02	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend cachepolicy operand in the new VMEM buffer intrinsics to supply information whether the buffer data is swizzled. Also, propagate this information to MIR. Intrinsics updated: int_amdgcn_raw_buffer_load int_amdgcn_raw_buffer_load_format int_amdgcn_raw_buffer_store int_amdgcn_raw_buffer_store_format int_amdgcn_raw_tbuffer_load int_amdgcn_raw_tbuffer_store int_amdgcn_struct_buffer_load int_amdgcn_struct_buffer_load_format int_amdgcn_struct_buffer_store int_amdgcn_struct_buffer_store_format int_amdgcn_struct_tbuffer_load int_amdgcn_struct_tbuffer_store Furthermore, disable merging of VMEM buffer instructions in SI Load/Store optimizer, if the "swizzled" bit on the instruction is on. The default value of the bit is 0, meaning that data in buffer is linear and buffer instructions can be merged. There is no difference in the generated code with this commit. However, in the future it will be expected that front-ends use buffer intrinsics with correct "swizzled" bit set. Reviewers: arsenm, nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68200 llvm-svn: 373491
*	TLI: Remove DAG argument from getRegisterByName	Matt Arsenault	2019-10-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	Replace with the MachineFunction. X86 is the only user, and only uses it for the function. This removes one obstacle from using this in GlobalISel. The other is the more tolerable EVT argument. The X86 use of the function seems questionable to me. It checks hasFP, before frame lowering. llvm-svn: 373292
*	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types	Guillaume Chatelet	2019-09-27	1	-3/+3
\| \| \| \|	llvm-svn: 373081
*	[TargetLowering] Make allowsMemoryAccess methode virtual.	Thomas Raoux	2019-09-26	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	Rename old function to explicitly show that it cares only about alignment. The new allowsMemoryAccess call the function related to alignment by default and can be overridden by target to inform whether the memory access is legal or not. Differential Revision: https://reviews.llvm.org/D67121 llvm-svn: 372935
*	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Matt Arsenault	2019-09-19	1	-52/+52
\| \| \| \| \| \| \| \| \|	This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
*	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"	Hans Wennborg	2019-09-19	1	-52/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
*	GlobalISel: Don't materialize immarg arguments to intrinsics	Matt Arsenault	2019-09-19	1	-52/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Encode them directly as an imm argument to G_INTRINSIC. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
*	[Alignment][NFC] Remove LogAlignment functions	Guillaume Chatelet	2019-09-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231
*	AMDGPU/GlobalISel: Set type on vgpr live in special arguments	Matt Arsenault	2019-09-16	1	-1/+2
\| \| \| \| \| \| \|	Fixes assertion with workitem ID intrinsics used in non-kernel functions. llvm-svn: 371951
*	AMDGPU/GlobalISel: First pass at attempting to legalize load/stores	Matt Arsenault	2019-09-10	1	-14/+24
\| \| \| \| \| \| \| \|	There's still a lot more to do, but this handles decomposing due to alignment. I've gotten it to the point where nothing crashes or infinite loops the legalizer. llvm-svn: 371533
*	[Alignment][NFC] Use llvm::Align for TargetLowering::getPrefLoopAlignment	Guillaume Chatelet	2019-09-10	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: wuzish, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67386 llvm-svn: 371511
*	AMDGPU: Remove pointless wrapper nodes for init.exec intrinsics	Matt Arsenault	2019-09-09	1	-8/+0
\| \| \| \|	llvm-svn: 371364
*	[LLVM][Alignment] Make functions using log of alignment explicit	Guillaume Chatelet	2019-09-05	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation, - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation, Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045
*	AMDGPU: Add intrinsics for address space identification	Matt Arsenault	2019-09-05	1	-0/+13
\| \| \| \| \| \| \|	The library currently uses ptrtoint and directly checks the queue ptr for this, which counts as a pointer capture. llvm-svn: 371009
*	Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in ↵	Jay Foad	2019-09-02	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SI_PC_ADD_REL_OFFSET is 0" Summary: D61491 caused us to use relocs when they're not strictly necessary, to refer to symbols in the text section. This is a pessimization and it's a problem for some loaders that don't support relocs yet. Reviewers: nhaehnle, arsenm, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65813 llvm-svn: 370667
*	AMDGPU: Fix crash from inconsistent register types for v3i16/v3f16	Matt Arsenault	2019-08-27	1	-3/+3
\| \| \| \| \| \| \|	This is something of a workaround since computeRegisterProperties seems to be doing the wrong thing. llvm-svn: 370086
*	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM	Daniel Sanders	2019-08-15	1	-31/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
*	MVT: Add v3i16/v3f16 vectors	Matt Arsenault	2019-08-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	AMDGPU has some buffer intrinsics which theoretically could use this. Some of the generated tables include the 3 and 4 element vector versions of these rounded to 64-bits, which is ambiguous. Add these to help the table disambiguate these. Assertion change is for the path odd sized vectors now take for R600. v3i16 is widened to v4i16, which then needs to be promoted to v4i32. llvm-svn: 369038