bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[AMDGPU] Initialize instruction itinerary from GCNSubtarget	Stanislav Mekhanoshin	2018-09-17	1	-0/+5
	I need to use it in the GCN codegen. Differential Revision: https://reviews.llvm.org/D52123 llvm-svn: 342400
*	[AMDGPU] Ensure trig range reduction only used for subtargets that require it	David Stuttard	2018-09-14	1	-0/+5
	Summary: GFX9 and above support sin/cos instructions with a greater range and thus don't require a fract instruction prior to invocation. Added a subtarget feature to reflect this and added code to take advantage of the expanded range on GFX9+. Also updated the tests to check correct behaviour.
	Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
	Differential Revision: https://reviews.llvm.org/D51933 Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0 llvm-svn: 342222
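The range reduction described in this commit can be sketched as follows. This is a minimal illustration, not the LLVM lowering code: `hwSin`, `fract`, `lowerSin` and the `NeedsRangeReduction` flag are hypothetical names, and the sketch assumes the hardware sine takes its input in revolutions (so the argument is first scaled by 1/(2 pi)).

```cpp
#include <cmath>

// Hypothetical stand-in for the hardware sine instruction. Assumption: the
// input is in revolutions (1.0 == one full turn), not radians.
inline float hwSin(float Turns) {
  constexpr float TwoPi = 6.28318530717958647692f;
  return std::sin(Turns * TwoPi);
}

// Fractional part, mirroring what a fract instruction computes.
inline float fract(float X) { return X - std::floor(X); }

// Hypothetical lowering helper. NeedsRangeReduction models the subtarget
// feature: older targets wrap the input into [0, 1) first, while targets
// with the expanded input range skip the fract step.
inline float lowerSin(float Radians, bool NeedsRangeReduction) {
  constexpr float InvTwoPi = 0.15915494309189533577f;
  float Turns = Radians * InvTwoPi;
  if (NeedsRangeReduction)
    Turns = fract(Turns);
  return hwSin(Turns);
}
```

For inputs inside the expanded range both paths should agree up to rounding, which is why the fract can be dropped on subtargets that do not require it.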
*	AMDGPU: Re-apply r341982 after fixing the layering issue	Konstantin Zhuravlyov	2018-09-12	1	-47/+78
	Move isa version determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). llvm-svn: 342069
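The idea of deriving the ISA version from the CPU string alone, independent of which features are toggled, can be sketched as below. This is an illustrative assumption, not the TargetParser implementation: `IsaVersion` and `parseIsaVersion` are hypothetical names, and only three-digit `gfxNNN` names such as those in this log are handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical version triple; not the real LLVM type.
struct IsaVersion {
  unsigned Major;
  unsigned Minor;
  unsigned Stepping;
};

// Hypothetical helper: derive the version purely from a "gfxNNN" CPU name.
// Feature strings (e.g. xnack on or off) are deliberately not consulted, so
// altering features cannot change the reported version.
inline bool parseIsaVersion(const std::string &CPU, IsaVersion &Out) {
  if (CPU.size() != 6 || CPU.compare(0, 3, "gfx") != 0)
    return false;
  for (std::size_t I = 3; I < 6; ++I)
    if (!std::isdigit(static_cast<unsigned char>(CPU[I])))
      return false;
  Out.Major = static_cast<unsigned>(CPU[3] - '0');
  Out.Minor = static_cast<unsigned>(CPU[4] - '0');
  Out.Stepping = static_cast<unsigned>(CPU[5] - '0');
  return true;
}
```

With this shape, "gfx902" yields 9.0.2 whether or not xnack is enabled, which is the behaviour the commit message asks for. Names that are not of the `gfxNNN` form are simply rejected in this sketch.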
*	Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."	Ilya Biryukov	2018-09-12	1	-78/+47
	This reverts commit r341982. The change introduced a layering violation. Reverting to unbreak our integrate. llvm-svn: 342023
*	AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination	Konstantin Zhuravlyov	2018-09-11	1	-47/+78
	into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). Differential Revision: https://reviews.llvm.org/D51890 llvm-svn: 341982
*	AMDGPU: Remove remnants of old address space mapping	Matt Arsenault	2018-08-31	1	-8/+0
	llvm-svn: 341165
*	[AMDGPU] Add support for a16 modifier for gfx9	Ryan Taylor	2018-08-28	1	-0/+5
	Summary: Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9.
	Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits
	Differential Revision: https://reviews.llvm.org/D50575 llvm-svn: 340831
*	AMDGPU: Address todo for handling 1/(2 pi)	Matt Arsenault	2018-08-15	1	-4/+5
	llvm-svn: 339814
*	AMDGPU: Add feature vi-insts	Matt Arsenault	2018-08-07	1	-0/+1
	This is necessary to add a VI specific builtin, __builtin_amdgcn_s_dcache_wb. We already have an overly specific feature for one of these builtins, for s_memrealtime. I'm not sure whether it's better to add more of those, or to get rid of that and merge it with vi-insts. Alternatively, maybe this logically goes with scalar-stores? llvm-svn: 339104
*	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"	Matt Arsenault	2018-07-20	1	-14/+14
	Reverts r337079 with fix for msan error. llvm-svn: 337535
*	Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"	Evgeniy Stepanov	2018-07-14	1	-14/+14
	This reverts commit r337021.
	WARNING: MemorySanitizer: use-of-uninitialized-value
	#0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7
	#1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3
	#2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3
	#3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18
	#4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23
	#5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5
	#6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const, llvm::raw_ostream&, char const) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5
	#7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3
	#8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26
	[...]
	Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv'
	#0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192
	llvm-svn: 337079
*	AMDGPU: Fix handling of alignment padding in DAG argument lowering	Matt Arsenault	2018-07-13	1	-14/+14
	This was completely broken if there was ever a struct argument, as this information is thrown away during the argument analysis. The offsets as passed in to LowerFormalArguments are not useful, as they partially depend on the legalized result register type, and they don't consider the alignment in the first place. Ignore the Ins array, and instead figure out from the raw IR type what we need to do. This seems to fix the padding computation if the DAG lowering is forced (and stops breaking arguments following padded arguments if the arguments were only partially lowered in the IR) llvm-svn: 337021
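Computing argument offsets from the raw types, as this commit describes, amounts to walking the arguments and aligning a running cursor before each one. The following is a minimal sketch under that reading; `ArgType`, `alignTo` and `computeArgOffsets` are hypothetical names for illustration and do not reproduce the actual lowering code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one argument's raw type: size and alignment
// in bytes.
struct ArgType {
  uint64_t Size;
  uint64_t Align;
};

// Round Value up to the next multiple of Align (Align must be a power of two).
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "expected power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// Compute each argument's byte offset, inserting alignment padding before an
// argument whenever the running cursor is not suitably aligned for it.
inline std::vector<uint64_t> computeArgOffsets(const std::vector<ArgType> &Args) {
  std::vector<uint64_t> Offsets;
  uint64_t Cursor = 0;
  for (const ArgType &A : Args) {
    Cursor = alignTo(Cursor, A.Align);
    Offsets.push_back(Cursor);
    Cursor += A.Size;
  }
  return Offsets;
}
```

Because each offset depends only on the sizes and alignments of the original types, an argument that follows a padded argument still lands at the right place.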
*	AMDGPU/SI: Initialize InstrInfo before TargetLoweringInfo in GCNSubtarget	Tom Stellard	2018-07-11	1	-1/+1
	SITargetLowering queries SIInstrInfo in its constructor, so SIInstrInfo must be initialized first. This fixes msan buildbot failures and was introduced by r336851. llvm-svn: 336861
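The ordering constraint in this commit follows from the C++ rule that non-static data members are constructed in declaration order, whatever order the constructor's initializer list uses. A minimal sketch with hypothetical stand-in classes (these are not the real LLVM types):

```cpp
// Hypothetical stand-in for an instruction-info class.
struct InstrInfo {
  int NumOpcodes = 42;
};

// Hypothetical stand-in for a lowering class that reads from InstrInfo while
// it is being constructed, as the commit says SITargetLowering does.
struct TargetLowering {
  int OpcodesSeen;
  explicit TargetLowering(const InstrInfo &II) : OpcodesSeen(II.NumOpcodes) {}
};

struct Subtarget {
  // Declaration order is construction order: II is fully built before TL's
  // constructor reads from it. Swapping these two declarations would make TL
  // read an uninitialized object.
  InstrInfo II;
  TargetLowering TL;

  Subtarget() : II(), TL(II) {}
};
```

Reading a member before it is constructed is the kind of bug that MemorySanitizer reports as a use of an uninitialized value, which matches the buildbot failures mentioned in the message.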
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	1	-78/+37
	Summary: This is a follow-up to r335942.
	- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
	- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
	- Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation.
	Reviewers: arsenm, jvesely
	Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
	Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	AMDGPU: Fix UBSan error caused by r335942	Tom Stellard	2018-07-06	1	-22/+18
	Summary: Fixes PR38071. Reviewers: arsenm, dstenb Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48979 llvm-svn: 336448
*	AMDGPU: Don't use struct type for argument layout	Matt Arsenault	2018-06-29	1	-1/+2
	This was introducing unnecessary padding after the explicit arguments, depending on the alignment of the total struct type. Also has the side effect of avoiding creating an extra GEP for the offset from the base kernel argument to the explicit kernel argument offset. llvm-svn: 335999
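The extra padding mentioned here comes from the way a struct's size is rounded up to its largest member alignment. The sketch below contrasts the two sizes; `KernArg`, `explicitArgEnd` and `structAllocSize` are hypothetical names, and the layout rules are the usual natural-alignment rules rather than anything taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one kernel argument: size and alignment in bytes.
struct KernArg {
  uint64_t Size;
  uint64_t Align;
};

// Round Value up to the next multiple of Align (Align must be a power of two).
inline uint64_t roundUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Byte offset just past the last explicit argument, with padding only where
// an individual argument needs it.
inline uint64_t explicitArgEnd(const std::vector<KernArg> &Args) {
  uint64_t Cursor = 0;
  for (const KernArg &A : Args)
    Cursor = roundUp(Cursor, A.Align) + A.Size;
  return Cursor;
}

// Size the same arguments would occupy as one struct: the end offset is
// additionally rounded up to the largest member alignment (tail padding).
inline uint64_t structAllocSize(const std::vector<KernArg> &Args) {
  uint64_t MaxAlign = 1;
  for (const KernArg &A : Args)
    MaxAlign = std::max(MaxAlign, A.Align);
  return roundUp(explicitArgEnd(Args), MaxAlign);
}
```

For an 8-byte argument followed by a 4-byte one, the explicit arguments end at byte 12 while the struct view reports 16; that 4-byte difference is the kind of padding the commit avoids.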
*	AMDGPU: Separate R600 and GCN TableGen files	Tom Stellard	2018-06-28	1	-238/+375
	Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
*	AMDGPU: Remove ability to reserve VGPRs for debugger	Konstantin Zhuravlyov	2018-06-21	1	-12/+1
	Differential Revision: https://reviews.llvm.org/D48234 llvm-svn: 335288
*	[AMDGPU][Waitcnt] Fix handling of flat instrs	Mark Searles	2018-06-04	1	-0/+4
	On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order. Differential Revision: https://reviews.llvm.org/D46616 llvm-svn: 333926
*	AMDGPU: Pass function directly instead of MachineFunction	Matt Arsenault	2018-05-29	1	-13/+12
	These functions just query the underlying IR function, so pass it directly. llvm-svn: 333442
*	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers	Tom Stellard	2018-05-22	1	-5/+1
	Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge, so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate TableGen-generated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930
*	AMDGPU: Add D16 instructions preserve unused bits feature	Konstantin Zhuravlyov	2018-05-04	1	-0/+5
	- Predicate D16 patterns on this new feature
	- Added this new feature to gfx900/2/4
	Differential Revision: https://reviews.llvm.org/D46366 llvm-svn: 331551
*	AMDGPU: Remove remnants of gfx901 (it was deprecated some time ago)	Konstantin Zhuravlyov	2018-05-01	1	-2/+1
	llvm-svn: 331298
*	Remove \brief commands from doxygen comments.	Adrian Prantl	2018-05-01	1	-2/+2
	We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by
	for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
	Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
*	AMDGPU: Add Vega12 and Vega20	Matt Arsenault	2018-04-30	1	-1/+14
	Changes by Matt Arsenault and Konstantin Zhuravlyov llvm-svn: 331215
*	[AMDGPU][Waitcnt] As of gfx7, VMEM operations do not increment the export counter and the input registers are available in the next instruction; update the waitcnt pass to take this into account.	Mark Searles	2018-04-26	1	-0/+4
	Differential Revision: https://reviews.llvm.org/D46067 llvm-svn: 330954
*	AMDGPU: enable 128-bit for local addr space under an option	Marek Olsak	2018-04-10	1	-2/+3
	Author: Samuel Pitoiset
	ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly).
	v2: - fix regressions in merge-stores.ll and multiple_tails.ll
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764
*	Revert "AMDGPU: enable 128-bit for local addr space under an option"	Alex Shlyapnikov	2018-04-09	1	-3/+2
	This reverts commit r329591. It breaks various bots:
	http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516
	http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374
	http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992
	http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt
	http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251
	...
	llvm-svn: 329610
*	AMDGPU: enable 128-bit for local addr space under an option	Marek Olsak	2018-04-09	1	-2/+3
	Author: Samuel Pitoiset
	ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly).
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329591
*	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions	Dmitry Preobrazhensky	2018-04-02	1	-0/+5
	Fixed a bug which caused Tablegen crash. See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328983
*	Revert r328975, it makes TableGen assert on the bots.	Nico Weber	2018-04-02	1	-5/+0
	llvm-svn: 328978
*	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions	Dmitry Preobrazhensky	2018-04-02	1	-0/+5
	See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328975
*	AMDGPU: Increase default stack alignment	Matt Arsenault	2018-03-29	1	-1/+6
	8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. llvm-svn: 328821
*	[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU	Tony Tye	2018-03-23	1	-8/+4
	- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
	- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
	Differential Revision: https://reviews.llvm.org/D43736 llvm-svn: 328349
*	[AMDGPU] Supported ds_read_b128 generation; Widened vector length for local address-space.	Farhana Aleen	2018-03-09	1	-0/+6
	Summary: Starting from GCN 2nd generation, the ISA supports ds_read_b128 on top of ds_read_b64. This patch supports the ds_read_b128 instruction pattern and generation of this instruction. In the vectorizer, this patch also widens the vector length so that the vectorizer generates 128-bit loads for the local address-space, which get translated to ds_read_b128. Since the performance benefit is not clear, the compiler generates ds_read_b128 only under -amdgpu-ds128.
	Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU
	Differential Revision: https://reviews.llvm.org/D44210 llvm-svn: 327153
*	AMDGPU/GlobalISel: Pass subtarget + TM to LegalizerInfo	Matt Arsenault	2018-03-08	1	-1/+1
	These are the parameters x86 already uses. llvm-svn: 327020
*	[AMDGPU] Scratch setup fix on AMDPAL gfx9+ merge shader	Tim Renouf	2018-02-26	1	-0/+6
	Summary: With OS type AMDPAL, the scratch descriptor is hardwired to be loaded from offset 0 of the global information table, whose low pointer is passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as the hardware reserves s0-s7. Reviewers: kzhuravl Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D42203 llvm-svn: 326088
*	AMDGPU: Bring processors and features in sync with the spec	Konstantin Zhuravlyov	2018-02-16	1	-1/+0
	- Remove gfx800
	- Make iceland gfx802
	- Add xnack to gfx902
	Differential Revision: https://reviews.llvm.org/D43355 llvm-svn: 325393
*	AMDGPU: Remove the s_buffer workaround for GFX9 chips	Marek Olsak	2018-02-07	1	-8/+0
	Summary: I checked the AMD closed source compiler and the workaround is only needed when x3 is emulated as x4, which we don't do in LLVM. SMEM x3 opcodes don't exist, and instead there is a possibility to use x4 with the last component being unused. If the last component is out of buffer bounds and falls on the next 4K page, the hw hangs. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42756 llvm-svn: 324486
*	[AMDGPU][MC] Added validation of d16 and r128 modifiers of MIMG opcodes	Dmitry Preobrazhensky	2018-02-05	1	-0/+5
	See bugs 36094, 36095: https://bugs.llvm.org/show_bug.cgi?id=36094 https://bugs.llvm.org/show_bug.cgi?id=36095 Differential Revision: https://reviews.llvm.org/D42692 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 324231
*	AMDGPU/SI: Add d16 support for buffer intrinsics.	Changpeng Fang	2018-01-12	1	-0/+5
	Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402
*	MachineFunction: Return reference from getFunction(); NFC	Matthias Braun	2017-12-15	1	-4/+4
	The Function can never be nullptr so we can return a reference. llvm-svn: 320884
*	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage	Konstantin Zhuravlyov	2017-12-08	1	-4/+2
	- Add gfx704
	- Change bonaire to gfx704
	- Remove gfx804
	- Remove gfx901
	- Remove gfx903
	Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
*	AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction	Jan Vesely	2017-12-04	1	-0/+5
	Only used by pre-GCN targets.
	v2: fix predicate setting for FMA_Common
	Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712
*	AMDGPU: Disable fp64 support on pre GCN asics	Jan Vesely	2017-12-04	1	-1/+1
	It's not implemented. Passing the +fp64-fp16-denormal feature enables fp64 even on asics that don't support it.
	v2: fix hasFP64 query
	Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709
*	AMDGPU: Allow negative MUBUF vaddr for gfx9	Matt Arsenault	2017-11-30	1	-0/+6
	GFX9 does not enable bounds checking for the resource descriptors used for private access, so it should be OK to use vaddr with a potentially negative value. llvm-svn: 319393
*	AMDGPU: Select DS insts without m0 initialization	Matt Arsenault	2017-11-29	1	-0/+6
	GFX9 stopped using m0 for most DS instructions. Select a different instruction without the use. I think this will be less error prone than trying to manually maintain m0 uses as needed. llvm-svn: 319270
*	AMDGPU: Move hazard avoidance out of waitcnt pass.	Matt Arsenault	2017-11-17	1	-1/+5
	This is mostly moving VMEM clause breaking into the hazard recognizer. Also move another hazard currently handled in the waitcnt pass. Also stops breaking clauses unless xnack is enabled. llvm-svn: 318557
*	AMDGPU: Don't use MUBUF vaddr if address may overflow	Matt Arsenault	2017-11-15	1	-0/+5
	Effectively revert r263964. Before we would not allow this if vaddr was not known to be positive. llvm-svn: 318240
*	AMDGPU: Implement computeKnownBitsForTargetNode for mbcnt	Matt Arsenault	2017-11-13	1	-0/+4
	llvm-svn: 318100