bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] gfx1010 constant bus limit	Stanislav Mekhanoshin	2019-05-02	1	-0/+20
	Constant bus limit has increased to 2 with GFX10. Differential Revision: https://reviews.llvm.org/D61404 llvm-svn: 359754
*	[AMDGPU] Add gfx1010 target definitions	Stanislav Mekhanoshin	2019-04-24	1	-1/+35
	Differential Revision: https://reviews.llvm.org/D61041 llvm-svn: 359113
*	[AMDGPU] predicate and feature refactoring	Stanislav Mekhanoshin	2019-04-05	1	-1/+2
	We have done some predicate and feature refactoring lately but did not upstream it. This is to sync. Differential revision: https://reviews.llvm.org/D60292 llvm-svn: 357791
*	AMDGPU: Assume ECC is enabled by default if supported	Matt Arsenault	2019-04-03	1	-2/+12
	The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag. llvm-svn: 357558
*	AMDGPU: Remove dx10-clamp from subtarget features	Matt Arsenault	2019-03-29	1	-4/+2
	Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without defining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. llvm-svn: 357302
*	[ScheduleDAG] Move `Topo` and `addEdge` to base class.	Clement Courbet	2019-03-29	1	-3/+1
	Some DAG mutations can only be applied to `ScheduleDAGMI`, and have to internally cast a `ScheduleDAGInstrs` to `ScheduleDAGMI`. There is nothing actually specific to `ScheduleDAGMI` in `Topo`. llvm-svn: 357239
*	AMDGPU: Partially fix default device for HSA	Matt Arsenault	2019-03-17	1	-2/+2
	There are a few different issues, mostly stemming from using generation based checks for anything instead of subtarget features. Stop adding flat-address-space as a feature for HSA, as it should only be a device property. This was incorrectly allowing flat instructions to select for SI. Increase the default generation for HSA to avoid the encoding error when emitting objects. This has some other side effects from various checks which probably should be separate subtarget features (in the cost model and for dealing with the DS offset folding issue). Partial fix for bug 41070. It should probably be an error to try using amdhsa without flat support. llvm-svn: 356347
*	AMDGPU: Remove debugger related subtarget features	Matt Arsenault	2019-02-21	1	-2/+0
	As far as I know these aren't needed anymore. llvm-svn: 354634
*	[AMDGPU] Split dot-insts feature	Stanislav Mekhanoshin	2019-02-09	1	-1/+2
	Differential Revision: https://reviews.llvm.org/D57971 llvm-svn: 353587
*	AMDGPU: Eliminate GPU specific SubtargetFeatures	Matt Arsenault	2019-02-08	1	-1/+0
	Inline compatibility is determined from the individual feature bits. These are just sets of the separate features, but will always be treated as incompatible unless they are specifically ignored. Defining the ISA version number here in tablegen would be nice, but it turns out this wasn't actually used. llvm-svn: 353558
*	AMDGPU: Remove GCN features and predicates	Matt Arsenault	2019-02-08	1	-0/+4
	These are no longer necessary since the R600 tablegen files are split out now. llvm-svn: 353548
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[AMDGPU] Add support for TFE/LWE in image intrinsics. 2nd try	David Stuttard	2019-01-14	1	-0/+6
	TFE and LWE support requires extra result registers that are written in the event of a failure in order to detect that failure case. The specific use-case that initiated these changes is sparse texture support. This means that if image intrinsics are used with either option turned on, the programmer must ensure that the return type can contain all of the expected results. This can result in redundant registers since the vector size must be a power-of-2.
	This change takes roughly 6 parts:
	1. Modify the instruction defs in tablegen to add new instruction variants that can accommodate the extra return values.
	2. Updates to lowerImage in SIISelLowering.cpp to accommodate setting TFE or LWE (where the bulk of the work for these instruction types is now done).
	3. Extra verification code to catch cases where intrinsics have been used but insufficient return registers are used.
	4. Modification to the adjustWritemask optimisation to account for TFE/LWE being enabled (requires extra registers to be maintained for error return value).
	5. An extra pass to zero initialize the error value return - this is because if the error does not occur, the register is not written and thus must be zeroed before use. Also added a new (on by default) option to ensure ALL return values are zero-initialized that is required for sparse texture support.
	6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO for this to re-enable and handle correctly).
	There's an additional fix now to avoid a dmask=0. For an image intrinsic with tfe where all result channels except tfe were unused, I was getting an image instruction with dmask=0 and only a single vgpr result for tfe. That is incorrect because the hardware assumes there is at least one vgpr result, plus the one for tfe. Fixed by forcing dmask to 1, which gives the desired two vgpr result with tfe in the second one.
	The TFE or LWE result is returned from the intrinsics using an aggregate type. Look in the test code provided to see how this works, but in essence IR code to invoke the intrinsic looks as follows:
		%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15, i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
		%v.vec = extractvalue {<4 x float>, i32} %v, 0
		%v.err = extractvalue {<4 x float>, i32} %v, 1
	This re-submit of the change also includes a slight modification in SIISelLowering.cpp to work-around a compiler bug for the powerpc_le platform that caused a buildbot failure on a previous submission.
	Differential revision: https://reviews.llvm.org/D48826 Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda Work around for ppcle compiler bug Change-Id: Ie284cf24b2271215be1b9dc95b485fd15000e32b llvm-svn: 351054
*	[AMDGPU] Separate feature dot-insts	Stanislav Mekhanoshin	2019-01-10	1	-0/+1
	Differential Revision: https://reviews.llvm.org/D56524 llvm-svn: 350793
*	Revert r347871 "Fix: Add support for TFE/LWE in image intrinsic"	David Stuttard	2018-11-29	1	-6/+0
	Also revert fix r347876. One of the buildbots was reporting a failure in some relevant tests that I can't repro or explain at present, so reverting until I can isolate. llvm-svn: 347911
*	Add support for TFE/LWE in image intrinsics	David Stuttard	2018-11-29	1	-0/+6
	TFE and LWE support requires extra result registers that are written in the event of a failure in order to detect that failure case. The specific use-case that initiated these changes is sparse texture support. This means that if image intrinsics are used with either option turned on, the programmer must ensure that the return type can contain all of the expected results. This can result in redundant registers since the vector size must be a power-of-2.
	This change takes roughly 6 parts:
	1. Modify the instruction defs in tablegen to add new instruction variants that can accommodate the extra return values.
	2. Updates to lowerImage in SIISelLowering.cpp to accommodate setting TFE or LWE (where the bulk of the work for these instruction types is now done).
	3. Extra verification code to catch cases where intrinsics have been used but insufficient return registers are used.
	4. Modification to the adjustWritemask optimisation to account for TFE/LWE being enabled (requires extra registers to be maintained for error return value).
	5. An extra pass to zero initialize the error value return - this is because if the error does not occur, the register is not written and thus must be zeroed before use. Also added a new (on by default) option to ensure ALL return values are zero-initialized that is required for sparse texture support.
	6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO for this to re-enable and handle correctly).
	There's an additional fix now to avoid a dmask=0. For an image intrinsic with tfe where all result channels except tfe were unused, I was getting an image instruction with dmask=0 and only a single vgpr result for tfe. That is incorrect because the hardware assumes there is at least one vgpr result, plus the one for tfe. Fixed by forcing dmask to 1, which gives the desired two vgpr result with tfe in the second one.
	The TFE or LWE result is returned from the intrinsics using an aggregate type. Look in the test code provided to see how this works, but in essence IR code to invoke the intrinsic looks as follows:
		%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15, i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
		%v.vec = extractvalue {<4 x float>, i32} %v, 0
		%v.err = extractvalue {<4 x float>, i32} %v, 1
	Differential revision: https://reviews.llvm.org/D48826 Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda llvm-svn: 347871
*	AMDGPU: Add sram-ecc feature	Konstantin Zhuravlyov	2018-11-05	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D53222 llvm-svn: 346177
*	[AMDGPU] Remove FeatureVGPRSpilling	Scott Linder	2018-10-31	1	-5/+0
	This feature is only relevant to shaders, and is no longer used. When disabled, lowering of reserved registers for shaders causes a compiler crash. Remove the feature and add a test for compilation of shaders at OptNone. Differential Revision: https://reviews.llvm.org/D53829 llvm-svn: 345763
*	[AMDGPU] Initialize instruction itinerary from GCNSubtarget	Stanislav Mekhanoshin	2018-09-17	1	-0/+1
	I need to use it in the GCN codegen. Differential Revision: https://reviews.llvm.org/D52123 llvm-svn: 342400
*	[AMDGPU] Ensure trig range reduction only used for subtargets that require it	David Stuttard	2018-09-14	1	-0/+1
	Summary: GFX9 and above support sin/cos instructions with a greater range and thus don't require a fract instruction prior to invocation. Added a subtarget feature to reflect this and added code to take advantage of expanded range on GFX9+. Also updated the tests to check correct behaviour. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51933 Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0 llvm-svn: 342222
*	AMDGPU: Re-apply r341982 after fixing the layering issue	Konstantin Zhuravlyov	2018-09-12	1	-6/+4
	Move isa version determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). llvm-svn: 342069
*	Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."	Ilya Biryukov	2018-09-12	1	-4/+6
	This reverts commit r341982. The change introduced a layering violation. Reverting to unbreak our integrate. llvm-svn: 342023
*	AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination	Konstantin Zhuravlyov	2018-09-11	1	-6/+4
	into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). Differential Revision: https://reviews.llvm.org/D51890 llvm-svn: 341982
*	AMDGPU: Remove remnants of old address space mapping	Matt Arsenault	2018-08-31	1	-3/+1
	llvm-svn: 341165
*	[AMDGPU] Add support for a16 modifier for gfx9	Ryan Taylor	2018-08-28	1	-0/+1
	Summary: Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9. Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841 Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50575 llvm-svn: 340831
*	AMDGPU: Address todo for handling 1/(2 pi)	Matt Arsenault	2018-08-15	1	-1/+1
	llvm-svn: 339814
*	AMDGPU: Add feature vi-insts	Matt Arsenault	2018-08-07	1	-0/+1
	This is necessary to add a VI specific builtin, __builtin_amdgcn_s_dcache_wb. We already have an overly specific feature for one of these builtins, for s_memrealtime. I'm not sure whether it's better to add more of those, or to get rid of that and merge it with vi-insts. Alternatively, maybe this logically goes with scalar-stores? llvm-svn: 339104
*	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"	Matt Arsenault	2018-07-20	1	-35/+39
	Reverts r337079 with fix for msan error. llvm-svn: 337535
*	Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"	Evgeniy Stepanov	2018-07-14	1	-39/+35
	This reverts commit r337021.
	WARNING: MemorySanitizer: use-of-uninitialized-value
		#0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7
		#1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3
		#2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3
		#3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18
		#4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23
		#5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5
		#6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const, llvm::raw_ostream&, char const) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5
		#7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3
		#8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26
		[...]
	Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv'
		#0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192
	llvm-svn: 337079
*	AMDGPU: Fix handling of alignment padding in DAG argument lowering	Matt Arsenault	2018-07-13	1	-35/+39
	This was completely broken if there was ever a struct argument, as this information is thrown away during the argument analysis. The offsets as passed in to LowerFormalArguments are not useful, as they partially depend on the legalized result register type, and they don't consider the alignment in the first place. Ignore the Ins array, and instead figure out from the raw IR type what we need to do. This seems to fix the padding computation if the DAG lowering is forced (and stops breaking arguments following padded arguments if the arguments were only partially lowered in the IR). llvm-svn: 337021
*	AMDGPU/SI: Initialize InstrInfo before TargetLoweringInfo in GCNSubtarget	Tom Stellard	2018-07-11	1	-2/+2
	SITargetLowering queries SIInstrInfo in its constructor, so SIInstrInfo must be initialized first. This fixes msan buildbot failures and was introduced by r336851. llvm-svn: 336861
*	AMDGPU: Remove duplicate call to initializeSubtargetDependencies()	Tom Stellard	2018-07-11	1	-1/+0
	This was added in r336851. llvm-svn: 336853
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	1	-48/+44
	Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	AMDGPU: Don't use struct type for argument layout	Matt Arsenault	2018-06-29	1	-3/+23
	This was introducing unnecessary padding after the explicit arguments, depending on the alignment of the total struct type. Also has the side effect of avoiding creating an extra GEP for the offset from the base kernel argument to the explicit kernel argument offset. llvm-svn: 335999
*	AMDGPU: Separate R600 and GCN TableGen files	Tom Stellard	2018-06-28	1	-31/+85
	Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
*	AMDGPU: Remove ability to reserve VGPRs for debugger	Konstantin Zhuravlyov	2018-06-21	1	-6/+1
	Differential Revision: https://reviews.llvm.org/D48234 llvm-svn: 335288
*	AMDGPU: Round up kernel argument allocation size	Matt Arsenault	2018-05-29	1	-4/+8
	AFAIK the driver's allocation will actually have to round this up anyway. It is useful to track the rounded up size, so that the end of the kernel segment is known to be dereferenceable so a wider s_load_dword can be used for a short argument at the end of the segment. llvm-svn: 333456
*	AMDGPU: Pass function directly instead of MachineFunction	Matt Arsenault	2018-05-29	1	-2/+2
	These functions just query the underlying IR function, so pass it directly. llvm-svn: 333442
*	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers	Tom Stellard	2018-05-22	1	-0/+7
	Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930
*	AMDGPU/GlobalISel: Enable TableGen'd instruction selector	Tom Stellard	2018-05-10	1	-1/+1
	Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45994 llvm-svn: 332039
*	AMDGPU: Add D16 instructions preserve unused bits feature	Konstantin Zhuravlyov	2018-05-04	1	-0/+1
	- Predicate D16 patterns on this new feature - Added this new feature to gfx900/2/4 Differential Revision: https://reviews.llvm.org/D46366 llvm-svn: 331551
*	Remove \brief commands from doxygen comments.	Adrian Prantl	2018-05-01	1	-1/+1
	We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
*	AMDGPU: Add Vega12 and Vega20	Matt Arsenault	2018-04-30	1	-0/+2
	Changes by Matt Arsenault and Konstantin Zhuravlyov. llvm-svn: 331215
*	AMDGPU: enable 128-bit for local addr space under an option	Marek Olsak	2018-04-10	1	-0/+1
	Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). v2: - fix regressions in merge-stores.ll and multiple_tails.ll Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764
*	Revert "AMDGPU: enable 128-bit for local addr space under an option"	Alex Shlyapnikov	2018-04-09	1	-1/+0
	This reverts commit r329591. It breaks various bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516 http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374 http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251 ... llvm-svn: 329610
*	AMDGPU: enable 128-bit for local addr space under an option	Marek Olsak	2018-04-09	1	-0/+1
	Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329591
*	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions	Dmitry Preobrazhensky	2018-04-02	1	-0/+1
	Fixed a bug which caused Tablegen crash. See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328983
*	Revert r328975, it makes TableGen assert on the bots.	Nico Weber	2018-04-02	1	-1/+0
	llvm-svn: 328978
*	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions	Dmitry Preobrazhensky	2018-04-02	1	-0/+1
	See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328975
*	AMDGPU/GlobalISel: Pass subtarget + TM to LegalizerInfo	Matt Arsenault	2018-03-08	1	-2/+2
	These are the parameters x86 already uses. llvm-svn: 327020