bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	llc: Change behavior of -mcpu with existing attribute	Matt Arsenault	2020-01-07	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	Don't overwrite existing target-cpu attributes. I've often found the replacement behavior annoying, and this is inconsistent with how the fast math command line flags interact with the function attributes. Does not yet change target-features, since I think that should behave as a concatenation.
*	llc/MIR: Fix setFunctionAttributes for MIR functions	Matt Arsenault	2020-01-06	2	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A random set of attributes are implemented by llc/opt forcing the string attributes on the IR functions before processing anything. This would not happen for MIR functions, which have not yet been created at this point. Use a callback in the MIR parser, purely to avoid dealing with the ugliness that the command line flags are in a .inc file, and would require allowing access to these flags from multiple places (either from the MIR parser directly, or a new utility pass to implement these flags). It would probably be better to cleanup the flag handling into a separate library. This is in preparation for treating more command line flags with a corresponding function attribute in a more uniform way. The fast math flags in particular have a messy system where the command line flag sets the behavior from a function attribute if present, and otherwise the command line flag. This means if any other pass tries to inspect the function attributes directly, it will be inconsistent with the intended behavior. This is also inconsistent with the current behavior of -mcpu and -mattr, which overwrites any pre-existing function attributes. I would like to move this to consistenly have the command line flags not overwrite any pre-existing attributes, and to always ensure the command line flags are consistent with the function attributes.
*	[llvm][MIRVRegNamerUtils] Adding hashing on memoperands.	Puyan Lotfi	2019-12-11	1	-0/+42
\| \| \| \| \| \| \| \|	No more hash collisions for memoperands. Now the MIRCanonicalization pass shouldn't hit hash collisions when dealing with nearly identical memory accessing instructions when their memoperands are in fact different. Differential Revision: https://reviews.llvm.org/D71328
*	[MIRNamer]: Make the check lines in the test robust with regex.	Aditya Nandakumar	2019-11-16	1	-13/+12
\| \| \| \| \| \| \|	Previously we were checking for specific hashes. Make it check for regexes. Should fix failure caused by: 72768685567b
*	[MirNamer][Canonicalizer]: Perform instruction semantic based renaming	Aditya Nandakumar	2019-11-15	1	-8/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D70210 Previously: Due to sensitivity of the algorithm with gaps, and extra instructions, when diffing, often we see naming being off by a few. Makes the diff unreadable even for tests with 7 and 8 instructions respectively. Naming can change depending on candidates (and order of picking candidates). Suddenly if there's one extra instruction somewhere, the entire subtree would be named completely differently. No consistent naming of similar instructions which occur in different functions. If we try to do something like count the frequency distribution of various differences across suite, then the above sensitivity issues are going to result in poor results. Instead: Name instruction based on semantics of the instruction (hash of the opcode and operands). Essentially for a given instruction that occurs in any module/function it'll be named similarly (ie semantic). This has some nice properties Can easily look at many instructions and just check the hash and if they're named similarly, then it's the same instruction. Makes it very easy to spot the same instruction both multiple times, as well as across many functions (useful for frequency distribution). Independent of traversal/candidates/depth of graph. No need to keep track of last index/gaps/skip count etc. No off by few issues with diffs. I've tried the old vs new implementation in files ranging from 30 to 700 instructions. In both cases with the old algorithm, diffs are a sea of red, where as for the semantic version, in both cases, the diffs line up beautifully. Simplified implementation of the main loop (simple iteration) , no keep track of what's visited and not. Handle collision just by incrementing a counter. Roughly bb[N]_hash_[CollisionCount]. Additionally with the new implementation, we can probably avoid doing the hoisting of instructions to various places, as they'll likely be named the same resulting in differences only based on collision (ie regardless of whether the instruction is hoisted or not/close to use or not, it'll be named the same hash which should result in use of the instruction be identical with the only change being the collision count) which is very easy to spot visually.
*	AMDGPU: Add default denormal mode to MachineFunctionInfo	Matt Arsenault	2019-11-01	2	-0/+27
\| \| \| \| \| \|	The default FP mode should really be a property of a specific function, and not a subtarget. Introduce the necessary fields to the SIMachineFunctionInfo to help move towards this goal.
*	[Alignment] Migrate Attribute::getWith(Stack)Alignment	Guillaume Chatelet	2019-10-15	2	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Reviewed By: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68792 llvm-svn: 374884
*	[AMDGPU] Extend buffer intrinsics with swizzling	Piotr Sobczak	2019-10-02	6	-22/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend cachepolicy operand in the new VMEM buffer intrinsics to supply information whether the buffer data is swizzled. Also, propagate this information to MIR. Intrinsics updated: int_amdgcn_raw_buffer_load int_amdgcn_raw_buffer_load_format int_amdgcn_raw_buffer_store int_amdgcn_raw_buffer_store_format int_amdgcn_raw_tbuffer_load int_amdgcn_raw_tbuffer_store int_amdgcn_struct_buffer_load int_amdgcn_struct_buffer_load_format int_amdgcn_struct_buffer_store int_amdgcn_struct_buffer_store_format int_amdgcn_struct_tbuffer_load int_amdgcn_struct_tbuffer_store Furthermore, disable merging of VMEM buffer instructions in SI Load/Store optimizer, if the "swizzled" bit on the instruction is on. The default value of the bit is 0, meaning that data in buffer is linear and buffer instructions can be merged. There is no difference in the generated code with this commit. However, in the future it will be expected that front-ends use buffer intrinsics with correct "swizzled" bit set. Reviewers: arsenm, nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68200 llvm-svn: 373491
*	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes ↵	Guillaume Chatelet	2019-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608
*	AMDGPU: Add amdgpu-32bit-address-high-bits to MIR serialization	Matt Arsenault	2019-08-27	2	-1/+16
\| \| \| \|	llvm-svn: 370089
*	AMDGPU/LoadStoreOptimizer: combine MMOs when merging instructions	Tom Stellard	2019-07-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The LoadStoreOptimizer was creating instructions with 2 MachineMemOperands, which meant they were assumed to alias with all other instructions, because MachineInstr:mayAlias() returns true when an instruction has multiple MachineMemOperands. This was preventing these instructions from being merged again, and was giving the scheduler less freedom to reorder them. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65036 llvm-svn: 367237
*	AMDGPU: Serialize mode from MachineFunctionInfo	Matt Arsenault	2019-07-10	2	-0/+69
\| \| \| \|	llvm-svn: 365653
*	AMDGPU: Make s34 the FP register	Matt Arsenault	2019-07-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the FP register callee saved. This is tricky because now the FP needs to be spilled in the prolog relative to the incoming SP register, rather than the frame register used throughout the rest of the function. I don't like how this bypassess the standard mechanism for CSR spills just to get the correct insert point. I may look for a better solution, since all CSR VGPRs may also need to have all lanes activated. Another option might be to make getFrameIndexReference change the base register if the frame index is a CSR, and then try to figure out the right insertion point in emitProlog. If there is a free VGPR lane available for SGPR spilling, try to use it for the FP. If that would require intrtoducing a new VGPR spill, try to use a free call clobbered SGPR. Only fallback to introducing a new VGPR spill as a last resort. This also doesn't attempt to handle SGPR spilling with scalar stores. llvm-svn: 365372
*	[AMDGPU] Enable serializing of argument info.	Michael Liao	2019-07-03	2	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Support serialization of all arguments in machine function info. This enables fabricating MIR tests depending on argument info. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64096 llvm-svn: 364995
*	AMDGPU: Write LDS objects out as global symbols in code generation	Nicolai Haehnle	2019-06-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile times, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297
*	AMDGPU: Always use s33 for global scratch wave offset	Matt Arsenault	2019-06-20	1	-2/+2
\| \| \| \| \| \| \| \| \|	Every called function could possibly need this to calculate the absolute address of stack objectst, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR. llvm-svn: 363990
*	Rename ExpandISelPseudo->FinalizeISel, delay register reservation	Matt Arsenault	2019-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
*	Describe stack-id as an enum	Sander de Smalen	2019-06-17	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533
*	AMDGPU: Prepare for explicit absolute relocations in code generation	Nicolai Haehnle	2019-06-16	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We will use absolute relocations for LDS symbols. Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61492 llvm-svn: 363517
*	AMDGPU: Invert frame index offset interpretation	Matt Arsenault	2019-06-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the beginning, the offset of a frame index has been consistently interpreted backwards. It was treating it as an offset from the scratch wave offset register as a frame register. The correct interpretation is the offset from the SP on entry to the function, before the prolog. Frame index elimination then should select either SP or another register as an FP. Treat the scratch wave offset on kernel entry as the pre-incremented SP. Rely more heavily on the standard hasFP and frame pointer elimination logic, and clean up the private reservation code. This saves a copy in most callee functions. The kernel prolog emission code is still kind of a mess relying on checking the uses of physical registers, which I would prefer to eliminate. Currently selection directly emits MUBUF instructions, which require using a reference to some register. Use the register chosen for SP, and then ignore this later. This should probably be cleaned up to use pseudos that don't refer to any specific base register until frame index elimination. Add a workaround for shaders using large numbers of SGPRs. I'm not sure these cases were ever working correctly, since as far as I can tell the logic for figuring out which SGPR is the scratch wave offset doesn't match up with the shader input initialization in the shader programming guide. llvm-svn: 362661
*	[MIR-Canon] Don't do vreg skip for independent instructions if there are none.	Puyan Lotfi	2019-05-31	1	-0/+1
\| \| \| \| \| \| \| \| \|	We don't want to create vregs if there is nothing to use them for. That causes verifier errors. Differential Revision: https://reviews.llvm.org/D62740 llvm-svn: 362247
*	[MIR-Canon] Hardening propagateLocalCopies.	Puyan Lotfi	2019-05-31	1	-2/+7
\| \| \| \| \| \| \| \| \| \|	This is am almost NFC, it does the following: - If there is no register class for a COPY's src or dst, bail. - Fixes uses iterator invalidation bug. Differential Revision: https://reviews.llvm.org/D62713 llvm-svn: 362191
*	[AMDGPU] gfx1010 tests. NFC.	Stanislav Mekhanoshin	2019-05-13	1	-0/+155
\| \| \| \|	llvm-svn: 360615
*	[AMDGPU] gfx1010 VMEM and SMEM implementation	Stanislav Mekhanoshin	2019-04-30	4	-25/+25
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D61330 llvm-svn: 359621
*	AMDGPU: Don't use the default cpu in a few tests	Matt Arsenault	2019-04-03	1	-1/+1
\| \| \| \| \| \|	Avoids unnecessary test changes in a future commit. llvm-svn: 357539
*	MIR: Freeze reserved regs after parsing everything	Matt Arsenault	2019-03-27	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \|	The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083
*	MIR: Allow targets to serialize MachineFunctionInfo	Matt Arsenault	2019-03-14	12	-0/+358
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has been a very painful missing feature that has made producing reduced testcases difficult. In particular the various registers determined for stack access during function lowering were necessary to avoid undefined register errors in a large percentage of cases. Implement a subset of the important fields that need to be preserved for AMDGPU. Most of the changes are to support targets parsing register fields and properly reporting errors. The biggest sort-of bug remaining is for fields that can be initialized from the IR section will be overwritten by a default initialized machineFunctionInfo section. Another remaining bug is the machineFunctionInfo section is still printed even if empty. llvm-svn: 356215
*	[AMDGPU] Add support for immediate operand for S_ENDPGM	David Stuttard	2019-03-12	7	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add support for immediate operand in S_ENDPGM Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6 Reviewers: alexshap Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59213 llvm-svn: 355902
*	AMDGPU: Fix tests using old number for constant address space	Matt Arsenault	2018-09-10	4	-29/+29
\| \| \| \|	llvm-svn: 341770
*	[MIR-Canon] Fixing a test failure caused by COPY Folding.	Puyan Lotfi	2018-04-16	1	-3/+1
\| \| \| \|	llvm-svn: 330115
*	Attempting to work around a non-determinism issue.	Puyan Lotfi	2018-04-11	1	-2/+0
\| \| \| \| \| \| \|	The main thing that matters with this test is that the COPYs are moved together not where the REG_SEQUENCES are. llvm-svn: 329850
*	[MIR-Canon] Improving performance by switching to named vregs.	Puyan Lotfi	2018-04-05	1	-6/+6
\| \| \| \| \| \|	No more skipping thounsands of vregs. Much faster running time. llvm-svn: 329246
*	[MIR-Canon] Adding support for multi-def -> user distance reduction.	Puyan Lotfi	2018-04-05	1	-0/+32
\| \| \| \|	llvm-svn: 329243
*	[GISel]: Verify COPIES involving generic registers.	Aditya Nandakumar	2018-02-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Add verification for copies involving generic registers if they are compatible - ie if it is a generic copy, then the types are the same, and if a COPY b/w generic and target virtual register, then the sizes should be the same. Only checks if there are no sub registers involved for now. https://reviews.llvm.org/D37775 llvm-svn: 324696
*	Followup on Proposal to move MIR physical register namespace to '$' sigil.	Puyan Lotfi	2018-01-31	5	-105/+105
\| \| \| \| \| \| \| \| \| \| \| \|	Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922
*	[MIR] Add support for addrspace in MIR	Francis Visoiu Mistrih	2018-01-26	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Add support for printing / parsing the addrspace of a MachineMemOperand. Fixes PR35970. Differential Revision: https://reviews.llvm.org/D42502 llvm-svn: 323521
*	Move tests to the correct place	Matthias Braun	2018-01-19	6	-1356/+0
\| \| \| \| \| \| \|	test/CodeGen/MIR is for testing the MIR parser/printer. Tests for passes and targets belong to test/CodeGen/TARGETNAME. llvm-svn: 322925
*	[MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'.	Puyan Lotfi	2018-01-10	3	-6/+6
\| \| \| \| \| \| \| \| \| \|	Planning to add support for named vregs. This puts is in a conundrum since physregs are named as well. To rectify this we need to use a sigil other than '%' for physregs in MIR. We've settled on using '$' for physregs but first we must repurpose it from external symbols using it, which is what this commit is all about. We think '&' will have familiar semantics for C/C++ users. llvm-svn: 322146
*	MIR: Print the register class or bank in vreg defs	Justin Bogner	2017-10-24	4	-31/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479
*	Canonicalize a large number of mir tests using update_mir_test_checks	Justin Bogner	2017-10-18	2	-3/+8
\| \| \| \| \| \| \| \| \| \|	This converts a large and somewhat arbitrary set of tests to use update_mir_test_checks. I ran the script on all of the tests I expect to need to modify for an upcoming mir syntax change and kept the ones that obviously didn't change the tests in ways that might make it harder to understand. llvm-svn: 316137
*	AMDGPU: Handle non-temporal loads and stores	Konstantin Zhuravlyov	2017-09-07	3	-9/+333
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D36862 llvm-svn: 312729
*	AMDGPU: Handle more than one memory operand in SIMemoryLegalizer	Konstantin Zhuravlyov	2017-09-07	1	-0/+163
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37397 llvm-svn: 312725
*	AMDGPU: Implement memory model	Konstantin Zhuravlyov	2017-07-21	1	-0/+122
\| \| \| \|	llvm-svn: 308781
*	Add an ID field to StackObjects	Matt Arsenault	2017-07-20	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On AMDGPU SGPR spills are really spilled to another register. The spiller creates the spills to new frame index objects, which is used as a placeholder. This will eventually be replaced with a reference to a position in a VGPR to write to and the frame index deleted. It is most likely not a real stack location that can be shared with another stack object. This is a problem when StackSlotColoring decides it should combine a frame index used for a normal VGPR spill with a real stack location and a frame index used for an SGPR. Add an ID field so that StackSlotColoring has a way of knowing the different frame index types are incompatible. llvm-svn: 308673
*	AMDGPU: Fix crash when folding immediates into multiple uses	Nicolai Haehnle	2017-07-18	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When an immediate is folded by constant folding, we re-scan the entire use list for two reasons: 1. The constant folding may have created a new use of the same reg. 2. The constant folding may have removed an additional use in the list we're currently traversing (e.g., constant folding an S_ADD_I32 c, c). However, this could previously lead to a crash when an unrelated use was added twice into the FoldList. Since we re-scan the whole list anyway, we might as well just clear the FoldList again before we do so. Using a MIR test to show this because real code seems to trigger the issue only in connection with some really subtle control flow structures. Fixes GL45-CTS.shading_language_420pack.binding_images on gfx9. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35416 llvm-svn: 308314
*	Enhance synchscope representation	Konstantin Zhuravlyov	2017-07-11	1	-0/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
*	AMDGPU: Allow SIShrinkInstructions to work in non-SSA	Matt Arsenault	2017-07-10	1	-10/+10
\| \| \| \| \| \| \| \|	Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575
*	AMDGPU: Add operand target flags serialization	Matt Arsenault	2017-07-02	1	-0/+29
\| \| \| \|	llvm-svn: 306995
*	AMDGPU: Remove legacy bfe intrinsics	Matt Arsenault	2017-04-03	1	-2/+2
\| \| \| \|	llvm-svn: 299372
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	5	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444