path: root/llvm/test/CodeGen/AMDGPU/inline-asm.ll
Commit message | Author | Age | Files | Lines
* AMDGPU: Make VReg_1 only include 1 artificial register | Matt Arsenault | 2019-10-28 | 1 | -12/+11
  When TableGen is inferring register classes from contexts, it uses a sorting function based on the number of registers in the class. Since this was being treated as an alias of VGPR_32, they had exactly the same size. The sort used wasn't a stable sort, and even if it were, I believe the tie breaker would effectively end up being the alphabetical ordering of the class name.
  There appear to be issues trying to use an empty set of registers, so add only one so this will always sort to the end.
  Also add a comment explaining how VReg_1 is a dirty hack for SelectionDAG.
  This does end up changing the behavior of i1 with inline asm and VGPR constraints, but the existing behavior was already nonsensical and inconsistent. It should probably be disallowed anyway.
  Fixes bug 43699
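The ordering problem described in this commit can be illustrated with a small model. This is only a sketch under stated assumptions, not TableGen's actual code: `order_classes` and `possible_orders` are hypothetical helpers, the register names and counts are illustrative, and the sort is assumed to put larger classes first with ties resolved by whatever order the input arrived in.

```python
from itertools import permutations

def order_classes(classes):
    """Order (name, registers) pairs by descending register count.

    Ties keep their input order, which stands in for "whatever the
    sort happened to do" in this sketch.
    """
    return [name for name, _ in sorted(classes, key=lambda c: -len(c[1]))]

def possible_orders(classes):
    """Collect every ordering reachable by varying the input order."""
    return {tuple(order_classes(list(p))) for p in permutations(classes)}

vgprs = [f"v{i}" for i in range(256)]  # illustrative register names
sgprs = [f"s{i}" for i in range(104)]

# Before: VReg_1 aliases VGPR_32, so the two classes tie on size.
aliased = [("VGPR_32", vgprs), ("VReg_1", list(vgprs)), ("SReg_32", sgprs)]
# After: VReg_1 holds a single (hypothetically named) artificial register.
single = [("VGPR_32", vgprs), ("VReg_1", ["artificial_vgpr"]), ("SReg_32", sgprs)]

orders_before = possible_orders(aliased)
orders_after = possible_orders(single)
print(len(orders_before), len(orders_after))
```

In this model the aliased definition admits two orderings, depending on which of the tied classes comes first, while the single-register definition always places VReg_1 last.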
* [AMDGPU] Correct the handling of inlineasm output registers. | Michael Liao | 2019-05-28 | 1 | -0/+20
  Summary:
  - There's a regression due to the cross-block RC assignment. Use the proper way to derive the output register RC in inline asm.
  Reviewers: rampitec, alex-t
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, eraman, hiraditya, llvm-commits, yaxunl
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62537
  llvm-svn: 361868
* AMDGPU: Fix printed format of SReg_96 | Matt Arsenault | 2019-04-15 | 1 | -0/+10
  These are artificial, so I think this should only come up with inline asm comments.
  llvm-svn: 358446
* AMDGPU: Rewrite SILowerI1Copies to always stay on SALU | Nicolai Haehnle | 2018-10-31 | 1 | -2/+7
  Summary:
  Instead of writing boolean values temporarily into 32-bit VGPRs if they are involved in PHIs or are observed from outside a loop, we use bitwise masking operations to combine lane masks in a way that is consistent with wave control flow.
  Move SIFixSGPRCopies to before this pass, since that pass incorrectly attempts to move SGPR phis to VGPRs.
  This should recover most of the code quality that was lost with the bug fix in "AMDGPU: Remove PHI loop condition optimization".
  There are still some relevant cases where code quality could be improved, in particular:
  - We often introduce redundant masks with EXEC. Ideally, we'd have a generic computeKnownBits-like analysis to determine whether masks are already masked by EXEC, so we can avoid this masking both here and when lowering uniform control flow.
  - The criterion we use to determine whether a def is observed from outside a loop is conservative: it doesn't check whether (loop) branch conditions are uniform.
  Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b
  Reviewers: arsenm, rampitec, tpr
  Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53496
  llvm-svn: 345719
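The lane-mask combination mentioned in this summary can be sketched with plain integers standing in for 64-bit wave masks. This is an illustrative model, not the pass itself: `merge_lane_masks` is a hypothetical name, and the assumed rule is that active lanes (set in EXEC) take the newly computed bit while inactive lanes keep their previous bit.

```python
WAVE_MASK = (1 << 64) - 1  # assumes a 64-lane wave

def merge_lane_masks(prev, cur, exec_mask):
    """Combine two lane masks under the current EXEC mask.

    Lanes enabled in exec_mask take their bit from cur; all other
    lanes keep the bit they already had in prev.
    """
    kept = prev & ~exec_mask    # bits of inactive lanes are preserved
    updated = cur & exec_mask   # bits of active lanes are replaced
    return (kept | updated) & WAVE_MASK

# Lanes 0 and 1 are active; lanes 2 and 3 are not.
merged = merge_lane_masks(prev=0b1010, cur=0b0101, exec_mask=0b0011)
print(bin(merged))
```

The redundant masking noted in the first bullet corresponds to the case where `cur` is already zero outside `exec_mask`, so the `cur & exec_mask` step changes nothing.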
* [SchedModel] Fix for read advance cycles with implicit pseudo operands. | Jonas Paulsson | 2018-10-30 | 1 | -2/+2
  The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others.
  RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency.
  This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction.
  Review: Matthias Braun, Ulrich Weigand.
  https://reviews.llvm.org/D49671
  llvm-svn: 345606
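A toy model may help show why a pseudo operand could nullify a read advance. The sketch below is not the scheduler's real data model: `operand_latency`, `dependency_latency`, and the operand indexing are all assumptions made for illustration. It treats the latency between two instructions as the maximum over the operand edges connecting them.

```python
def operand_latency(def_latency, op_index, num_desc_operands,
                    read_advance, zero_pseudo):
    """Latency of one def-to-use edge in this toy model."""
    if op_index >= num_desc_operands:
        # Pseudo operand: it is not in the instruction descriptor,
        # so no read-advance entry can exist for it.
        return 0 if zero_pseudo else def_latency
    return max(0, def_latency - read_advance.get(op_index, 0))

def dependency_latency(def_latency, use_operands, num_desc_operands,
                       read_advance, zero_pseudo):
    """The slowest edge decides how long the consumer must wait."""
    return max(
        operand_latency(def_latency, op, num_desc_operands,
                        read_advance, zero_pseudo)
        for op in use_operands
    )

# The producer has latency 4. The consumer reads its result through
# descriptor operand 1 (read advance of 2) and through pseudo operand 2.
args = dict(def_latency=4, use_operands=[1, 2],
            num_desc_operands=2, read_advance={1: 2})
before = dependency_latency(zero_pseudo=False, **args)
after = dependency_latency(zero_pseudo=True, **args)
print(before, after)
```

With the pseudo operand keeping the full latency, the edge through it dominates and the read advance has no effect; giving it zero latency lets the descriptor operand's reduced latency win.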
* AMDGPU: Fix assert on n inline asm constraint | Matt Arsenault | 2017-08-09 | 1 | -0/+16
  llvm-svn: 310515
* AMDGPU: Add macro fusion schedule DAG mutation | Matt Arsenault | 2017-07-06 | 1 | -2/+2
  Try to increase opportunities to shrink vcc uses.
  llvm-svn: 307313
* AMDGPU: Use correct register names in inline assembly | Matt Arsenault | 2017-06-08 | 1 | -6/+6
  Fixes using physical registers in inline asm from clang.
  llvm-svn: 305004
* AMDGPU: Start defining a calling convention | Matt Arsenault | 2017-05-17 | 1 | -1/+1
  Partially implement callee-side for arguments and return values.
  byval doesn't work properly, and most likely neither do sret or other on-stack return values.
  llvm-svn: 303308
* AMDGPU: Fix copies from physical registers in SIFixSGPRCopies | Matt Arsenault | 2017-04-29 | 1 | -0/+14
  This would assert when there were multiple defs of a physical register. We just need to move all of the users of it.
  llvm-svn: 301730
* AMDGPU: Fix invalid copies when copying i1 to phys reg | Matt Arsenault | 2017-04-12 | 1 | -0/+36
  Insert a VReg_1 virtual register so the i1 workaround pass can handle it.
  llvm-svn: 300113
* AMDGPU: Fix folding reg_sequence into copy to phys reg | Matt Arsenault | 2017-04-11 | 1 | -0/+13
  This was producing an illegal reg_sequence defining a physical register with virtual register inputs.
  llvm-svn: 299997
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel | Matt Arsenault | 2017-03-21 | 1 | -16/+16
  Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel.
  Converted with
    perl -pi -e 's/define void/define amdgpu_kernel void/'
  on the relevant test directories (and undoing in one place that actually wanted a non-kernel).
  llvm-svn: 298444
* Enable FeatureFlatForGlobal on Volcanic Islands | Matt Arsenault | 2017-01-24 | 1 | -1/+1
  This switches to the workaround that HSA defaults to for the mesa path.
  This should be applied to the 4.0 branch.
  Patch by Vedran Miletić <vedran@miletic.net>
  llvm-svn: 292982
* AMDGPU: Use unsigned compare for eq/ne | Matt Arsenault | 2016-09-30 | 1 | -2/+2
  For some reason there are both of these available, except for scalar 64-bit compares, which only have u64. I'm not sure why there are both (I'm guessing it's for the one-bit inputs we don't use), but for consistency always use the unsigned one.
  llvm-svn: 282832
* TII: Fix inlineasm size counting comments as insts | Matt Arsenault | 2016-07-01 | 1 | -0/+106
  The main problem was counting comments on their own line as instructions.
  llvm-svn: 274405
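The counting issue named here can be sketched as a conservative size estimate over an inline asm string. This is a simplified illustration, not the TargetInstrInfo implementation: `count_instructions` and `estimate_size` are hypothetical helpers, and the `;` comment prefix and the 8-byte maximum instruction length are assumptions.

```python
def count_instructions(asm, comment_prefix=";"):
    """Count lines that contain something other than a comment."""
    count = 0
    for line in asm.splitlines():
        # Drop any trailing comment, then see whether code remains.
        code = line.split(comment_prefix, 1)[0].strip()
        if code:
            count += 1
    return count

def estimate_size(asm, max_inst_bytes=8):
    """Upper bound: every counted instruction has the maximum length."""
    return count_instructions(asm) * max_inst_bytes

asm = """; a comment on its own line
v_nop_e64 ; trailing comment
   ; an indented comment

v_nop_e64"""

print(count_instructions(asm), estimate_size(asm))
```

A counter that treated every non-empty line as an instruction would report four instructions for this string instead of two, inflating the size estimate.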
* AMDGPU: Use correct method for determining instruction size | Matt Arsenault | 2016-06-20 | 1 | -5/+33
  llvm-svn: 273172
* AMDGPU: Add a shader calling convention | Nicolai Haehnle | 2016-04-06 | 1 | -3/+1
  This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders.
  Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Differential Revision: http://reviews.llvm.org/D18559
  llvm-svn: 265589
* SelectionDAG: Fix a crash on inline asm when output register supports multiple types | Tom Stellard | 2016-03-09 | 1 | -0/+12
  Summary:
  The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size.
  Reviewers: arsenm, echristo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17940
  llvm-svn: 263022
* AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions | Tom Stellard | 2016-02-12 | 1 | -0/+18
  Reviewers: arsenm
  Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16603
  llvm-svn: 260765
* AMDGPU/SI: Fix crash when inline assembly is used in a graphics shader | Nicolai Haehnle | 2016-01-06 | 1 | -0/+11
  Summary:
  This is admittedly something that you could only run into by manually playing around with shader assembly because the SITypeWriter pass is skipped for compute.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15902
  llvm-svn: 256980
* R600 -> AMDGPU rename | Tom Stellard | 2015-06-13 | 1 | -0/+12
  llvm-svn: 239657