summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
...
* GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointerMatt Arsenault2019-08-011-0/+16
| | | | | | | AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591
* AMDGPU/GlobalISel: fix inst-select-load-local.mir in ↵Fangrui Song2019-08-011-4/+2
| | | | | | -DLLVM_ENABLE_ASSERTIONS=off builds after r367498 llvm-svn: 367514
* AMDGPU/GlobalISel: Fix flat load/store of pointer typesMatt Arsenault2019-08-014-96/+104
| | | | llvm-svn: 367513
* AMDGPU/GlobalISel: Remove manual store select codeMatt Arsenault2019-08-018-389/+372
| | | | | | | This regresses the weird types that are newly treated as legal load types, but fixes incorrectly using flat instrucions on SI. llvm-svn: 367512
* AMDGPU/GlobalISel: Select local atomic cmpxchgMatt Arsenault2019-08-011-0/+91
| | | | llvm-svn: 367511
* AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADDMatt Arsenault2019-08-013-0/+153
| | | | llvm-svn: 367509
* AMDGPU/GlobalISel: Allow selection of DS atomicrmwMatt Arsenault2019-08-011-0/+83
| | | | llvm-svn: 367507
* AMDGPU/GlobalISel: Select simple local storesMatt Arsenault2019-08-011-0/+262
| | | | llvm-svn: 367504
* GlobalISel: moreElementsVector for G_LOAD/G_STOREMatt Arsenault2019-08-012-6/+61
| | | | | | | AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
* AMDGPU/GlobalISel: Select local loadsMatt Arsenault2019-08-011-0/+906
| | | | llvm-svn: 367498
* GlobalISel: Add G_ATOMICRMW_{FADD|FSUB}Matt Arsenault2019-07-301-0/+48
| | | | llvm-svn: 367369
* [AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization.Austin Kerbow2019-07-301-0/+54
| | | | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344
* AMDGPU/GlobalISel: Handle most function return typesMatt Arsenault2019-07-264-135/+1339
| | | | | | | | | handleAssignments gives up pretty easily on structs, and i8 values for some reason. The other case that doesn't work is when an implicit sret needs to be inserted if the return size exceeds the number of return registers. llvm-svn: 367082
* GlobalISel: Fold out unmerge to scalars from concat_vectorMatt Arsenault2019-07-262-47/+56
| | | | | | | Removes illegal intermediate vectors if an operation was lowering to concat_vectors, and the next operation is scalarized. llvm-svn: 367081
* AMDGPU/GlobalISel: Don't assume instruction can be erased when selecting extsMatt Arsenault2019-07-241-9/+19
| | | | | | | The G_ANYEXT handling can end up reaching selectCOPY, which mutates the instruction in place. llvm-svn: 366915
* AMDGPU/GlobalISel: Fix broken testsMatt Arsenault2019-07-229-54/+54
| | | | llvm-svn: 366688
* AMDGPU/GlobalISel: Fix tests without assertsMatt Arsenault2019-07-2212-692/+283
| | | | | | | The legality check is only done under NDEBUG, so the failure cases are different in a release build. llvm-svn: 366680
* AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spacesMatt Arsenault2019-07-192-0/+122
| | | | llvm-svn: 366621
* AMDGPU/GlobalISel: Fix MMO flags for kernel argument loadsMatt Arsenault2019-07-191-114/+114
| | | | | | The DAG lowering sets dereferencable and invariant, not nontemporal. llvm-svn: 366597
* AMDGPU/GlobalISel: Selection for fminnum/fmaxnumMatt Arsenault2019-07-198-251/+1230
| | | | | | | v2f16 case doesn't work yet because the VOP3P complex patterns haven't been ported yet. llvm-svn: 366585
* AMDGPU/GlobalISel: Support arguments with multiple registersMatt Arsenault2019-07-192-12/+42
| | | | | | Handles structs used directly in argument lists. llvm-svn: 366584
* AMDGPU/GlobalISel: Rewrite lowerFormalArgumentsMatt Arsenault2019-07-192-7/+1994
| | | | | | | | | | | | | | | | | This should now handle everything except structs passed as multiple registers. I think most of the packing logic should be handled by handleAssignments, but I'm unclear on what the contract is for multiple registers. This is copying how x86 handles this. This does change the behavior of the test_sgpr_alignment0 amdgpu_vs test. I don't think shader arguments should try to follow the alignment, and registers need to be repacked. I also don't think it matters, since I think the pointers are packed to the beginning of the argument list anyway. llvm-svn: 366582
* AMDGPU: Decompose all values to 32-bit pieces for calling conventionsMatt Arsenault2019-07-191-0/+1
| | | | | | | | | | This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit. This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel. llvm-svn: 366578
* GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sourcesMatt Arsenault2019-07-171-129/+318
| | | | | | | | | | | Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367
* GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUESMatt Arsenault2019-07-171-0/+61
| | | | | | | | | | | | Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366
* AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXECNicolai Haehnle2019-07-171-0/+1
| | | | | | | | | | | | | | | Summary: Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630 Reviewers: rampitec, mareko Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64807 Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc llvm-svn: 366314
* AMDGPU/GlobalISel: Select G_ASHRMatt Arsenault2019-07-163-59/+676
| | | | llvm-svn: 366257
* AMDGPU/GlobalISel: Select G_LSHRMatt Arsenault2019-07-163-0/+699
| | | | llvm-svn: 366256
* AMDGPU/GlobalISel: Select G_SHLMatt Arsenault2019-07-163-0/+698
| | | | | | | | | | I think this manages to not break the DAG handling with the divergent predicates because the stadalone divergent patterns end up with a higher priority than the pattern on the instruction definition. The 16-bit versions don't work yet. llvm-svn: 366254
* AMDGPU/GlobalISel: Fix selection of private storesMatt Arsenault2019-07-161-0/+280
| | | | llvm-svn: 366249
* AMDGPU/GlobalISel: Select private loadsMatt Arsenault2019-07-161-0/+1158
| | | | llvm-svn: 366248
* AMDGPU/GlobalISel: Select flat storesMatt Arsenault2019-07-167-52/+1646
| | | | llvm-svn: 366246
* AMDGPU/GlobalISel: Select flat loadsMatt Arsenault2019-07-162-9/+3357
| | | | | | | | Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
* AMDGPU/GlobalISel: Fix test failures in release buildMatt Arsenault2019-07-1613-463/+400
| | | | | | | | | | | | Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210
* AMDGPU/GlobalISel: Allow scalar s1 and/or/xorMatt Arsenault2019-07-155-162/+1873
| | | | | | | | If a 1-bit value is in a 32-bit VGPR, the scalar opcodes set SCC to whether the result is 0. If the inputs are SCC, these can be copied to a 32-bit SGPR to produce an SCC result. llvm-svn: 366125
* AMDGPU/GlobalISel: Select G_AND/G_OR/G_XORMatt Arsenault2019-07-153-24/+1762
| | | | llvm-svn: 366121
* AMDGPU/GlobalISel: Don't constrain source register of VCC copiesMatt Arsenault2019-07-151-4/+27
| | | | | | | | | | | | | This is a hack until I come up with a better way of dealing with the pseudo-register banks used for boolean values. If the use instruction constrains the register, the selector for the def instruction won't see that the bank was VCC. A 1-bit SReg_32 is could ambiguously have been SCCRegBank or VCCRegBank in wave32. This is necessary to successfully select branches with and and/or/xor condition. llvm-svn: 366120
* AMDGPU/GlobalISel: Fix selecting vcc->vcc bank copiesMatt Arsenault2019-07-151-3/+31
| | | | | | | | | The extra test change is correct, although how it arrives there is a bug that needs work. With wave32, the test for isVCC ambiguously reports true for an SCC or VCC source. A new allocatable pseudo register class for SCC may be necesssary. llvm-svn: 366119
* AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCCMatt Arsenault2019-07-151-0/+26
| | | | llvm-svn: 366118
* AMDGPU/GlobalISel: Fix handling of sgpr (not scc bank) s1 to VCCMatt Arsenault2019-07-151-9/+36
| | | | | | This was emitting a copy from a 32-bit register to a 64-bit. llvm-svn: 366117
* AMDGPU/GlobalISel: Custom legalize G_INSERT_VECTOR_ELTMatt Arsenault2019-07-151-3/+38
| | | | llvm-svn: 366116
* AMDGPU/GlobalISel: Custom legalize G_EXTRACT_VECTOR_ELTMatt Arsenault2019-07-151-102/+99
| | | | | | Turn the constant cases into G_EXTRACTs. llvm-svn: 366115
* AMDGPU/GlobalISel: Fix G_ICMP for wave32Matt Arsenault2019-07-151-6/+7
| | | | llvm-svn: 366114
* GlobalISel: Implement narrowScalar for vector extract/insert indexesMatt Arsenault2019-07-152-2/+63
| | | | llvm-svn: 366113
* AMDGPU/GlobalISel: Widen vector extractsMatt Arsenault2019-07-151-0/+366
| | | | llvm-svn: 366103
* AMDGPU/GlobalISel: Handle llvm.amdgcn.if.breakMatt Arsenault2019-07-152-0/+53
| | | | llvm-svn: 366102
* AMDGPU/GlobalISel: Select llvm.amdgcn.end.cfMatt Arsenault2019-07-152-0/+75
| | | | llvm-svn: 366099
* AMDGPU/GlobalISel: Select easy cases for G_BUILD_VECTORMatt Arsenault2019-07-151-0/+152
| | | | llvm-svn: 366087
* AMDGPU/GlobalISel: RegBankSelect for G_CONCAT_VECTORSMatt Arsenault2019-07-151-0/+69
| | | | llvm-svn: 366086
* AMDGPU: Drop remnants of byval support for shadersMatt Arsenault2019-07-121-8/+0
| | | | | | | | Before 2018, mesa used to use byval interchangably with inreg, which didn't really make sense. Fix tests still using it to avoid breaking in a future commit. llvm-svn: 365953
OpenPOWER on IntegriCloud