summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/GlobalISel: Regbank select for fpextMatt Arsenault2019-01-201-0/+31
| | | | llvm-svn: 351692
* AMDGPU/GlobalISel: Cleanup legality for extensionsMatt Arsenault2019-01-204-2/+230
| | | | llvm-svn: 351691
* AMDGPU/GlobalISel: Legalize more types for selectMatt Arsenault2019-01-182-18/+174
| | | | llvm-svn: 351599
* AMDGPU/GlobalISel: Legalize illegal g_constantMatt Arsenault2019-01-182-22/+96
| | | | llvm-svn: 351596
* AMDGPU/GlobalISel: Introduce vcc reg bankMatt Arsenault2019-01-0814-86/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not entirely sure this is the correct thing to do with the global isel philosophy, but I think this is necessary to handle how differently SGPRs are used normally vs. from a condition. For example, it makes sense to allow a copy from a VGPR to an SGPR, but it makes no sense to allow a copy from VGPRs to SGPRs used as select mask. This avoids regbankselecting strange code with a truncate feeding directly into a condition field. Now a copy is forced from sgpr(s1) to vcc, which is more sensible to handle. Some of these issues could probably avoided with making enough operations resulting in i1 illegal. I think we can't avoid this register bank for legality. For example, an i1 and where one source is from a truncate, and one source is a compare needs some kind of copy inserted to make sure both are in condition registers. llvm-svn: 350611
* AMDGPU/GlobalISel: Legalize concat_vectorsMatt Arsenault2019-01-081-0/+129
| | | | llvm-svn: 350598
* RegBankSelect: Fix copy insertion point for terminatorsMatt Arsenault2019-01-082-0/+203
| | | | | | | | | | | | | | | If a copy was needed to handle the condition of brcond, it was being inserted before the defining instruction. Add tests for iterator edge cases. I find the existing code here suspect for the case where it's looking for terminators that modify the register. It's going to insert a copy in the middle of the terminators, which isn't allowed (it might be necessary to have a COPY_terminator if anybody actually needs this). Also legalize brcond for AMDGPU. llvm-svn: 350595
* AMDGPU/GlobalISel: Disallow VGPR->SCC copiesMatt Arsenault2019-01-084-8/+16
| | | | | | | This fixes using scalar adds when only the carry in is a VGPR using greedy regbankselect. llvm-svn: 350593
* AMDGPU/GlobalISel: RegBankSelect for carry-inMatt Arsenault2019-01-084-0/+595
| | | | | | | | I'm not sure we should be allowing the truncate to s1 for the inputs. It may be necessary to create a new VCC reg bank. llvm-svn: 350592
* AMDGPU/GlobalISel: RegBankSelect for add/sub with carry outMatt Arsenault2019-01-084-0/+275
| | | | llvm-svn: 350589
* AMDGPU/GlobalISel: InstrMapping for G_UNMERGE_VALUESMatt Arsenault2019-01-081-0/+38
| | | | llvm-svn: 350588
* AMDGPU: Remove VS/SV mappings from selectMatt Arsenault2019-01-071-101/+69
| | | | | | These would violate the constant bus restriction llvm-svn: 350517
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.wqm.voteMatt Arsenault2018-12-211-0/+56
| | | | llvm-svn: 349882
* AMDGPU/GlobalISel: RegBankSelect for some fp opsMatt Arsenault2018-12-216-0/+174
| | | | llvm-svn: 349880
* AMDGPU/GlobalISel: Redo legality for build_vectorMatt Arsenault2018-12-211-0/+585
| | | | | | | | | | It seems better to avoid using the callback if possible since there are coverage assertions which are disabled if this is used. Also fix missing tests. Only test the legal cases since it seems legalization for build_vector is quite lacking. llvm-svn: 349878
* AMDGPU: Make i1/i64/v2i32 and/or/xor legalMatt Arsenault2018-12-206-30/+335
| | | | | | | The 64-bit types do depend on the register bank, but that's another issue to deal with later. llvm-svn: 349716
* AMDGPU/GlobalISel: Fix ValueMapping tables for i1Matt Arsenault2018-12-201-2/+57
| | | | | | | This was incorrectly selecting SGPR for any i1 values, e.g. G_TRUNC to i1 from a VGPR was still an SGPR. llvm-svn: 349715
* AMDGPU/GlobalISel: RegBankSelect for fp conversionsMatt Arsenault2018-12-204-0/+110
| | | | llvm-svn: 349709
* AMDGPU/GlobalISel: Legality/regbankselect for atomicrmw/atomic_cmpxchgMatt Arsenault2018-12-2024-0/+1395
| | | | llvm-svn: 349708
* AMDGPU/GlobalISel: Regbankselect for fsubMatt Arsenault2018-12-191-0/+69
| | | | llvm-svn: 349608
* AMDGPU: Legalize/regbankselect frame_indexMatt Arsenault2018-12-181-0/+23
| | | | llvm-svn: 349468
* AMDGPU: Legalize/regbankselect fmaMatt Arsenault2018-12-182-0/+183
| | | | llvm-svn: 349467
* AMDGPU/GlobalISel: Legalize/regbankselect fneg/fabs/fsubMatt Arsenault2018-12-185-0/+156
| | | | llvm-svn: 349463
* AMDGPU/GlobalISel: Legalize/regbankselect block_addrMatt Arsenault2018-12-132-0/+57
| | | | llvm-svn: 349081
* AMDGPU/GlobalISel: Legalize f64 fadd/fmulMatt Arsenault2018-12-132-2/+28
| | | | llvm-svn: 349014
* AMDGPU/GlobalISel: RegBankSelect some simple operationsMatt Arsenault2018-12-139-0/+279
| | | | llvm-svn: 349012
* AMDGPU/GlobalISel: Test cleanupsMatt Arsenault2018-12-1312-138/+41
| | | | | | Remove IR and registers sections llvm-svn: 349011
* [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.Amara Emerson2018-12-103-34/+19
| | | | | | | | | | | | This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788
* Revert "AMDGPU/GlobalISel: Implement select for G_INSERT"Tom Stellard2018-10-111-49/+0
| | | | | | | | This reverts commit r344310. The test case was failing on some bots. llvm-svn: 344317
* AMDGPU/GlobalISel: Implement select for G_INSERTTom Stellard2018-10-111-0/+49
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 344310
* AMDGPU/GlobalISel: Select amdgcn.cvt.pkrtz to 64-bit instructionsTom Stellard2018-10-081-3/+4
| | | | | | | | | | | | | | Summary: The 32-bit variants do not exist on VI+. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52958 llvm-svn: 343985
* AMDGPU/GlobalISel: Add support for G_INTTOPTRTom Stellard2018-10-053-0/+94
| | | | | | | | | | | | | | Summary: This is a no-op. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52916 llvm-svn: 343839
* AMDGPU/GlobalISel: Define instruction mapping for G_SELECTTom Stellard2018-09-011-0/+214
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49737 llvm-svn: 341271
* AMDGPU/GlobalISel: Define instruction mapping for G_INSERTTom Stellard2018-08-111-0/+83
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49625 llvm-svn: 339491
* AMDGPU/GlobalISel: Fix crash in regbankselect on non-power-of-2 typesTom Stellard2018-07-271-0/+17
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49624 llvm-svn: 338102
* AMDGPU/GlobalISel: Legalize G_INSERTTom Stellard2018-07-241-0/+123
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49601 llvm-svn: 337798
* AMDGPU/GlobalISel: Implement select() for 32-bit @llvm.minnun and @llvm.maxnumTom Stellard2018-07-132-0/+131
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D46172 llvm-svn: 337056
* AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.expTom Stellard2018-07-131-0/+33
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45882 llvm-svn: 337046
* AMDGPU/GlobalISel: Implement custom kernel arg loweringMatt Arsenault2018-07-051-0/+723
| | | | | | | | | | | | | Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373
* AMDGPU/GlobalISel: Make IMPLICIT_DEF of all sizes < 512 legal.Tom Stellard2018-06-301-0/+20
| | | | | | | | | | | | | | | | | | | Summary: We could split sizes that are not power of two into smaller sized G_IMPLICIT_DEF instructions, but this ends up generating G_MERGE_VALUES instructions which we then have to handle in the instruction selector. Since G_IMPLICIT_DEF is really a no-op it's easier just to keep everything that can fit into a register legal. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48777 llvm-svn: 336041
* AMDGPU: Add pass to lower kernel arguments to loadsMatt Arsenault2018-06-261-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This replaces most argument uses with loads, but for now not all. The code in SelectionDAG for calling convention lowering is actively harmful for amdgpu_kernel. It attempts to split the argument types into register legal types, which results in low quality code for arbitary types. Since all kernel arguments are passed in memory, we just want the raw types. I've tried a couple of methods of mitigating this in SelectionDAG, but it's easier to just bypass this problem alltogether. It's possible to hack around the problem in the initial lowering, but the real problem is the DAG then expects to be able to use CopyToReg/CopyFromReg for uses of the arguments outside the block. Exposing the argument loads in the IR also has the advantage that the LoadStoreVectorizer can merge them. I'm not sure the best approach to dealing with the IR argument list is. The patch as-is just leaves the IR arguments in place, so all the existing code will still compute the same kernarg size and pointlessly lowers the arguments. Arguably the frontend should emit kernels with an empty argument list in the first place. Alternatively a dummy array could be inserted as a single argument just to reserve space. This does have some disadvantages. Local pointer kernel arguments can no longer have AssertZext placed on them as the equivalent !range metadata is not valid on pointer typed loads. This is mostly bad for SI which needs to know about the known bits in order to use the DS instruction offset, so in this case this is not done. More importantly, this skips noalias arguments since this pass does not yet convert this to the equivalent !alias.scope and !noalias metadata. Producing this metadata correctly seems to be tricky, although this logically is the same as inlining into a function which doesn't exist. Additionally, exposing these loads to the vectorizer may result in degraded aliasing information if a pointer load is merged with another argument load. I'm also not entirely sure this is preserving the current clover ABI, although I would greatly prefer if it would stop widening arguments and match the HSA ABI. As-is I think it is extending < 4-byte arguments to 4-bytes but doesn't align them to 4-bytes. llvm-svn: 335650
* AMDGPU/GlobalISel: Add support for llvm.amdgcn.kernarg.segment.ptrMatt Arsenault2018-06-252-0/+33
| | | | | | | | | Note a normal select test is not currently possible because this relies on input registers tracked in SIMachineFunctionInfo which are not currently serializable in MIR, but this does work end-to-end from the IR. llvm-svn: 335490
* AMDGPU/GlobalISel: Fix G_IMPLICIT_DEF for pointersMatt Arsenault2018-06-251-7/+81
| | | | llvm-svn: 335485
* AMDGPU/GlobalISel: legalize and select 32-bit G_ASHRTom Stellard2018-06-222-0/+108
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48196 llvm-svn: 335318
* AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFPTom Stellard2018-06-222-0/+50
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48195 llvm-svn: 335316
* AMDGPU/GlobalISel: Implement select() for COPYTom Stellard2018-06-221-0/+27
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46151 llvm-svn: 335315
* AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEFTom Stellard2018-06-211-0/+25
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46150 llvm-svn: 335307
* AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.cvt.pkrtzTom Stellard2018-06-141-0/+44
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45907 llvm-svn: 334757
* AMDGPU/GlobalISel: Implement select() for 32-bit G_FADD and G_FMULTom Stellard2018-06-132-0/+74
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46171 llvm-svn: 334665
* AMDGPU/GlobalISel: Implement select() for G_FCONSTANTTom Stellard2018-05-152-6/+67
| | | | | | | | | | | | Summary: Also clean up G_CONSTANT selection. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46170 llvm-svn: 332379
OpenPOWER on IntegriCloud