path: root/llvm/test/CodeGen/AMDGPU/GlobalISel
Commit message | Author | Age | Files | Lines
...
* GlobalISel: Legalize scalar G_EXTRACT sources | Matt Arsenault | 2019-04-22 | 1 | -7/+7
  llvm-svn: 358892
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. | Amara Emerson | 2019-04-15 | 15 | -651/+420
  Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded.

  This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it.

  I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64.

  Compile time:
  Program                                          base    cse     diff
  test-suite...ark/tramp3d-v4/tramp3d-v4.test      9.04    9.12    0.8%
  test-suite...Mark/mafft/pairlocalalign.test      2.68    2.66   -0.7%
  test-suite...-typeset/consumer-typeset.test      5.53    5.51   -0.4%
  test-suite :: CTMark/lencod/lencod.test          5.30    5.28   -0.3%
  test-suite :: CTMark/Bullet/bullet.test         25.82   25.76   -0.2%
  test-suite...:: CTMark/ClamAV/clamscan.test      6.92    6.90   -0.2%
  test-suite...TMark/7zip/7zip-benchmark.test     34.24   34.17   -0.2%
  test-suite :: CTMark/SPASS/SPASS.test            6.25    6.24   -0.1%
  test-suite...:: CTMark/sqlite3/sqlite3.test      1.66    1.66   -0.1%
  test-suite :: CTMark/kimwitu++/kc.test          13.61   13.60   -0.0%
  Geomean difference                                              -0.2%

  Code size:
  Program                                          base      cse       diff
  test-suite...-typeset/consumer-typeset.test      1315632   1266480   -3.7%
  test-suite...:: CTMark/ClamAV/clamscan.test      1313892   1297508   -1.2%
  test-suite :: CTMark/lencod/lencod.test          1439504   1423112   -1.1%
  test-suite...TMark/7zip/7zip-benchmark.test      2936980   2904172   -1.1%
  test-suite :: CTMark/Bullet/bullet.test          3478276   3445460   -0.9%
  test-suite...ark/tramp3d-v4/tramp3d-v4.test      8082868   8033492   -0.6%
  test-suite :: CTMark/kimwitu++/kc.test           3870380   3853972   -0.4%
  test-suite :: CTMark/SPASS/SPASS.test            1434904   1434896   -0.0%
  test-suite...Mark/mafft/pairlocalalign.test      764528    764528     0.0%
  test-suite...:: CTMark/sqlite3/sqlite3.test      782092    782092     0.0%
  Geomean difference                                                   -0.9%

  Differential Revision: https://reviews.llvm.org/D60580
  llvm-svn: 358369
* GlobalISel: Support legalizing G_CONSTANT with irregular breakdown | Matt Arsenault | 2019-04-10 | 1 | -0/+17
  llvm-svn: 358109
* GlobalISel: Handle odd breakdowns for bit ops | Matt Arsenault | 2019-04-10 | 3 | -0/+138
  llvm-svn: 358105
* AMDGPU/GlobalISel: Implement call lowering for shaders returning values | Tom Stellard | 2019-04-09 | 2 | -10/+21
  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits
  Differential Revision: https://reviews.llvm.org/D57166
  llvm-svn: 357964
* AMDGPU/GlobalISel: Fix non-power-of-2 select | Matt Arsenault | 2019-04-05 | 1 | -0/+28
  llvm-svn: 357762
* AMDGPU/GlobalISel: Insert waterfall loop for vector indexing | Matt Arsenault | 2019-03-29 | 1 | -12/+91
  The register index can only really be an SGPR. Lie that a VGPR index is legal, and then rewrite the instruction in a waterfall loop to handle the index.
  llvm-svn: 357235
* [GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. | Amara Emerson | 2019-03-27 | 2 | -8/+5
  The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this, the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc).

  This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction.

  As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead.

  Differential Revision: https://reviews.llvm.org/D59892
  llvm-svn: 357101
* MIR: Freeze reserved regs after parsing everything | Matt Arsenault | 2019-03-27 | 1 | -2/+2
  The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one.

  Some tests were relying on the default reserved set for the assumed default calling convention.
  llvm-svn: 357083
* GlobalISel: Fix RegBankSelect for REG_SEQUENCE | Matt Arsenault | 2019-03-21 | 1 | -0/+140
  The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi.

  The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline.
  llvm-svn: 356713
* AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselect | Matt Arsenault | 2019-03-20 | 2 | -75/+134
  The constantness shouldn't change the register bank choice. We also don't need to restrict this to only indexing VGPRs, since it's possible to index SGPRs (but SelectionDAG made using this difficult). Allow directly indexing SGPRs when appropriate.
  llvm-svn: 356611
* GlobalISel: Use multiple returns for intrinsic structs | Matt Arsenault | 2019-03-14 | 1 | -0/+27
  This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register.

  For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen.
  llvm-svn: 356147
* [AMDGPU] Add support for immediate operand for S_ENDPGM | David Stuttard | 2019-03-12 | 4 | -10/+10
  Summary: Add support for immediate operand in S_ENDPGM
  Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6
  Reviewers: alexshap
  Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D59213
  llvm-svn: 355902
* [MIPS GlobalISel] NarrowScalar G_MUL | Petar Avramovic | 2019-03-11 | 1 | -8/+7
  Narrow Scalar G_MUL for MIPS32. Revisit the NarrowScalar implementation in LegalizerHelper. Introduce a new helper function, multiplyRegisters, which performs generic multiplication of values held in multiple registers. Generated instructions use only the types NarrowTy and i1. The destination can be the same size as the source or twice its size.

  Differential Revision: https://reviews.llvm.org/D58824
  llvm-svn: 355814
* AMDGPU/GlobalISel: Implement select for G_INSERT | Tom Stellard | 2019-03-01 | 1 | -0/+49
  Re-commit r344310.
  Reviewers: arsenm
  Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53116
  llvm-svn: 355159
* AMDGPU/GlobalISel: Implement select for G_EXTRACT | Tom Stellard | 2019-02-28 | 1 | -0/+77
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49714
  llvm-svn: 355156
* AMDGPU/GlobalISel: Add regbankselect test for phis | Matt Arsenault | 2019-02-28 | 1 | -0/+1434
  Add baseline for future fixes. These mostly show how this is broken and producing illegal situations.
  llvm-svn: 355057
* GlobalISel: Implement fewerElementsVector for phi | Matt Arsenault | 2019-02-28 | 1 | -1/+203
  llvm-svn: 355048
* GlobalISel: Implement moreElementsVector for phi | Matt Arsenault | 2019-02-28 | 1 | -0/+73
  llvm-svn: 355047
* RegBankSelect: Handle slightly more complex value mappings | Matt Arsenault | 2019-02-25 | 4 | -48/+1736
  Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU.
  llvm-svn: 354828
* AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes | Matt Arsenault | 2019-02-25 | 9 | -28/+118
  llvm-svn: 354825
* AMDGPU/GlobalISel: Clamp max implicit_def elements | Matt Arsenault | 2019-02-25 | 1 | -0/+86
  llvm-svn: 354818
* AMDGPU/GlobalISel: Make phis legal | Matt Arsenault | 2019-02-21 | 1 | -0/+1109
  llvm-svn: 354592
* AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types | Matt Arsenault | 2019-02-21 | 5 | -0/+270
  llvm-svn: 354587
* AMDGPU/GlobalISel: Move SMRD selection logic to TableGen | Tom Stellard | 2019-02-20 | 1 | -13/+30
  Reviewers: arsenm
  Reviewed By: arsenm
  Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52922
  llvm-svn: 354516
* GlobalISel: Fix fewerElementsVector for ctlz with different result type | Matt Arsenault | 2019-02-20 | 5 | -0/+312
  Also complete the set of related operations.
  llvm-svn: 354480
* GlobalISel: Implement moreElementsVector for g_insert results | Matt Arsenault | 2019-02-20 | 15 | -181/+736
  llvm-svn: 354477
* GlobalISel: Implement moreElementsVector for select | Matt Arsenault | 2019-02-19 | 1 | -25/+25
  llvm-svn: 354354
* GlobalISel: Implement moreElementsVector for G_EXTRACT source | Matt Arsenault | 2019-02-19 | 1 | -10/+58
  llvm-svn: 354348
* GlobalISel: Implement moreElementsVector for bit ops | Matt Arsenault | 2019-02-19 | 4 | -92/+622
  llvm-svn: 354345
* GlobalISel: Implement widenScalar for g_extract scalar results | Matt Arsenault | 2019-02-18 | 1 | -0/+293
  llvm-svn: 354293
* AMDGPU/GlobalISel: Fix RegBankSelect for GEP. | Matt Arsenault | 2019-02-14 | 1 | -0/+90
  This is basically a pointer-typed add, so it shouldn't be any different. This was assuming everything was an SGPR, which is not true.

  Also clean up legality for GEP. I don't seem to be seeing the problem that the comment mentions for the hack marking s64 as a legal pointer type.
  llvm-svn: 354067
* AMDGPU/GlobalISel: Handle split for 64-bit VALU select | Matt Arsenault | 2019-02-14 | 1 | -0/+400
  llvm-svn: 354065
* AMDGPU/GlobalISel: Add more insert/extract testcases | Matt Arsenault | 2019-02-12 | 2 | -115/+918
  llvm-svn: 353848
* AMDGPU/GlobalISel: Only make f16 constants legal on f16 targets | Matt Arsenault | 2019-02-12 | 1 | -12/+23
  We could deal with it, but there's no real point.
  llvm-svn: 353845
* GlobalISel: Verify G_EXTRACT | Matt Arsenault | 2019-02-11 | 1 | -2/+2
  llvm-svn: 353759
* GlobalISel: Implement moreElementsVector for implicit_def | Matt Arsenault | 2019-02-11 | 18 | -146/+296
  llvm-svn: 353754
* GlobalISel: Add G_FCANONICALIZE instruction | Matt Arsenault | 2019-02-11 | 1 | -0/+286
  llvm-svn: 353719
* AMDGPU/GlobalISel: Fix broken tests | Matt Arsenault | 2019-02-08 | 2 | -0/+31
  llvm-svn: 353559
* AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2 | Matt Arsenault | 2019-02-08 | 1 | -0/+85
  clampScalar doesn't do anything for non-power-of-2 in range. There should probably be a combination rule to reduce the number of matching rules.
  llvm-svn: 353526
* AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def | Matt Arsenault | 2019-02-08 | 1 | -0/+28
  llvm-svn: 353522
* [MIPS GlobalISel] Select any extending load and truncating store | Petar Avramovic | 2019-02-08 | 2 | -68/+153
  Make the behavior of G_LOAD in widenScalar the same as for G_ZEXTLOAD and G_SEXTLOAD. That is, perform widenScalarDst to the size given by the target and avoid additional checks in common code. Targets can reorder or add additional rules in LegalizeRuleSet for the opcode to achieve the desired behavior.

  Select an extending load that does not have a specified type of extension into a zero-extending load. Select a truncating store that stores the number of bytes indicated by the size in MachineMemOperand.

  Differential Revision: https://reviews.llvm.org/D57454
  llvm-svn: 353520
* AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering | Matt Arsenault | 2019-02-08 | 1 | -36/+36
  llvm-svn: 353516
* AMDGPU/GlobalISel: Legalize addrspacecast | Matt Arsenault | 2019-02-08 | 1 | -0/+393
  Use a placeholder constant for now on targets that need the load from the queue ptr.
  llvm-svn: 353497
* GlobalISel: Implement narrowScalar for shift main type | Matt Arsenault | 2019-02-07 | 5 | -231/+2873
  This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a globalisel version of computeKnownBits yet.

  This shows a disadvantage of targets not specifying which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0.
  llvm-svn: 353455
* AMDGPU/GlobalISel: Restrict g_implicit_def legality | Matt Arsenault | 2019-02-07 | 3 | -43/+426
  llvm-svn: 353452
* GlobalISel: Fix artifact combiner constant legality checks for vectors | Matt Arsenault | 2019-02-07 | 4 | -17/+260
  Since G_CONSTANT is illegal for vectors, this needs to check what buildConstant will produce for a splat vector.
  llvm-svn: 353449
* AMDGPU/GlobalISel: Don't use g_implicit_def in a few tests | Matt Arsenault | 2019-02-07 | 3 | -54/+65
  llvm-svn: 353443
* AMDGPU/GlobalISel: Legalize fsqrt | Matt Arsenault | 2019-02-07 | 2 | -0/+323
  llvm-svn: 353438
* AMDGPU/GlobalISel: Legalize some f16 operations | Matt Arsenault | 2019-02-07 | 6 | -613/+213
  llvm-svn: 353436