summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spacesMatt Arsenault2019-07-191-1/+3
| | | | llvm-svn: 366621
* AMDGPU/GlobalISel: Select flat loadsMatt Arsenault2019-07-161-0/+3
| | | | | | | | Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
* AMDGPU/GlobalISel: Fix test failures in release buildMatt Arsenault2019-07-161-1/+1
| | | | | | | | | | | | Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210
* AMDGPU/GlobalISel: Custom legalize G_INSERT_VECTOR_ELTMatt Arsenault2019-07-151-1/+31
| | | | llvm-svn: 366116
* AMDGPU/GlobalISel: Custom legalize G_EXTRACT_VECTOR_ELTMatt Arsenault2019-07-151-1/+34
| | | | | | Turn the constant cases into G_EXTRACTs. llvm-svn: 366115
* AMDGPU/GlobalISel: Widen vector extractsMatt Arsenault2019-07-151-5/+8
| | | | llvm-svn: 366103
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUMMatt Arsenault2019-07-101-1/+55
| | | | llvm-svn: 365658
* AMDGPU/GlobalISel: Add support for wide loads >= 256-bitsTom Stellard2019-07-101-1/+8
| | | | | | | | | | | | | | | | | | Summary: This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64> Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57399 llvm-svn: 365586
* GlobalISel: Implement lower for G_FCOPYSIGNMatt Arsenault2019-07-091-3/+2
| | | | | | | | | In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583
* AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTORMatt Arsenault2019-07-091-7/+4
| | | | llvm-svn: 365575
* AMDGPU/GlobalISel: Legalize more concat_vectorsMatt Arsenault2019-07-091-13/+16
| | | | llvm-svn: 365488
* AMDGPU/GlobalISel: Make s16 G_ICMP legalMatt Arsenault2019-07-091-2/+8
| | | | llvm-svn: 365486
* AMDGPU/GlobalISel: Select G_MERGE_VALUESMatt Arsenault2019-07-091-2/+4
| | | | llvm-svn: 365482
* AMDGPU/GlobalISel: Handle more input argument intrinsicsMatt Arsenault2019-07-011-0/+12
| | | | llvm-svn: 364836
* AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsicsMatt Arsenault2019-07-011-4/+44
| | | | llvm-svn: 364835
* AMDGPU/GlobalISel: Legalize workgroup ID intrinsicsMatt Arsenault2019-07-011-0/+9
| | | | llvm-svn: 364834
* AMDGPU/GlobalISel: Legalize workitem ID intrinsicsMatt Arsenault2019-07-011-0/+84
| | | | | | | | | Tests don't cover the masked input path since non-kernel arguments aren't lowered yet. Test is copied directly from the existing test, with 2 additions. llvm-svn: 364833
* AMDGPU/GlobalISel: Custom lower control flow intrinsicsMatt Arsenault2019-07-011-0/+64
| | | | | | | | Replace the brcond for the 2 cases that act as branches. For now follow how the current system works, although I think we can eventually get rid of the pseudos. llvm-svn: 364832
* AMDGPU/GlobalISel: Legalize s16 add/sub/mulMatt Arsenault2019-07-011-1/+12
| | | | | | | If this is scalar, promote to s32. Use a new observer class to assign the register bank of newly created registers. llvm-svn: 364827
* AMDGPU/GlobalISel: Legalize s16 fcmpMatt Arsenault2019-07-011-1/+9
| | | | llvm-svn: 364817
* AMDGPU/GlobalISel: Make s16 select legalMatt Arsenault2019-07-011-2/+2
| | | | | | | This is easy to handle and avoids legalization artifacts which are likely to obscure combines. llvm-svn: 364787
* AMDGPU/GlobalISel: Convert to using RegisterMatt Arsenault2019-06-281-4/+4
| | | | llvm-svn: 364616
* GlobalISel: Remove unsigned variant of SrcOpMatt Arsenault2019-06-241-12/+12
| | | | | | | | | Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
* CodeGen: Introduce a class for registersMatt Arsenault2019-06-241-4/+4
| | | | | | | | | Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
* AMDGPU: Consolidate some getGeneration checksMatt Arsenault2019-06-191-3/+2
| | | | | | | | This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for. llvm-svn: 363902
* AMDGPU/GlobalISel: Legality for integer min/maxMatt Arsenault2019-05-231-0/+23
| | | | llvm-svn: 361519
* AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFPMatt Arsenault2019-05-171-0/+37
| | | | llvm-svn: 361082
* GlobalISel: Implement lower for S64->S32 [SU]ITOFPMatt Arsenault2019-05-171-0/+1
| | | | | | | | | | | | | This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081
* AMDGPU: Fix unused variable warnings in release buildsMatt Arsenault2019-05-171-12/+9
| | | | llvm-svn: 361030
* AMDGPU/GlobalISel: Legalize G_FCEILMatt Arsenault2019-05-171-2/+35
| | | | llvm-svn: 361028
* AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNCMatt Arsenault2019-05-171-3/+68
| | | | llvm-svn: 361027
* AMDGPU/GlobalISel: Legalize G_FRINTMatt Arsenault2019-05-171-0/+41
| | | | llvm-svn: 361026
* AMDGPU/GlobalISel: Legalize G_FCOPYSIGNMatt Arsenault2019-05-171-0/+4
| | | | llvm-svn: 361025
* AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sourcesMatt Arsenault2019-04-221-1/+3
| | | | llvm-svn: 358894
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with ↵Amara Emerson2019-04-151-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369
* AMDGPU/GlobalISel: Fix non-power-of-2 selectMatt Arsenault2019-04-051-0/+1
| | | | llvm-svn: 357762
* GlobalISel: Implement fewerElementsVector for phiMatt Arsenault2019-02-281-0/+1
| | | | llvm-svn: 355048
* GlobalISel: Implement moreElementsVector for phiMatt Arsenault2019-02-281-0/+1
| | | | llvm-svn: 355047
* AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizesMatt Arsenault2019-02-251-0/+2
| | | | llvm-svn: 354825
* AMDGPU/GlobalISel: Clamp max implicit_def elementsMatt Arsenault2019-02-251-1/+2
| | | | llvm-svn: 354818
* AMDGPU/GlobalISel: Make phis legalMatt Arsenault2019-02-211-0/+13
| | | | llvm-svn: 354592
* AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 typesMatt Arsenault2019-02-211-1/+3
| | | | llvm-svn: 354587
* GlobalISel: Fix fewerElementsVector for ctlz with different result typeMatt Arsenault2019-02-201-2/+2
| | | | | | Also complete the set of related operations. llvm-svn: 354480
* GlobalISel: Implement moreElementsVector for g_insert resultsMatt Arsenault2019-02-201-14/+24
| | | | llvm-svn: 354477
* GlobalISel: Implement moreElementsVector for selectMatt Arsenault2019-02-191-18/+9
| | | | llvm-svn: 354354
* GlobalISel: Implement moreElementsVector for G_EXTRACT sourceMatt Arsenault2019-02-191-0/+1
| | | | llvm-svn: 354348
* GlobalISel: Implement moreElementsVector for bit opsMatt Arsenault2019-02-191-0/+20
| | | | llvm-svn: 354345
* GlobalISel: Implement widenScalar for g_extract scalar resultsMatt Arsenault2019-02-181-2/+3
| | | | llvm-svn: 354293
* GlobalISel: Add alignment to LegalityQuery MMOsMatt Arsenault2019-02-141-9/+10
| | | | | | | This allows targets to specify the minimum alignment required for the load/store. llvm-svn: 354071
* AMDGPU/GlobalISel: Fix RegBankSelect for GEP.Matt Arsenault2019-02-141-22/+14
| | | | | | | | | | This is basically a pointer typed add, so shouldn't be any different. This was assuming everything was an SGPR, which is not true. Also cleanup legality for GEP. I don't seem to be seeing the problem the hack marking s64 as a legal pointer type the comment mentions. llvm-svn: 354067
OpenPOWER on IntegriCloud