path: root/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
Commit message | Author | Age | Files | Lines
...
* AMDGPU/GlobalISel: RegBankSelect for kill | Matt Arsenault | 2019-09-16 | 1 | -0/+4
| | | | llvm-svn: 371953
* AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.else | Matt Arsenault | 2019-09-13 | 1 | -0/+7
| | | | llvm-svn: 371808
* AMDGPU/GlobalISel: Legalize G_FFLOOR | Matt Arsenault | 2019-09-13 | 1 | -0/+1
| | | | llvm-svn: 371803
* AMDGPU/GlobalISel: Legalize G_FMAD | Matt Arsenault | 2019-09-13 | 1 | -0/+1
| | | | | | | | | | | | | | | Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
* AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD | Matt Arsenault | 2019-09-10 | 1 | -2/+8
| | | | llvm-svn: 371536
* AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR v2s16 | Matt Arsenault | 2019-09-09 | 1 | -26/+51
| | | | | | | | Handle it the same way as G_BUILD_VECTOR_TRUNC. Arguably only G_BUILD_VECTOR_TRUNC should be legal for this, but G_BUILD_VECTOR will probably be more convenient in most cases. llvm-svn: 371440
* AMDGPU/GlobalISel: Implement LDS G_GLOBAL_VALUE | Matt Arsenault | 2019-09-09 | 1 | -0/+1
| | | | | | Handle the simple case that lowers to a constant. llvm-svn: 371424
* AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR_TRUNC | Matt Arsenault | 2019-09-09 | 1 | -0/+57
| | | | | | | | | | Treat this as legal on gfx9 since it can use S_PACK_* instructions for this. This isn't used by anything yet. The same will probably apply to 16-bit G_BUILD_VECTOR without the trunc. llvm-svn: 371423
* AMDGPU/GlobalISel: Fix RegBankSelect for unaligned, uniform constant loads | Matt Arsenault | 2019-09-09 | 1 | -4/+5
| | | | llvm-svn: 371416
* AMDGPU/GlobalISel: Fix regbankselect for uniform extloads | Matt Arsenault | 2019-09-09 | 1 | -4/+4
| | | | | | There are no scalar extloads. llvm-svn: 371414
* AMDGPU/GlobalISel: Fix reg bank for uniform LDS loads | Matt Arsenault | 2019-09-09 | 1 | -8/+14
| | | | | | | The pointer is always a VGPR. Also fix hardcoding the pointer size to 64. llvm-svn: 371411
* GlobalISel: Support physical register inputs in patterns | Matt Arsenault | 2019-09-06 | 1 | -5/+7
| | | | llvm-svn: 371253
* AMDGPU/GlobalISel: Select G_BITREVERSE | Matt Arsenault | 2019-09-04 | 1 | -0/+1
| | | | llvm-svn: 370980
* GlobalISel/TableGen: Handle setcc patterns | Matt Arsenault | 2019-08-29 | 1 | -4/+3
| | | | | | | | | | | This is a special case because one node maps to two different G_ instructions, and the operand order is changed. This mostly enables G_FCMP for AMDGPU. G_ICMP is still manually selected for now since it has the SALU and VALU complication to deal with. llvm-svn: 370280
* Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM | Daniel Sanders | 2019-08-15 | 1 | -5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
* AMDGPU/GlobalISel: Alternative mappings for constants | Matt Arsenault | 2019-08-05 | 1 | -1/+13
| | | | | | | | Without context we assume SGPR. Allowing VGPR constants theoretically helps avoid a copy. This seems to not actually work now, and the choice isn't based on the use bank. llvm-svn: 367871
* AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD | Matt Arsenault | 2019-08-01 | 1 | -0/+1
| | | | llvm-svn: 367509
* [AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization. | Austin Kerbow | 2019-07-30 | 1 | -1/+0
| | | | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344
* AMDGPU/GlobalISel: Selection for fminnum/fmaxnum | Matt Arsenault | 2019-07-19 | 1 | -2/+4
| | | | | | | v2f16 case doesn't work yet because the VOP3P complex patterns haven't been ported yet. llvm-svn: 366585
* AMDGPU/GlobalISel: Allow scalar s1 and/or/xor | Matt Arsenault | 2019-07-15 | 1 | -6/+91
| | | | | | | | If a 1-bit value is in a 32-bit VGPR, the scalar opcodes set SCC to whether the result is 0. If the inputs are SCC, these can be copied to a 32-bit SGPR to produce an SCC result. llvm-svn: 366125
* AMDGPU/GlobalISel: Handle llvm.amdgcn.if.break | Matt Arsenault | 2019-07-15 | 1 | -0/+7
| | | | llvm-svn: 366102
* AMDGPU/GlobalISel: Select llvm.amdgcn.end.cf | Matt Arsenault | 2019-07-15 | 1 | -0/+5
| | | | llvm-svn: 366099
* AMDGPU/GlobalISel: RegBankSelect for G_CONCAT_VECTORS | Matt Arsenault | 2019-07-15 | 1 | -1/+2
| | | | llvm-svn: 366086
* AMDGPU/GlobalISel: Add support for wide loads >= 256-bits | Tom Stellard | 2019-07-10 | 1 | -36/+136
| | | | | | | | | | | | | | | | | | Summary: This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64> Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57399 llvm-svn: 365586
* AMDGPU/GlobalISel: Improve regbankselect for icmp s16 | Matt Arsenault | 2019-07-09 | 1 | -5/+10
| | | | | | Account for 64-bit scalar eq/ne when available. llvm-svn: 365487
* AMDGPU/GlobalISel: Handle 16-bit SALU min/max | Matt Arsenault | 2019-07-01 | 1 | -5/+19
| | | | | | | | | This needs to be extended to s32, and expanded into cmp+select. This is relying on the fact that widenScalar happens to leave the instruction in place, but this isn't a guaranteed property of LegalizerHelper. llvm-svn: 364831
* AMDGPU/GlobalISel: Lower SALU min/max to cmp+select | Matt Arsenault | 2019-07-01 | 1 | -6/+41
| | | | | | | Use a change observer to apply a register bank to the newly created intermediate result register. llvm-svn: 364830
* AMDGPU/GlobalISel: Legalize s16 add/sub/mul | Matt Arsenault | 2019-07-01 | 1 | -1/+73
| | | | | | | If this is scalar, promote to s32. Use a new observer class to assign the register bank of newly created registers. llvm-svn: 364827
* AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT | Matt Arsenault | 2019-07-01 | 1 | -9/+20
| | | | | | | | | The condition register bank must be scc or vcc so that a copy will be inserted, which will be lowered to a compare. Currently greedy unnecessarily forces using a VCC select. llvm-svn: 364825
* AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt | Matt Arsenault | 2019-07-01 | 1 | -3/+29
| | | | llvm-svn: 364819
* AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap | Matt Arsenault | 2019-07-01 | 1 | -2/+31
| | | | llvm-svn: 364811
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane | Matt Arsenault | 2019-07-01 | 1 | -5/+58
| | | | llvm-svn: 364808
* AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane | Matt Arsenault | 2019-07-01 | 1 | -0/+75
| | | | llvm-svn: 364801
* AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE | Matt Arsenault | 2019-07-01 | 1 | -0/+1
| | | | llvm-svn: 364768
* AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR | Matt Arsenault | 2019-07-01 | 1 | -1/+2
| | | | llvm-svn: 364767
* AMDGPU/GlobalISel: RegBankSelect for WWM/WQM | Matt Arsenault | 2019-07-01 | 1 | -0/+2
| | | | llvm-svn: 364763
* AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote | Matt Arsenault | 2019-07-01 | 1 | -1/+1
| | | | llvm-svn: 364762
* AMDGPU/GlobalISel: RegBankSelect for update.dpp | Matt Arsenault | 2019-06-29 | 1 | -0/+1
| | | | llvm-svn: 364701
* AMDGPU/GlobalISel: RegBankSelect for atomic.inc/atomic.dec | Matt Arsenault | 2019-06-29 | 1 | -0/+2
| | | | llvm-svn: 364699
* AMDGPU/GlobalISel: RegBankSelect for some DS intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+17
| | | | llvm-svn: 364698
* AMDGPU/GlobalISel: RegBankSelect for some easy intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+48
| | | | llvm-svn: 364697
* AMDGPU/GlobalISel: RegBankSelect for icmp/fcmp intrinsics | Matt Arsenault | 2019-06-29 | 1 | -0/+12
| | | | llvm-svn: 364696
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.fmas | Matt Arsenault | 2019-06-29 | 1 | -0/+1
| | | | llvm-svn: 364695
* AMDGPU/GlobalISel: RegBankSelect for some simple leaf intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+11
| | | | llvm-svn: 364694
* AMDGPU/GlobalISel: Convert to using Register | Matt Arsenault | 2019-06-28 | 1 | -37/+37
| | | | llvm-svn: 364616
* AMDGPU/GlobalISel: Fix regbankselect for amdgcn.class | Matt Arsenault | 2019-06-25 | 1 | -4/+8
| | | | llvm-svn: 364262
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.class | Matt Arsenault | 2019-06-24 | 1 | -0/+9
| | | | llvm-svn: 364214
* AMDGPU/GlobalISel: Split VALU s64 G_ZEXT/G_SEXT in RegBankSelect | Matt Arsenault | 2019-06-24 | 1 | -13/+57
| | | | | | | | | | | Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need to extend to the 32-bit half, and then to 64. I'm not sure what the line should be between what RegBankSelect handles, and what instruction select does, but for now I'm erring on the side of RegBankSelect for future post-RBS combines. llvm-svn: 364212
* GlobalISel: Remove unsigned variant of SrcOp | Matt Arsenault | 2019-06-24 | 1 | -2/+2
| | | | | | | | | Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
* CodeGen: Introduce a class for registers | Matt Arsenault | 2019-06-24 | 1 | -14/+14
| | | | | | | | | Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191