path: root/llvm/lib/Target/AMDGPU
Commit message | Author | Age | Files | Lines
* AMDGPU: Correct properties for adjcallstack* pseudos | Matt Arsenault | 2019-07-01 | 1 | -0/+4
  These should be SALU writes, and these are lowered to instructions that def SCC.
  llvm-svn: 364859
* AMDGPU/GlobalISel: Handle more input argument intrinsics | Matt Arsenault | 2019-07-01 | 2 | -41/+72
  llvm-svn: 364836
* AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics | Matt Arsenault | 2019-07-01 | 3 | -24/+48
  llvm-svn: 364835
* AMDGPU/GlobalISel: Legalize workgroup ID intrinsics | Matt Arsenault | 2019-07-01 | 2 | -0/+36
  llvm-svn: 364834
* AMDGPU/GlobalISel: Legalize workitem ID intrinsics | Matt Arsenault | 2019-07-01 | 3 | -0/+127
  Tests don't cover the masked input path since non-kernel arguments aren't lowered yet. Test is copied directly from the existing test, with 2 additions.
  llvm-svn: 364833
* AMDGPU/GlobalISel: Custom lower control flow intrinsics | Matt Arsenault | 2019-07-01 | 2 | -0/+68
  Replace the brcond for the 2 cases that act as branches. For now follow how the current system works, although I think we can eventually get rid of the pseudos.
  llvm-svn: 364832
* AMDGPU/GlobalISel: Handle 16-bit SALU min/max | Matt Arsenault | 2019-07-01 | 1 | -5/+19
  This needs to be extended to s32, and expanded into cmp+select. This is relying on the fact that widenScalar happens to leave the instruction in place, but this isn't a guaranteed property of LegalizerHelper.
  llvm-svn: 364831
* AMDGPU/GlobalISel: Lower SALU min/max to cmp+select | Matt Arsenault | 2019-07-01 | 1 | -6/+41
  Use a change observer to apply a register bank to the newly created intermediate result register.
  llvm-svn: 364830
* AMDGPU/GlobalISel: Legalize s16 add/sub/mul | Matt Arsenault | 2019-07-01 | 2 | -2/+85
  If this is scalar, promote to s32. Use a new observer class to assign the register bank of newly created registers.
  llvm-svn: 364827
* AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT | Matt Arsenault | 2019-07-01 | 1 | -9/+20
  The condition register bank must be scc or vcc so that a copy will be inserted, which will be lowered to a compare. Currently greedy unnecessarily forces using a VCC select.
  llvm-svn: 364825
* AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt | Matt Arsenault | 2019-07-01 | 1 | -3/+29
  llvm-svn: 364819
* AMDGPU/GlobalISel: Legalize s16 fcmp | Matt Arsenault | 2019-07-01 | 1 | -1/+9
  llvm-svn: 364817
* AMDGPU/GFX10: implement ds_ordered_count changes | Nicolai Haehnle | 2019-07-01 | 1 | -1/+22
  Summary: ds_ordered_count can now simultaneously operate on up to 4 dwords in a single instruction, which are taken from (and returned to) lanes 0..3 of a single VGPR.
  Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276
  Reviewers: mareko, arsenm, rampitec
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63716
  llvm-svn: 364815
* AMDGPU: Support GDS atomics | Nicolai Haehnle | 2019-07-01 | 8 | -54/+97
  Summary: Original patch by Marek Olšák
  Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab
  Reviewers: mareko, arsenm, rampitec
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63452
  llvm-svn: 364814
* AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap | Matt Arsenault | 2019-07-01 | 1 | -2/+31
  llvm-svn: 364811
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane | Matt Arsenault | 2019-07-01 | 1 | -5/+58
  llvm-svn: 364808
* AMDGPU/GlobalISel: Fail instead of assert when selecting loads | Matt Arsenault | 2019-07-01 | 1 | -5/+11
  llvm-svn: 364807
* AMDGPU/GlobalISel: Complete implementation of G_GEP | Matt Arsenault | 2019-07-01 | 3 | -53/+79
  Also works around tablegen defect in selecting add with unused carry, but if we have to manually select GEP, might as well handle add manually.
  llvm-svn: 364806
* AMDGPU/GlobalISel: Select G_PHI | Matt Arsenault | 2019-07-01 | 2 | -0/+41
  llvm-svn: 364805
* AMDGPU/GlobalISel: Try to select VOP3 form of add | Matt Arsenault | 2019-07-01 | 1 | -0/+20
  There are several things broken, but at least emit the right thing for gfx9. The import of the pattern with the unused carry out seems to not work. Needs a special class for clamp, because OperandWithDefaultOps doesn't really work.
  llvm-svn: 364804
* AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane | Matt Arsenault | 2019-07-01 | 2 | -0/+82
  llvm-svn: 364801
* AMDGPU/GlobalISel: Implement select for 32-bit G_ADD | Tom Stellard | 2019-07-01 | 2 | -2/+7
  Reviewers: arsenm
  Reviewed By: arsenm
  Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58804
  llvm-svn: 364797
* AMDGPU/GlobalISel: Select G_BRCOND for vcc | Matt Arsenault | 2019-07-01 | 2 | -25/+44
  llvm-svn: 364795
* AMDGPU/GlobalISel: Select G_FRAME_INDEX | Matt Arsenault | 2019-07-01 | 2 | -0/+19
  llvm-svn: 364789
* AMDGPU/GFX10: fix scratch resource descriptor | Nicolai Haehnle | 2019-07-01 | 1 | -2/+2
  Summary: The stride should depend on the wave size, not the hardware generation. Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be relevant.
  Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6
  Reviewers: arsenm, rampitec, mareko
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63808
  llvm-svn: 364788
* AMDGPU/GlobalISel: Make s16 select legal | Matt Arsenault | 2019-07-01 | 2 | -7/+9
  This is easy to handle and avoids legalization artifacts which are likely to obscure combines.
  llvm-svn: 364787
* AMDGPU/GlobalISel: Select G_BRCOND for scc conditions | Matt Arsenault | 2019-07-01 | 2 | -0/+34
  llvm-svn: 364786
* AMDGPU/GlobalISel: Tolerate copies with no type set | Matt Arsenault | 2019-07-01 | 1 | -3/+6
  isVCC has the same bug, but isn't used in a context where it can cause a problem.
  llvm-svn: 364784
* AMDGPU/GlobalISel: Select src modifiers | Matt Arsenault | 2019-07-01 | 2 | -6/+43
  llvm-svn: 364782
* AMDGPU: Convert some places to Register | Matt Arsenault | 2019-07-01 | 2 | -9/+10
  llvm-svn: 364769
* AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE | Matt Arsenault | 2019-07-01 | 1 | -0/+1
  llvm-svn: 364768
* AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR | Matt Arsenault | 2019-07-01 | 1 | -1/+2
  llvm-svn: 364767
* AMDGPU/GlobalISel: Fail on store to 32-bit address space | Matt Arsenault | 2019-07-01 | 1 | -0/+6
  llvm-svn: 364766
* AMDGPU/GlobalISel: Improve icmp selection coverage. | Matt Arsenault | 2019-07-01 | 2 | -13/+38
  Select s64 eq/ne scalar icmp.
  llvm-svn: 364765
* AMDGPU/GlobalISel: RegBankSelect for WWM/WQM | Matt Arsenault | 2019-07-01 | 1 | -0/+2
  llvm-svn: 364763
* AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote | Matt Arsenault | 2019-07-01 | 1 | -1/+1
  llvm-svn: 364762
* AMDGPU/GlobalISel: Fix scc->vcc copy handling | Matt Arsenault | 2019-07-01 | 2 | -13/+23
  This was checking the size of the register with the value of the size, which happens to be exec. Also fix assuming VCC is 64-bit to fix wave32.
  Also remove some untested handling for physical registers which is skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical copy source. I'm not sure if this should be trying to handle this special case instead of dealing with this in copyPhysReg.
  llvm-svn: 364761
* AMDGPU/GlobalISel: Use and instead of BFE with inline immediate | Matt Arsenault | 2019-07-01 | 1 | -6/+29
  Zext from s1 is the only case where this should do anything with the current legal extensions.
  llvm-svn: 364760
* [AMDGPU] Call isLoopExiting for blocks in the loop. | Florian Hahn | 2019-07-01 | 1 | -2/+4
  isLoopExiting should only be called for blocks in the loop. A follow up patch makes this requirement an assertion. I've updated the usage here, to only match for actual exit blocks. Previously, it would also match blocks not in the loop.
  Reviewers: arsenm, nhaehnle
  Reviewed By: nhaehnle
  Differential Revision: https://reviews.llvm.org/D63980
  llvm-svn: 364750
* AMDGPU/GlobalISel: RegBankSelect for update.dpp | Matt Arsenault | 2019-06-29 | 1 | -0/+1
  llvm-svn: 364701
* AMDGPU/GlobalISel: RegBankSelect for atomic.inc/atomic.dec | Matt Arsenault | 2019-06-29 | 1 | -0/+2
  llvm-svn: 364699
* AMDGPU/GlobalISel: RegBankSelect for some DS intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+17
  llvm-svn: 364698
* AMDGPU/GlobalISel: RegBankSelect for some easy intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+48
  llvm-svn: 364697
* AMDGPU/GlobalISel: RegBankSelect for icmp/fcmp intrinsics | Matt Arsenault | 2019-06-29 | 1 | -0/+12
  llvm-svn: 364696
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.fmas | Matt Arsenault | 2019-06-29 | 1 | -0/+1
  llvm-svn: 364695
* AMDGPU/GlobalISel: RegBankSelect for some simple leaf intrinsics | Matt Arsenault | 2019-06-29 | 1 | -1/+11
  llvm-svn: 364694
* [AMDGPU][MC] Fix 2 for sanitizer failure in 364645 | Dmitry Preobrazhensky | 2019-06-28 | 2 | -6/+6
  llvm-svn: 364656
* [AMDGPU][MC] Fix for sanitizer failure in 364645 | Dmitry Preobrazhensky | 2019-06-28 | 1 | -4/+10
  llvm-svn: 364651
* [AMDGPU][MC] Enabled constant expressions as operands of sendmsg | Dmitry Preobrazhensky | 2019-06-28 | 5 | -210/+266
  See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820
  Reviewers: artem.tamazov, arsenm
  Differential Revision: https://reviews.llvm.org/D62735
  llvm-svn: 364645
* [AMDGPU] Packed thread ids in function call ABI | Stanislav Mekhanoshin | 2019-06-28 | 4 | -22/+132
  Differential Revision: https://reviews.llvm.org/D63851
  llvm-svn: 364619