summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/GlobalISel: Split 64-bit vector extracts during RegBankSelectMatt Arsenault2019-10-031-6/+114
| | | | | | | | Register indexing 64-bit elements is possible on the SALU, but not the VALU. Handle splitting this into two 32-bit indexes. Extend waterfall loop handling to allow moving a range of instructions. llvm-svn: 373638
* AMDGPU/GlobalISel: Allow VGPR to index SGPR registerMatt Arsenault2019-10-031-3/+2
| | | | | | | | We can still do a waterfall loop over the index if using a VGPR to index an SGPR. The result will still be a VGPR, but we can avoid the wide copy of the source register to a VGPR. llvm-svn: 373637
* AMDGPU/GlobalISel: Add some more tests for G_INSERT legalizationMatt Arsenault2019-10-031-0/+168
| | | | llvm-svn: 373636
* AMDGPU/GlobalISel: Fix mutationIsSane assert v8s8 andMatt Arsenault2019-10-031-0/+166
| | | | | | This would try to do FewerElements to v9s8 llvm-svn: 373635
* AMDGPU/GlobalISel: Expand G_BITCAST legalityMatt Arsenault2019-10-031-0/+102
| | | | llvm-svn: 373567
* [AMDGPU] Extend buffer intrinsics with swizzlingPiotr Sobczak2019-10-025-131/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Extend cachepolicy operand in the new VMEM buffer intrinsics to supply information whether the buffer data is swizzled. Also, propagate this information to MIR. Intrinsics updated: int_amdgcn_raw_buffer_load int_amdgcn_raw_buffer_load_format int_amdgcn_raw_buffer_store int_amdgcn_raw_buffer_store_format int_amdgcn_raw_tbuffer_load int_amdgcn_raw_tbuffer_store int_amdgcn_struct_buffer_load int_amdgcn_struct_buffer_load_format int_amdgcn_struct_buffer_store int_amdgcn_struct_buffer_store_format int_amdgcn_struct_tbuffer_load int_amdgcn_struct_tbuffer_store Furthermore, disable merging of VMEM buffer instructions in SI Load/Store optimizer, if the "swizzled" bit on the instruction is on. The default value of the bit is 0, meaning that data in buffer is linear and buffer instructions can be merged. There is no difference in the generated code with this commit. However, in the future it will be expected that front-ends use buffer intrinsics with correct "swizzled" bit set. Reviewers: arsenm, nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68200 llvm-svn: 373491
* AMDGPU/GlobalISel: Assume VGPR for G_FRAME_INDEXMatt Arsenault2019-10-021-1/+1
| | | | | | | | | In principle this should behave as any other constant. However eliminateFrameIndex currently assumes a VALU use and uses a vector shift. Work around this by selecting to VGPR for now until eliminateFrameIndex is fixed. llvm-svn: 373415
* AMDGPU/GlobalISel: Private loads always use VGPRsMatt Arsenault2019-10-021-0/+17
| | | | llvm-svn: 373414
* AMDGPU/GlobalISel: Legalize 1024-bit G_BUILD_VECTORMatt Arsenault2019-10-022-40/+155
| | | | | | This will be needed to support AGPR operations. llvm-svn: 373413
* AMDGPU/GlobalISel: Fix RegBankSelect for 1024-bit valuesMatt Arsenault2019-10-021-0/+28
| | | | llvm-svn: 373412
* AMDGPU/GlobalISel: Increase max legal size to 1024Matt Arsenault2019-10-018-84/+440
| | | | | | | | There are 1024 bit register classes defined for AGPRs. Additionally OpenCL defines vectors up to 16 x i64, and this helps those tests legalize. llvm-svn: 373350
* Revert "GlobalISel: Handle llvm.read_register"Dmitri Gribenko2019-10-011-2/+0
| | | | | | | | This reverts commit r373294. It broke Clang's CodeGen/arm64-microsoft-status-reg.cpp: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483 llvm-svn: 373310
* AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFPMatt Arsenault2019-10-012-44/+515
| | | | llvm-svn: 373298
* AMDGPU/GlobalISel: Add support for init.exec intrinsicsMatt Arsenault2019-10-012-0/+4
| | | | | | | TThe existing wave32 behavior seems broken and incomplete, but this reproduces it. llvm-svn: 373296
* GlobalISel: Handle llvm.read_registerMatt Arsenault2019-10-011-0/+2
| | | | | | | | | | | | | SelectionDAG has a bunch of machinery to defer this to selection time for some reason. Just directly emit a copy during IRTranslator. The x86 usage does somewhat questionably check hasFP, which could depend on the whole function being at minimum translated. This does lose the convergent bit if the callsite had it, which may be a problem. We also lose that in general for intrinsics, which may also be a problem. llvm-svn: 373294
* AMDGPU/GlobalISel: Avoid creating shift of 0 in arg loweringMatt Arsenault2019-10-011-1/+1
| | | | | | | | This is sort of papering over the fact that we don't run a combiner anywhere, but avoiding creating 2 instructions in the first place is easy. llvm-svn: 373293
* AMDGPU/GlobalISel: Select G_UADDO/G_USUBOMatt Arsenault2019-10-012-0/+394
| | | | llvm-svn: 373288
* GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sourcesMatt Arsenault2019-10-012-46/+232
| | | | | | Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287
* AMDGPU/GlobalISel: Legalize G_GLOBAL_VALUEMatt Arsenault2019-10-011-0/+156
| | | | | | | Handle other cases besides LDS. Mostly a straight port of the existing handling, without the intermediate custom nodes. llvm-svn: 373286
* AMDGPU/GlobalISel: Fix select for v2s16 and/or/xorMatt Arsenault2019-09-303-45/+45
| | | | llvm-svn: 373180
* AMDGPU/GlobalISel: Allow selection of scalar min/maxMatt Arsenault2019-09-214-40/+20
| | | | | | | | | I believe all of the uniform/divergent pattern predicates are redundant and can be removed. The uniformity bit already influences the register class, and nothhing has broken when I've removed this and others. llvm-svn: 372450
* Revert r372366 "Use getTargetConstant for BLENDI, and add a test to catch it."Nico Weber2019-09-201-14/+0
| | | | | | | | | | | | | | | | | | | This reverts commit 52621307bcab2013e8833f3317cebd63a6db3885. Tests have been failing all night with [0/2] ACTION //llvm/test:check-llvm(//llvm/utils/gn/build/toolchain:unix) -- Testing: 33647 tests, 64 threads -- Testing: 0 .. 10.. UNRESOLVED: LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll (6943 of 33647) ******************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll' FAILED ******************** Test has no run line! ******************** Since there were other concerns on https://reviews.llvm.org/D67785, I'm just reverting for now. llvm-svn: 372383
* Use getTargetConstant for BLENDI, and add a test to catch it.Sterling Augustine2019-09-201-0/+14
| | | | | | | | | | | | | | | | Summary: This fixes a crasher introduced by r372338. Reviewers: echristo, arsenm Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67785 Tighten up the test case. llvm-svn: 372366
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Matt Arsenault2019-09-1919-91/+2794
| | | | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Hans Wennborg2019-09-1919-2794/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC*. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_* instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzleMatt Arsenault2019-09-191-0/+21
| | | | llvm-svn: 372297
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.formatMatt Arsenault2019-09-192-0/+833
| | | | | | | | | This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors Also reject some illegal types on other targets. llvm-svn: 372293
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.storeMatt Arsenault2019-09-192-168/+791
| | | | llvm-svn: 372292
* AMDGPU/GlobalISel: RegBankSelect struct buffer load/storeMatt Arsenault2019-09-192-0/+353
| | | | llvm-svn: 372291
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load|store}Matt Arsenault2019-09-192-0/+341
| | | | llvm-svn: 372290
* AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsicsMatt Arsenault2019-09-192-0/+449
| | | | | | Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289
* AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9Matt Arsenault2019-09-192-36/+88
| | | | | | The scalar versions were only introduced in gfx9. llvm-svn: 372286
* GlobalISel: Don't materialize immarg arguments to intrinsicsMatt Arsenault2019-09-198-55/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encode them directly as an imm argument to G_INTRINSIC*. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
* [GlobalISel] Partially revert r371901.Amara Emerson2019-09-161-1/+1
| | | | | | | | | | | r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050
* AMDGPU/GlobalISel: Fix some broken run linesMatt Arsenault2019-09-164-8/+8
| | | | llvm-svn: 371992
* AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEILMatt Arsenault2019-09-162-0/+62
| | | | llvm-svn: 371991
* AMDGPU/GlobalISel: Remove another illegal select testMatt Arsenault2019-09-161-37/+0
| | | | llvm-svn: 371990
* AMDGPU/GlobalISel: Remove illegal select testsMatt Arsenault2019-09-161-74/+0
| | | | | | These fail in a release build. llvm-svn: 371955
* AMDGPU/GlobalISel: Select SMRD loads for more typesMatt Arsenault2019-09-161-0/+1007
| | | | llvm-svn: 371954
* AMDGPU/GlobalISel: RegBankSelect for killMatt Arsenault2019-09-161-0/+68
| | | | llvm-svn: 371953
* AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFPMatt Arsenault2019-09-162-0/+72
| | | | llvm-svn: 371952
* AMDGPU/GlobalISel: Set type on vgpr live in special argumentsMatt Arsenault2019-09-161-0/+27
| | | | | | | Fixes assertion with workitem ID intrinsics used in non-kernel functions. llvm-svn: 371951
* AMDGPU/GlobalISel: Select S16->S32 fptointMatt Arsenault2019-09-162-5/+210
| | | | llvm-svn: 371950
* AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFPMatt Arsenault2019-09-164-9/+143
| | | | llvm-svn: 371949
* AMDGPU/GlobalISel: Fix VALU s16 fnegMatt Arsenault2019-09-161-11/+9
| | | | llvm-svn: 371948
* [GlobalISel] Fix insertion point of new instructions to be after PHIs.Amara Emerson2019-09-131-1/+1
| | | | | | | | | | For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which mean that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901
* AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFPMatt Arsenault2019-09-132-0/+34
| | | | llvm-svn: 371811
* AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.elseMatt Arsenault2019-09-132-0/+35
| | | | llvm-svn: 371808
* AMDGPU/GlobalISel: Select 16-bit VALU bit opsMatt Arsenault2019-09-133-18/+15
| | | | llvm-svn: 371807
* AMDGPU/GlobalISel: Legalize G_FFLOORMatt Arsenault2019-09-133-0/+639
| | | | llvm-svn: 371803
OpenPOWER on IntegriCloud