path: root/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
Commit message | Author | Age | Files | Lines
* AMDGPU/GlobalISel: Legalize fast unsafe FDIV | Austin Kerbow | 2019-10-21 | 1 | -4/+84
| Reviewers: arsenm
| Reviewed By: arsenm
| Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69231
| llvm-svn: 375460
* GlobalISel: Implement lower for G_SADDO/G_SSUBO | Matt Arsenault | 2019-10-16 | 1 | -1/+4
| | | | | | | Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042
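For readers unfamiliar with how a signed add or subtract with an overflow flag can be expanded into plain arithmetic and compares, the idea can be sketched in standalone C++. This is only an illustration, assuming the usual expansion (for an add, the result is less than the LHS exactly when the RHS is negative, otherwise overflow occurred); it is not the code from this commit, and the names `saddo`, `ssubo`, and `OverflowResult` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical result pair: the wrapped value plus an overflow flag.
struct OverflowResult {
  std::int32_t value;
  bool overflow;
};

// Signed add with overflow flag. The sum is computed with unsigned
// wraparound to avoid signed-overflow undefined behaviour in C++.
constexpr OverflowResult saddo(std::int32_t lhs, std::int32_t rhs) {
  const std::int32_t sum = static_cast<std::int32_t>(
      static_cast<std::uint32_t>(lhs) + static_cast<std::uint32_t>(rhs));
  // Without overflow, sum < lhs holds exactly when rhs is negative.
  return {sum, (sum < lhs) != (rhs < 0)};
}

// Signed subtract with overflow flag, same idea with the sign test flipped.
constexpr OverflowResult ssubo(std::int32_t lhs, std::int32_t rhs) {
  const std::int32_t diff = static_cast<std::int32_t>(
      static_cast<std::uint32_t>(lhs) - static_cast<std::uint32_t>(rhs));
  // Without overflow, diff < lhs holds exactly when rhs is positive.
  return {diff, (diff < lhs) != (rhs > 0)};
}
```

The sketch fixes the width at 32 bits for brevity; a real lowering would operate on whatever scalar type the instruction carries.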
* GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR | Matt Arsenault | 2019-10-09 | 1 | -1/+10
| | | | | | Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252
* AMDGPU/GlobalISel: Clamp G_SITOFP/G_UITOFP sources | Matt Arsenault | 2019-10-07 | 1 | -3/+6
| | | | llvm-svn: 373989
* GlobalISel: Partially implement lower for G_INSERT | Matt Arsenault | 2019-10-07 | 1 | -7/+3
| | | | llvm-svn: 373946
* AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources | Matt Arsenault | 2019-10-07 | 1 | -18/+29
| | | | | | Continue making a mess of merge/unmerge legality. llvm-svn: 373942
* AMDGPU/GlobalISel: Lower G_ATOMIC_CMPXCHG_WITH_SUCCESS | Matt Arsenault | 2019-10-06 | 1 | -0/+3
| | | | llvm-svn: 373839
* GlobalISel: Partially implement lower for G_EXTRACT | Matt Arsenault | 2019-10-06 | 1 | -1/+13
| | | | | | Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838
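The "shift and truncate" idea can be illustrated outside of LLVM with ordinary integers. The following is a minimal sketch, assuming a 64-bit source and a 16-bit extracted field; the helper name `extract16` is hypothetical and the commit itself works on generic machine IR, not on C++ integers.

```cpp
#include <cstdint>

// Extract the 16-bit field that starts at bit `offset` of a 64-bit source.
// Step 1: logical shift right so the field sits in the low bits.
// Step 2: truncate to the destination width.
// Assumes offset < 64; larger shifts are undefined in C++.
constexpr std::uint16_t extract16(std::uint64_t src, unsigned offset) {
  return static_cast<std::uint16_t>(src >> offset);
}

// Compile-time check of the sketch: bits [16, 32) of the source.
static_assert(extract16(0x1122334455667788ULL, 16) == 0x5566,
              "shift-and-truncate should select bits 16..31");
```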
* AMDGPU/GlobalISel: Fix using wrong addrspace for aperture | Matt Arsenault | 2019-10-04 | 1 | -1/+3
| | | | | | | This was always passing the destination flat address space, when it should be picking between the two valid source options. llvm-svn: 373716
* AMDGPU/GlobalISel: Fix mutationIsSane assert v8s8 and | Matt Arsenault | 2019-10-03 | 1 | -2/+3
| | | | | | This would try to do FewerElements to v9s8 llvm-svn: 373635
* AMDGPU/GlobalISel: Expand G_BITCAST legality | Matt Arsenault | 2019-10-03 | 1 | -4/+1
| | | | llvm-svn: 373567
* AMDGPU/GlobalISel: Use getIntrinsicID helper | Matt Arsenault | 2019-10-02 | 1 | -1/+1
| | | | llvm-svn: 373417
* AMDGPU/GlobalISel: Legalize 1024-bit G_BUILD_VECTOR | Matt Arsenault | 2019-10-02 | 1 | -4/+6
| | | | | | This will be needed to support AGPR operations. llvm-svn: 373413
* AMDGPU/GlobalISel: Increase max legal size to 1024 | Matt Arsenault | 2019-10-01 | 1 | -8/+8
| | | | | | | | There are 1024 bit register classes defined for AGPRs. Additionally OpenCL defines vectors up to 16 x i64, and this helps those tests legalize. llvm-svn: 373350
* AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFP | Matt Arsenault | 2019-10-01 | 1 | -2/+2
| | | | llvm-svn: 373298
* AMDGPU/GlobalISel: Avoid creating shift of 0 in arg lowering | Matt Arsenault | 2019-10-01 | 1 | -3/+8
| | | | | | | | This is sort of papering over the fact that we don't run a combiner anywhere, but avoiding creating 2 instructions in the first place is easy. llvm-svn: 373293
* AMDGPU/GlobalISel: Select G_UADDO/G_USUBO | Matt Arsenault | 2019-10-01 | 1 | -1/+2
| | | | llvm-svn: 373288
* GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources | Matt Arsenault | 2019-10-01 | 1 | -3/+9
| | | | | | Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287
* AMDGPU/GlobalISel: Legalize G_GLOBAL_VALUE | Matt Arsenault | 2019-10-01 | 1 | -8/+98
| | | | | | | Handle other cases besides LDS. Mostly a straight port of the existing handling, without the intermediate custom nodes. llvm-svn: 373286
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Matt Arsenault | 2019-09-19 | 1 | -0/+60
| | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297). This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Hans Wennborg | 2019-09-19 | 1 | -60/+0
| This broke the Chromium build, causing it to fail with e.g.
|
|   fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
|
| See llvm-commits thread of r372285 for details.
|
| This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit.
|
| > Encode them directly as an imm argument to G_INTRINSIC*.
| >
| > Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.
| >
| > This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.
| >
| > SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.
| >
| > Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.
| >
| > The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.
| >
| > This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.
|
| llvm-svn: 372314
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format | Matt Arsenault | 2019-09-19 | 1 | -3/+29
| | | | | | | This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors. Also reject some illegal types on other targets. llvm-svn: 372293
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store | Matt Arsenault | 2019-09-19 | 1 | -0/+34
| | | | llvm-svn: 372292
* AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP | Matt Arsenault | 2019-09-16 | 1 | -1/+2
| | | | llvm-svn: 371952
* AMDGPU/GlobalISel: Select S16->S32 fptoint | Matt Arsenault | 2019-09-16 | 1 | -1/+1
| | | | llvm-svn: 371950
* AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFP | Matt Arsenault | 2019-09-13 | 1 | -1/+1
| | | | llvm-svn: 371811
* AMDGPU/GlobalISel: Legalize G_FFLOOR | Matt Arsenault | 2019-09-13 | 1 | -2/+2
| | | | llvm-svn: 371803
* AMDGPU/GlobalISel: Legalize G_FMAD | Matt Arsenault | 2019-09-13 | 1 | -0/+32
| | | | | | | | | | | | | | | Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restrictive. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
* AMDGPU/GlobalISel: Select G_FABS/G_FNEG | Matt Arsenault | 2019-09-10 | 1 | -4/+10
| | | | | | | | | f64 doesn't work yet because tablegen currently doesn't handle REG_SEQUENCE. This does regress some multi-use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency. llvm-svn: 371540
* AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD | Matt Arsenault | 2019-09-10 | 1 | -1/+3
| | | | llvm-svn: 371536
* AMDGPU/GlobalISel: Legalize constant 32-bit loads | Matt Arsenault | 2019-09-10 | 1 | -0/+15
| | | | | | | Legalize by casting to a 64-bit constant address. This isn't how the DAG implements it, but it should. llvm-svn: 371535
* AMDGPU/GlobalISel: First pass at attempting to legalize load/stores | Matt Arsenault | 2019-09-10 | 1 | -69/+256
| | | | | | | | There's still a lot more to do, but this handles decomposing due to alignment. I've gotten it to the point where nothing crashes or infinite loops the legalizer. llvm-svn: 371533
* AMDGPU/GlobalISel: Fix insert point when lowering fminnum/fmaxnum | Matt Arsenault | 2019-09-09 | 1 | -1/+1
| | | | llvm-svn: 371471
* AMDGPU/GlobalISel: Rename MIRBuilder to B. NFC | Austin Kerbow | 2019-09-09 | 1 | -49/+49
| Reviewers: arsenm
| Reviewed By: arsenm
| Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D67374
| llvm-svn: 371467
* AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR v2s16 | Matt Arsenault | 2019-09-09 | 1 | -8/+13
| | | | | | | | Handle it the same way as G_BUILD_VECTOR_TRUNC. Arguably only G_BUILD_VECTOR_TRUNC should be legal for this, but G_BUILD_VECTOR will probably be more convenient in most cases. llvm-svn: 371440
* AMDGPU/GlobalISel: Implement LDS G_GLOBAL_VALUE | Matt Arsenault | 2019-09-09 | 1 | -0/+42
| | | | | | Handle the simple case that lowers to a constant. llvm-svn: 371424
* AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR_TRUNC | Matt Arsenault | 2019-09-09 | 1 | -0/+9
| | | | | | | | | | Treat this as legal on gfx9 since it can use S_PACK_* instructions for this. This isn't used by anything yet. The same will probably apply to 16-bit G_BUILD_VECTOR without the trunc. llvm-svn: 371423
* AMDGPU/GlobalISel: Select G_PTR_MASK | Matt Arsenault | 2019-09-09 | 1 | -0/+4
| | | | llvm-svn: 371412
* AMDGPU/GlobalISel: Legalize wavefrontsize intrinsic | Matt Arsenault | 2019-09-09 | 1 | -0/+6
| | | | llvm-svn: 371407
* AMDGPU: Add intrinsics for address space identification | Matt Arsenault | 2019-09-05 | 1 | -0/+16
| | | | | | | The library currently uses ptrtoint and directly checks the queue ptr for this, which counts as a pointer capture. llvm-svn: 371009
* AMDGPU/GlobalISel: Restore insert point when getting aperture | Matt Arsenault | 2019-09-05 | 1 | -0/+6
| | | | | | Avoids SSA violations in a future patch. llvm-svn: 371008
* AMDGPU/GlobalISel: Fix placeholder value used for addrspacecast | Matt Arsenault | 2019-09-05 | 1 | -4/+6
| | | | llvm-svn: 371007
* GlobalISel: Add basic legalization for G_BITREVERSE | Matt Arsenault | 2019-09-04 | 1 | -1/+1
| | | | llvm-svn: 370979
* AMDGPU/GlobalISel: Make 16-bit constants legal | Matt Arsenault | 2019-09-04 | 1 | -11/+5
| | | | | | This is mostly for the benefit of patterns which use 16-bit constants. llvm-svn: 370921
* Fix the build for MSVC builds using M_PI | Reid Kleckner | 2019-08-29 | 1 | -0/+7
| | | | llvm-svn: 370405
* AMDGPU/GlobalISel: Legalize sin/cos | Matt Arsenault | 2019-08-29 | 1 | -0/+41
| | | | llvm-svn: 370402
* AMDGPU/GlobalISel: Implement addrspacecast for 32-bit constant addrspace | Matt Arsenault | 2019-08-28 | 1 | -8/+31
| | | | llvm-svn: 370140
* GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources | Matt Arsenault | 2019-08-21 | 1 | -1/+1
| | | | | | | | This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547
* GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES | Matt Arsenault | 2019-08-13 | 1 | -1/+10
| | | | | | Odd sized vectors aren't handled yet. llvm-svn: 368713
* GlobalISel: Implement lower for G_SHUFFLE_VECTOR | Matt Arsenault | 2019-08-13 | 1 | -0/+3
| | | | llvm-svn: 368709