summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU/GlobalISel: Select llvm.amdgcn.ds.ordered.{add|swap}Matt Arsenault2020-01-131-0/+1
|
* AMDGPU/GlobalISel: Select G_EXTRACT_VECTOR_ELTMatt Arsenault2020-01-091-0/+1
| | | | | Doesn't try to do the fold into the base register of an add of a constant in the index like the DAG path does.
* TableGen/GlobalISel: Add way for SDNodeXForm to work on timmMatt Arsenault2020-01-091-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation assumes there is an instruction associated with the transform, but this is not the case for timm/TargetConstant/immarg values. These transforms should directly operate on a specific MachineOperand in the source instruction. TableGen would assert if you attempted to define an equivalent GISDNodeXFormEquiv using timm when it failed to find the instruction matcher. Specially recognize SDNodeXForms on timm, and pass the operand index to the render function. Ideally this would be a separate render function type that looks like void renderFoo(MachineInstrBuilder, const MachineOperand&), but this proved to be somewhat mechanically painful. Add an optional operand index which will only be passed if the transform should only look at the one source operand. Theoretically it would also be possible to only ever pass the MachineOperand, and the existing renderers would check the parent. I think that would be somewhat ugly for the standard usage which may want to inspect other operands, and I also think MachineOperand should eventually not carry a pointer to the parent instruction. Use it in one sample pattern. This isn't a great example, since the transform exists to satisfy DAG type constraints. This could also be avoided by just changing the MachineInstr's arbitrary choice of operand type from i16 to i32. Other patterns have nontrivial uses, but this serves as the simplest example. One flaw this still has is if you try to use an SDNodeXForm defined for imm, but the source pattern uses timm, you still see the "Failed to lookup instruction" assert. However, there is now a way to avoid it.
* AMDGPU/GlobalISel: Add IMMPopCount xformMatt Arsenault2020-01-091-0/+3
| | | | Partially fixes BFE pattern import.
* AMDGPU/GlobalISel: Add selectVOP3Mods_nnanMatt Arsenault2020-01-091-0/+2
| | | | | | This doesn't enable any new imports yet, but moves the fmed patterns from failing on this to hitting the "complex suboperand referenced more than once" limitation in tablegen.
* AMDGPU/GlobalISel: Add equiv xform for bitcast_fpimm_to_i32Matt Arsenault2020-01-091-0/+3
| | | | Only partially fixes one pattern import.
* AMDGPU/GlobalISel: Fix add of neg inline constant patternMatt Arsenault2020-01-091-0/+3
|
* AMDGPU: Remove VOP3Mods0Clamp0OModMatt Arsenault2020-01-071-2/+0
| | | | | Now that overridable default operands work, there's no reason to use complex patterns to just produce 0s.
* AMDGPU/GlobalISel: Select G_UADDE/G_USUBEMatt Arsenault2020-01-061-1/+1
|
* AMDGPU/GlobalISel: Replace handling of boolean valuesMatt Arsenault2020-01-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This solves selection failures with generated selection patterns, which would fail due to inferring the SGPR reg bank for virtual registers with a set register class instead of VCC bank. Use instruction selection would constrain the virtual register to a specific class, so when the def was selected later the bank no longer was set to VCC. Remove the SCC reg bank. SCC isn't directly addressable, so it requires copying from SCC to an allocatable 32-bit register during selection, so these might as well be treated as 32-bit SGPR values. Now any scalar boolean value that will produce an outupt in SCC should be widened during RegBankSelect to s32. Any s1 value should be a vector boolean during selection. This makes the vcc register bank unambiguous with a normal SGPR during selection. Summary of how this should now work: - G_TRUNC is always a no-op, and never should use a vcc bank result. - SALU boolean operations should be promoted to s32 in RegBankSelect apply mapping - An s1 value means vcc bank at selection. The exception is for legalization artifacts that use s1, which are never VCC. All other contexts should infer the VCC register classes for s1 typed registers. The LLT for the register is now needed to infer the correct register class. Extensions with vcc sources should be legalized to a select of constants during RegBankSelect. - Copy from non-vcc to vcc ensures high bits of the input value are cleared during selection. - SALU boolean inputs should ensure the inputs are 0/1. This includes select, conditional branches, and carry-ins. There are a few somewhat dirty details. One is that G_TRUNC/G_*EXT selection ignores the usual register-bank from register class functions, and can't handle truncates with VCC result banks. I think this is OK, since the artifacts are specially treated anyway. This does require some care to avoid producing cases with vcc. There will also be no 100% reliable way to verify this rule is followed in selection in case of register classes, and violations manifests themselves as invalid copy instructions much later. Standard phi handling also only considers the bank of the result register, and doesn't insert copies to make the source banks match. This doesn't work for vcc, so we have to manually correct phi inputs in this case. We should add a verifier check to make sure there are no phis with mixed vcc and non-vcc register bank inputs. There's also some duplication with the LegalizerHelper, and some code which should live in the helper. I don't see a good way to share special knowledge about what types to use for intermediate operations depending on the bank for example. Using the helper to replace extensions with selects also seems somewhat awkward to me. Another issue is there are some contexts calling getRegBankFromRegClass that apparently don't have the LLT type for the register, but I haven't yet run into a real issue from this. This also introduces new unnecessary instructions in most cases, since we don't yet try to optimize out the zext when the source is known to come from a compare.
* AMDGPU: Use ImmLeaf for inline immediate predicatesMatt Arsenault2020-01-061-0/+5
|
* GlobalISel: Lower s1 source G_SITOFP/G_UITOFPMatt Arsenault2019-11-151-1/+0
|
* [globalisel] Rename G_GEP to G_PTR_ADDDaniel Sanders2019-11-051-1/+1
| | | | | | | | | | | | | | | | Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734
* AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFPMatt Arsenault2019-10-011-0/+1
| | | | llvm-svn: 373298
* AMDGPU/GlobalISel: Add support for init.exec intrinsicsMatt Arsenault2019-10-011-0/+3
| | | | | | | TThe existing wave32 behavior seems broken and incomplete, but this reproduces it. llvm-svn: 373296
* AMDGPU/GlobalISel: Select G_UADDO/G_USUBOMatt Arsenault2019-10-011-0/+1
| | | | llvm-svn: 373288
* AMDGPU/GlobalISel: Avoid getting MRI in every functionMatt Arsenault2019-09-281-1/+7
| | | | | | | Store it in AMDGPUInstructionSelector to avoid boilerplate in nearly every select function. llvm-svn: 373139
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Matt Arsenault2019-09-191-0/+7
| | | | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Hans Wennborg2019-09-191-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC*. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_* instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.storeMatt Arsenault2019-09-191-0/+7
| | | | llvm-svn: 372292
* AMDGPU/GlobalISel: Select llvm.amdgcn.classMatt Arsenault2019-09-091-0/+2
| | | | | | Also fixes missing SubtargetPredicate on f16 class instructions. llvm-svn: 371436
* AMDGPU/GlobalISel: Select fmed3Matt Arsenault2019-09-091-0/+5
| | | | llvm-svn: 371435
* AMDGPU/GlobalISel: Select G_PTR_MASKMatt Arsenault2019-09-091-0/+1
| | | | llvm-svn: 371412
* [GlobalISel] Make the InstructionSelector instance non-const, allowing state ↵Amara Emerson2019-08-131-6/+5
| | | | | | | | | | | | | | | | to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652
* AMDGPU/GlobalISel: Allow selection of DS atomicrmwMatt Arsenault2019-08-011-1/+1
| | | | llvm-svn: 367507
* AMDGPU/GlobalISel: Select simple local storesMatt Arsenault2019-08-011-1/+3
| | | | llvm-svn: 367504
* AMDGPU/GlobalISel: Select local loadsMatt Arsenault2019-08-011-1/+8
| | | | llvm-svn: 367498
* AMDGPU/GlobalISel: Select private loadsMatt Arsenault2019-07-161-0/+5
| | | | llvm-svn: 366248
* AMDGPU/GlobalISel: Select flat loadsMatt Arsenault2019-07-161-0/+9
| | | | | | | | Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
* AMDGPU/GlobalISel: Select G_AND/G_OR/G_XORMatt Arsenault2019-07-151-0/+1
| | | | llvm-svn: 366121
* AMDGPU/GlobalISel: Select G_SUBMatt Arsenault2019-07-091-1/+1
| | | | llvm-svn: 365484
* AMDGPU/GlobalISel: Select G_UNMERGE_VALUESMatt Arsenault2019-07-091-0/+1
| | | | llvm-svn: 365483
* AMDGPU/GlobalISel: Select G_MERGE_VALUESMatt Arsenault2019-07-091-0/+1
| | | | llvm-svn: 365482
* AMDGPU/GlobalISel: Complete implementation of G_GEPMatt Arsenault2019-07-011-1/+3
| | | | | | | | Also works around tablegen defect in selecting add with unused carry, but if we have to manually select GEP, might as well handle add manually. llvm-svn: 364806
* AMDGPU/GlobalISel: Select G_PHIMatt Arsenault2019-07-011-0/+1
| | | | llvm-svn: 364805
* AMDGPU/GlobalISel: Select G_BRCOND for vccMatt Arsenault2019-07-011-0/+3
| | | | llvm-svn: 364795
* AMDGPU/GlobalISel: Select G_FRAME_INDEXMatt Arsenault2019-07-011-0/+1
| | | | llvm-svn: 364789
* AMDGPU/GlobalISel: Select G_BRCOND for scc conditionsMatt Arsenault2019-07-011-0/+1
| | | | llvm-svn: 364786
* AMDGPU/GlobalISel: Select src modifiersMatt Arsenault2019-07-011-0/+3
| | | | llvm-svn: 364782
* AMDGPU/GlobalISel: Improve icmp selection coverage.Matt Arsenault2019-07-011-0/+2
| | | | | | Select s64 eq/ne scalar icmp. llvm-svn: 364765
* AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXTMatt Arsenault2019-06-251-0/+1
| | | | llvm-svn: 364308
* AMDGPU/GlobalISel: Select G_TRUNCMatt Arsenault2019-06-241-0/+1
| | | | llvm-svn: 364215
* AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECTTom Stellard2019-06-171-0/+2
| | | | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60640 llvm-svn: 363576
* AMDGPU/GlobalISel: Implement select for G_INSERTTom Stellard2019-03-011-0/+1
| | | | | | | | | | | | Re-commit r344310. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 355159
* AMDGPU/GlobalISel: Implement select for G_EXTRACTTom Stellard2019-02-281-0/+1
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49714 llvm-svn: 355156
* AMDGPU/GlobalISel: Move SMRD selection logic to TableGenTom Stellard2019-02-201-0/+8
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52922 llvm-svn: 354516
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* Revert "AMDGPU/GlobalISel: Implement select for G_INSERT"Tom Stellard2018-10-111-1/+0
| | | | | | | | This reverts commit r344310. The test case was failing on some bots. llvm-svn: 344317
* AMDGPU/GlobalISel: Implement select for G_INSERTTom Stellard2018-10-111-0/+1
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 344310
* AMDGPU: Remove remnants of old address space mappingMatt Arsenault2018-08-311-3/+0
| | | | llvm-svn: 341165
OpenPOWER on IntegriCloud