path: root/llvm/test/CodeGen/AMDGPU
Commit message | Author | Age | Files | Lines
...
* Use getTargetConstant for BLENDI, and add a test to catch it. | Sterling Augustine | 2019-09-20 | 1 | -0/+14
    Summary: This fixes a crasher introduced by r372338.
    Reviewers: echristo, arsenm
    Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67785
    Tighten up the test case.
    llvm-svn: 372366
* MachineScheduler: Fix missing dependency with multiple subreg defs | Matt Arsenault | 2019-09-20 | 1 | -0/+86
    If an instruction had multiple subregister defs, and one of them was undef, this would improperly conclude all other lanes are killed. There could still be other defs of those read-undef lanes in other operands. This would improperly remove register uses from CurrentVRegUses, so the visitation of later operands would not find the necessary register dependency. This would also mean this would fail or not depending on how different subregister def operands were ordered.
    On an undef subregister def, scan the instruction for other subregister defs and avoid killing those.
    This possibly should be deferring removing anything from CurrentVRegUses until the entire instruction has been processed instead.
    llvm-svn: 372362
* [AMDGPU] Unnecessary -amdgpu-scalarize-global-loads=false flag removed from min/max lit tests. | Alexander Timofeev | 2019-09-19 | 3 | -22/+55
    Reviewers: arsenm
    Differential Revision: https://reviews.llvm.org/D67712
    llvm-svn: 372340
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Matt Arsenault | 2019-09-19 | 20 | -91/+2864
    This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297).
    This was missing one switch to getTargetConstant in an untested case.
    llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Hans Wennborg | 2019-09-19 | 20 | -2864/+91
    This broke the Chromium build, causing it to fail with e.g.
      fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
    See llvm-commits thread of r372285 for details.
    This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit.
    > Encode them directly as an imm argument to G_INTRINSIC*.
    >
    > Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.
    >
    > This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.
    >
    > SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.
    >
    > Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.
    >
    > The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.
    >
    > This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.
    llvm-svn: 372314
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle | Matt Arsenault | 2019-09-19 | 1 | -0/+21
    llvm-svn: 372297
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format | Matt Arsenault | 2019-09-19 | 2 | -0/+833
    This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors.
    Also reject some illegal types on other targets.
    llvm-svn: 372293
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store | Matt Arsenault | 2019-09-19 | 2 | -168/+791
    llvm-svn: 372292
* AMDGPU/GlobalISel: RegBankSelect struct buffer load/store | Matt Arsenault | 2019-09-19 | 2 | -0/+353
    llvm-svn: 372291
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load|store} | Matt Arsenault | 2019-09-19 | 2 | -0/+341
    llvm-svn: 372290
* AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics | Matt Arsenault | 2019-09-19 | 2 | -0/+449
    Images should always have 2 consecutive, mandatory SGPR arguments.
    llvm-svn: 372289
* Fix typo | Matt Arsenault | 2019-09-19 | 1 | -1/+1
    llvm-svn: 372288
* MachineScheduler: Fix assert from not checking subregs | Matt Arsenault | 2019-09-19 | 1 | -0/+70
    The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister.
    llvm-svn: 372287
* AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9 | Matt Arsenault | 2019-09-19 | 2 | -36/+88
    The scalar versions were only introduced in gfx9.
    llvm-svn: 372286
* GlobalISel: Don't materialize immarg arguments to intrinsics | Matt Arsenault | 2019-09-19 | 8 | -55/+86
    Encode them directly as an imm argument to G_INTRINSIC*.

    Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.

    This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.

    SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.

    Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.

    The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.

    This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.
    llvm-svn: 372285
* [AMDGPU] Allow FP inline constant in v_madak_f16 and v_fmaak_f16 | Tim Renouf | 2019-09-18 | 2 | -0/+40
    Differential Revision: https://reviews.llvm.org/D67680
    Change-Id: Ic38f47cb2079c2c1070a441b5943854844d80a7c
    llvm-svn: 372208
* [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed | Alexander Timofeev | 2019-09-17 | 2 | -1/+55
    Differential Revision: https://reviews.llvm.org/D67101
    Reviewers: rampitec, vpykhtin
    llvm-svn: 372086
* [GlobalISel] Partially revert r371901. | Amara Emerson | 2019-09-16 | 1 | -1/+1
    r371901 was overeager: widenScalarDst() and similar helpers in the legalizer increment the insert point they are given in order to add new instructions after the instruction currently being legalized. When the insertion point is not exactly the current instruction, callers need to compensate for this behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs; for now, just undo the problematic parts of the change.
    llvm-svn: 372050
* AMDGPU/GlobalISel: Fix some broken run lines | Matt Arsenault | 2019-09-16 | 4 | -8/+8
    llvm-svn: 371992
* AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL | Matt Arsenault | 2019-09-16 | 2 | -0/+62
    llvm-svn: 371991
* AMDGPU/GlobalISel: Remove another illegal select test | Matt Arsenault | 2019-09-16 | 1 | -37/+0
    llvm-svn: 371990
* AMDGPU/GlobalISel: Remove illegal select tests | Matt Arsenault | 2019-09-16 | 1 | -74/+0
    These fail in a release build.
    llvm-svn: 371955
* AMDGPU/GlobalISel: Select SMRD loads for more types | Matt Arsenault | 2019-09-16 | 1 | -0/+1007
    llvm-svn: 371954
* AMDGPU/GlobalISel: RegBankSelect for kill | Matt Arsenault | 2019-09-16 | 1 | -0/+68
    llvm-svn: 371953
* AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP | Matt Arsenault | 2019-09-16 | 2 | -0/+72
    llvm-svn: 371952
* AMDGPU/GlobalISel: Set type on vgpr live in special arguments | Matt Arsenault | 2019-09-16 | 1 | -0/+27
    Fixes assertion with workitem ID intrinsics used in non-kernel functions.
    llvm-svn: 371951
* AMDGPU/GlobalISel: Select S16->S32 fptoint | Matt Arsenault | 2019-09-16 | 2 | -5/+210
    llvm-svn: 371950
* AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP | Matt Arsenault | 2019-09-16 | 4 | -9/+143
    llvm-svn: 371949
* AMDGPU/GlobalISel: Fix VALU s16 fneg | Matt Arsenault | 2019-09-16 | 1 | -11/+9
    llvm-svn: 371948
* [GlobalISel] Fix insertion point of new instructions to be after PHIs. | Amara Emerson | 2019-09-13 | 1 | -1/+1
    For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which means that PHI elimination doesn't catch them.
    Differential Revision: https://reviews.llvm.org/D67570
    llvm-svn: 371901
* Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. | Alexander Timofeev | 2019-09-13 | 2 | -55/+1
    llvm-svn: 371873
* AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFP | Matt Arsenault | 2019-09-13 | 2 | -0/+34
    llvm-svn: 371811
* AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.else | Matt Arsenault | 2019-09-13 | 2 | -0/+35
    llvm-svn: 371808
* AMDGPU/GlobalISel: Select 16-bit VALU bit ops | Matt Arsenault | 2019-09-13 | 3 | -18/+15
    llvm-svn: 371807
* AMDGPU/GlobalISel: Legalize G_FFLOOR | Matt Arsenault | 2019-09-13 | 3 | -0/+639
    llvm-svn: 371803
* Temporarily revert r371640 "LiveIntervals: Split live intervals on multiple dead defs". | Tim Shen | 2019-09-13 | 1 | -18/+0
    It reveals a miscompile on Hexagon. See PR43302 for details.
    llvm-svn: 371802
* AMDGPU/GlobalISel: Legalize G_FMAD | Matt Arsenault | 2019-09-13 | 2 | -0/+817
    Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function.
    Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function.
    llvm-svn: 371800
* AMDGPU/GlobalISel: Select G_CTPOP | Matt Arsenault | 2019-09-13 | 1 | -0/+204
    llvm-svn: 371798
* LiveIntervals: Remove assertion | Matt Arsenault | 2019-09-12 | 1 | -0/+28
    This testcase is invalid, and caught by the verifier. For the verifier to catch it, the live interval computation needs to complete. Remove the assert so the verifier catches this, which is less confusing.
    In this testcase there is an undefined use of a subregister, and lanes which aren't used or defined. An equivalent testcase with the super-register shrunk to have no untouched lanes already hit this verifier error.
    llvm-svn: 371792
* AMDGPU: Inline constant when materalizing FI with add on gfx9 | Matt Arsenault | 2019-09-12 | 2 | -4/+47
    This was relying on the SGPR usable for the carry out clobber to also be used for the input. There was no carry out on gfx9. With no carry out clobber to worry about, the literal can just be directly used with a VOP2 add.
    llvm-svn: 371791
* [DAGCombiner] Improve division estimation of floating points. | Qiu Chaofan | 2019-09-12 | 2 | -16/+16
    Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy.
    Reviewed By: Sanjay Patel, Jinsong Ji
    Differential Revision: https://reviews.llvm.org/D66050
    llvm-svn: 371713
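The difference between "estimate the reciprocal, then multiply" and "re-order the arithmetic in the last iteration" can be illustrated with a small standalone sketch. This is not the DAGCombiner code: the function names, the seed parameter, and the exact form of the final correction step are assumptions chosen for illustration, and the iteration count is assumed to be at least 1.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step refining an estimate x of 1/b: x' = x * (2 - b*x).
static float refineReciprocal(float b, float x) { return x * (2.0f - b * x); }

// Strategy A (hypothetical name): refine the reciprocal for every iteration,
// then multiply by the numerator at the very end.
float divideViaReciprocal(float a, float b, float seed, int iterations) {
  float x = seed;
  for (int i = 0; i < iterations; ++i)
    x = refineReciprocal(b, x);
  return a * x;
}

// Strategy B (hypothetical name): run all but the last refinement on the
// reciprocal, then fold the numerator into the final step by correcting the
// quotient directly: q = a*x, q' = q + x * (a - b*q).
// Assumes iterations >= 1.
float divideFoldedLastStep(float a, float b, float seed, int iterations) {
  float x = seed;
  for (int i = 0; i + 1 < iterations; ++i)
    x = refineReciprocal(b, x);
  float q = a * x;            // first quotient estimate
  return q + x * (a - b * q); // residual-based correction of the quotient
}
```

In strategy B the last step works on the residual a - b*q of the quotient itself, so the final rounding applies to a small correction term rather than to a separate multiply after the reciprocal has already been rounded.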
* AMDGPU: Move m0 initializations earlier | Austin Kerbow | 2019-09-11 | 2 | -21/+93
    Summary: After hoisting and merging m0 initializations schedule them as early as possible in the MBB. This helps the scheduler avoid hazards in some cases.
    Reviewers: rampitec, arsenm
    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67450
    llvm-svn: 371671
* [AMDGPU] Fix crash in phi-elimination hook. | Michael Liao | 2019-09-11 | 1 | -0/+26
    Summary:
    - Pre-check in case there's just a single PHI insn.
    Reviewers: alex-t, rampitec, arsenm
    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67451
    llvm-svn: 371649
* LiveIntervals: Split live intervals on multiple dead defs | Matt Arsenault | 2019-09-11 | 1 | -0/+18
    If there are multiple dead defs of the same virtual register, these are required to be split into multiple virtual registers with separate live intervals to avoid a verifier error.
    llvm-svn: 371640
* [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing | Guillaume Chatelet | 2019-09-11 | 28 | -59/+59
    Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference.
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790
    Reviewers: courbet
    Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67433
    llvm-svn: 371608
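The log2-versus-pow2 confusion mentioned above can be made concrete with a minimal sketch. The helper below is hypothetical and is not the llvm::Align API; it only shows the kind of check that rejects a value which cannot be a byte alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of LLVM): interpret `bytes` as an alignment
// in bytes. Returns log2(bytes) when `bytes` is a nonzero power of two, and
// std::nullopt otherwise.
std::optional<unsigned> log2OfAlignment(std::uint64_t bytes) {
  // A power of two has exactly one bit set, so clearing the lowest set bit
  // must leave zero.
  if (bytes == 0 || (bytes & (bytes - 1)) != 0)
    return std::nullopt;
  unsigned shift = 0;
  while ((bytes >> shift) != 1)
    ++shift;
  return shift;
}
```

A file that stores 3 while meaning "log2 of 8 bytes" is rejected by such a check, because 3 is not a power of two. Values such as 1, 2, or 4 are valid under both readings, so a check of this kind cannot flag every misuse.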
* GlobalISel/TableGen: Handle REG_SEQUENCE patterns | Matt Arsenault | 2019-09-10 | 2 | -13/+28
    The scalar f64 patterns don't work yet because they fail on multiple results from the unused implicit def of scc in the result bit operation.
    llvm-svn: 371542
* AMDGPU/GlobalISel: Select G_FABS/G_FNEG | Matt Arsenault | 2019-09-10 | 8 | -318/+945
    f64 doesn't work yet because tablegen currently doesn't handle REG_SEQUENCE.
    This does regress some multi use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency.
    llvm-svn: 371540
* AMDGPU/GlobalISel: Select cvt pk intrinsics | Matt Arsenault | 2019-09-10 | 5 | -26/+322
    llvm-svn: 371539
* AMDGPU/GlobalISel: Select llvm.amdgcn.sffbh | Matt Arsenault | 2019-09-10 | 1 | -0/+62
    llvm-svn: 371538
* AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD | Matt Arsenault | 2019-09-10 | 2 | -0/+195
    llvm-svn: 371536