summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPU.td
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/NFC: Fix formatting for 900, 902 ISA Version featuresKonstantin Zhuravlyov2018-05-041-4/+2
| | | | llvm-svn: 331553
* AMDGPU: Add D16 instructions preserve unused bits featureKonstantin Zhuravlyov2018-05-041-3/+15
| | | | | | | | | - Predicate D16 patterns on this new feature - Added this new feature to gfx900/2/4 Differential Revision: https://reviews.llvm.org/D46366 llvm-svn: 331551
* AMDGPU: Add Vega12 and Vega20Matt Arsenault2018-04-301-0/+31
| | | | | | | | Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331215
* AMDGPU: Consolidate SubtargetPredicate definitionsMatt Arsenault2018-04-261-0/+7
| | | | llvm-svn: 330979
* AMDGPU: enable 128-bit for local addr space under an optionMarek Olsak2018-04-101-0/+6
| | | | | | | | | | | | | | | | | | | Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). v2: - fix regressions in merge-stores.ll and multiple_tails.ll Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764
* Revert "AMDGPU: enable 128-bit for local addr space under an option"Alex Shlyapnikov2018-04-091-6/+0
| | | | | | | | | | | | | | This reverts commit r329591. It breaks various bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516 http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374 http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251 ... llvm-svn: 329610
* AMDGPU: enable 128-bit for local addr space under an optionMarek Olsak2018-04-091-0/+6
| | | | | | | | | | | | | | | | | Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329591
* [AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructionsDmitry Preobrazhensky2018-04-021-1/+10
| | | | | | | | | | | Fixed a bug which caused Tablegen crash. See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328983
* Revert r328975, it makes TableGen assert on the bots.Nico Weber2018-04-021-10/+1
| | | | llvm-svn: 328978
* [AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructionsDmitry Preobrazhensky2018-04-021-1/+10
| | | | | | | | | See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328975
* AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsicsNicolai Haehnle2018-04-011-0/+1
| | | | | | | | | | | | | | | | | | Summary: Avoids having to list all intrinsics manually. This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44937 llvm-svn: 328938
* AMDGPU: Add fast fmaf feature to gfx702Konstantin Zhuravlyov2018-02-271-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D43790 llvm-svn: 326252
* [MachineOperand][Target] MachineOperand::isRenamable semantics changesGeoff Berry2018-02-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
* AMDGPU: Bring processors and features in sync with the specKonstantin Zhuravlyov2018-02-161-7/+2
| | | | | | | | | | - Remove gfx800 - Make iceland gfx802 - Add xnack to gfx902 Differential Revision: https://reviews.llvm.org/D43355 llvm-svn: 325393
* [AMDGPU][MC] Added validation of d16 and r128 modifiers of MIMG opcodesDmitry Preobrazhensky2018-02-051-3/+9
| | | | | | | | | | | See bugs 36094, 36095: https://bugs.llvm.org/show_bug.cgi?id=36094 https://bugs.llvm.org/show_bug.cgi?id=36095 Differential Revision: https://reviews.llvm.org/D42692 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 324231
* AMDGPU/SI: Add d16 support for buffer intrinsics.Changpeng Fang2018-01-121-4/+19
| | | | | | | | | | Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402
* AMDGPU/GCN: Bring processors in sync with AMDGPUUsageKonstantin Zhuravlyov2017-12-081-16/+4
| | | | | | | | | | | | - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
* AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instructionJan Vesely2017-12-041-0/+6
| | | | | | | | | Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712
* AMDGPU: Select DS insts without m0 initializationMatt Arsenault2017-11-291-0/+4
| | | | | | | | | GFX9 stopped using m0 for most DS instructions. Select a different instruction without the use. I think this will be less error prone than trying to manually maintain m0 uses as needed. llvm-svn: 319270
* AMDGPU: Don't use MUBUF vaddr if address may overflowMatt Arsenault2017-11-151-0/+7
| | | | | | | Effectively revert r263964. Before we would not allow this if vaddr was not known to be positive. llvm-svn: 318240
* AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.tdKonstantin Zhuravlyov2017-11-101-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D39880 llvm-svn: 317920
* AMDGPU: Add max-mix-insts subtarget featureMatt Arsenault2017-10-251-4/+16
| | | | llvm-svn: 316553
* AMDGPU: Do not emit deprecated notes for code object v3Konstantin Zhuravlyov2017-10-141-0/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D38749 llvm-svn: 315810
* AMDGPU: Fix incorrect selection of pseudo-branchesMatt Arsenault2017-10-101-0/+2
| | | | | | These should only be used if the machine structurizer is enabled. llvm-svn: 315357
* AMDGPU: Remove global isGCN predicatesMatt Arsenault2017-10-031-2/+12
| | | | | | | | | | | | | | These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
* AMDGPU: Fix typosMatt Arsenault2017-10-021-2/+2
| | | | llvm-svn: 314715
* AMDGPU: Start selecting v_mad_mix_f32Matt Arsenault2017-09-071-0/+3
| | | | llvm-svn: 312732
* AMDGPU: Add ds_{read|write}_addtid_b32 definitionsMatt Arsenault2017-09-011-0/+3
| | | | llvm-svn: 312349
* AMDGPU: Add most d16 load/store instruction definitionsMatt Arsenault2017-09-011-0/+2
| | | | | | | Doesn't include the tied operand necessary for the loads, but is enough for the assembler to work. llvm-svn: 312347
* AMDGPU: Fix gfx801 featuresKonstantin Zhuravlyov2017-08-241-0/+2
| | | | | | | | gfx801 has 1/2 rate F64, Fast F32 FMA Differential Revision: https://reviews.llvm.org/D36981 llvm-svn: 311694
* [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodesDmitry Preobrazhensky2017-08-161-2/+12
| | | | | | | | | | See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006
* AMDGPU: Cleanup subtarget featuresMatt Arsenault2017-08-071-9/+3
| | | | | | | | | | | | Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are incompatible between feature strings. Most of the test changes are due to random scheduling changes from not having a default fullspeed model. llvm-svn: 310258
* AMDGPU: Fix typo in feature descriptionMatt Arsenault2017-08-061-1/+1
| | | | llvm-svn: 310217
* AMDGPU: Add instruction definitions for some scratch_* instructionsMatt Arsenault2017-07-211-0/+2
| | | | | | Omit atomics for now since they probably aren't useful. llvm-svn: 308747
* AMDGPU: Add encoding for carryless add/sub instructionsMatt Arsenault2017-07-201-1/+14
| | | | llvm-svn: 308639
* [AMDGPU] SDWA: several fixes for V_CVT and VOPC instructionsSam Kolton2017-06-271-3/+3
| | | | | | | | | | | | | | Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 llvm-svn: 306413
* AMDGPU: Whitespace fixesMatt Arsenault2017-06-261-2/+2
| | | | llvm-svn: 306265
* [AMDGPU] SDWA: add support for GFX9 in peephole passSam Kolton2017-06-221-3/+34
| | | | | | | | | | | | | | | | Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 llvm-svn: 305986
* AMDGPU: Start adding global_* instructionsMatt Arsenault2017-06-201-1/+5
| | | | llvm-svn: 305838
* AMDGPU : Fix ISA Version Definitions.Wei Ding2017-06-101-2/+31
| | | | | | Differential Revision: http://reviews.llvm.org/D28531 llvm-svn: 305137
* AMDGPU: Make auto waitcnt before barrier a featureKonstantin Zhuravlyov2017-06-021-0/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571
* [AMDGPU] SDWA: Add assembler support for GFX9Sam Kolton2017-05-231-4/+16
| | | | | | | | | | | | | | | Summary: Added separate pseudo and real instruction for GFX9 SDWA instructions. Currently supports only in assembler. Depends D32493 Reviewers: vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33132 llvm-svn: 303620
* AMDGPU: Add new subtarget features for gfx9 flat instructionsMatt Arsenault2017-05-101-1/+20
| | | | | | | Flat instructions gain an immediate offset, and 2 new sets of segment specific flat instructions are added. llvm-svn: 302729
* [AMDGPU] DPP: add support for GFX9Sam Kolton2017-04-271-1/+1
| | | | | | | | | | Reviewers: artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32588 llvm-svn: 301551
* AMDGPU/GFX9: Enable FastFMAF32Konstantin Zhuravlyov2017-04-211-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D32363 llvm-svn: 301029
* AMDGPU: Always use VGPR indexing on GFX9Marek Olsak2017-03-211-1/+1
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31157 llvm-svn: 298396
* AMDGPU: Add VOP3P instruction formatMatt Arsenault2017-02-271-2/+11
| | | | | | | | Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368
* AMDGPU: Redefine clamp node as clamp 0.0-1.0Matt Arsenault2017-02-211-0/+6
| | | | | | | | | | | Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788
* AMDGPU: Fix assembler subtarget predicate for gfx9Matt Arsenault2017-02-181-1/+11
| | | | | | This was accepting GFX9 instructions on VI. llvm-svn: 295557
* AMDGPU: Merge initial gfx9 supportMatt Arsenault2017-02-181-3/+21
| | | | llvm-svn: 295554
OpenPOWER on IntegriCloud