summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Dimension-aware image intrinsicsNicolai Haehnle2018-04-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 llvm-svn: 329166
* AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsicsNicolai Haehnle2018-04-011-0/+7
| | | | | | | | | | | | | | | | | | Summary: Avoids having to list all intrinsics manually. This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44937 llvm-svn: 328938
* Reapply "AMDGPU: Add 32-bit constant address space"Matt Arsenault2018-02-091-0/+3
| | | | | | This reverts r324494 and reapplies r324487. llvm-svn: 324747
* AMDGPU: Fix layering issueMatt Arsenault2018-02-091-0/+18
| | | | | | | Move utility function that depends on codegen. Fixes build with r324487 reapplied. llvm-svn: 324746
* AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the ↵Changpeng Fang2018-02-011-0/+6
| | | | | | | | | | | | target has UnpackedD16VMem feature. Reviewers: Matt and Brian Differential Revision: https://reviews.llvm.org/D42548 llvm-svn: 323988
* AMDGPU/SI: Add d16 support for buffer intrinsics.Changpeng Fang2018-01-121-1/+2
| | | | | | | | | | Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402
* AMDGPU: Partially fix disassembly of MIMG instructionsMatt Arsenault2017-12-131-65/+0
| | | | | | | | | | | | | | | | | | | | | Stores failed to decode at all since they didn't have a DecoderNamespace set. Loads worked, but did not change the register width displayed to match the numbmer of enabled channels. The number of printed registers for vaddr is still wrong, but I don't think that's encoded in the instruction so there's not much we can do about that. Image atomics are still broken. MIMG is the same encoding for SI/VI, but the image atomic classes are split up into encoding specific versions unlike every other MIMG instruction. They have isAsmParserOnly set on them for some reason. dmask is also special for these, so we probably should not have it as an explicit operand as it is now. llvm-svn: 320614
* AMDGPU: Fix creating invalid copy when adjusting dmaskMatt Arsenault2017-12-041-6/+50
| | | | | | | | | Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705
* [AMDGPU][MC][GFX8][GFX9] Corrected names of integer ↵Dmitry Preobrazhensky2017-11-201-4/+5
| | | | | | | | | | | | v_{add/addc/sub/subrev/subb/subbrev} See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765 Reviewers: tamazov, SamWot, arsenm, vpykhtin Differential Revision: https://reviews.llvm.org/D40088 llvm-svn: 318675
* [AMDGPU][MC][GFX9] Added 16-bit renamed and "_legacy" VALU opcodesDmitry Preobrazhensky2017-08-091-1/+6
| | | | | | | | | | See Bug 33629: https://bugs.llvm.org//show_bug.cgi?id=33629 Reviewers: vpykhtin, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D36322 llvm-svn: 310497
* AMDGPU: Initial implementation of callsMatt Arsenault2017-08-011-1/+3
| | | | | | | | | Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0. llvm-svn: 309732
* [AMDGPU] SDWA: merge VI and GFX9 pseudo instructionsSam Kolton2017-06-211-2/+9
| | | | | | | | | | | | Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9. Reviewers: dp, arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov Differential Revision: https://reviews.llvm.org/D34026 llvm-svn: 305886
* [AMDGPU] Get address space mapping by target triple environmentYaxun Liu2017-03-271-1/+1
| | | | | | | | | | | | | | | | | | As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846
* AMDGPU: Merge initial gfx9 supportMatt Arsenault2017-02-181-0/+1
| | | | llvm-svn: 295554
* MachineScheduler: Export function to construct "default" scheduler.Matthias Braun2016-11-281-8/+0
| | | | | | | | | | | | | | | | | | This makes the createGenericSchedLive() function that constructs the default scheduler available for the public API. This should help when you want to get a scheduler and the default list of DAG mutations. This also shrinks the list of default DAG mutations: {Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer added by default. Targets can easily add them if they need them. It also makes it easier for targets to add alternative/custom macrofusion or clustering mutations while staying with the default createGenericSchedLive(). It also saves the callback back and forth in TargetInstrInfo::enableClusterLoads()/enableClusterStores(). Differential Revision: https://reviews.llvm.org/D26986 llvm-svn: 288057
* AMDGPU: Enable store clusteringMatt Arsenault2016-11-151-0/+4
| | | | | | | Also respect the TII hook for these like the generic code does in case we want a flag later to disable this. llvm-svn: 287021
* [AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx ↵Sam Kolton2016-10-071-1/+0
| | | | | | | | | | | | to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560
* AMDGPU: Move R600 only pieces into R600 classesMatt Arsenault2016-07-091-57/+0
| | | | llvm-svn: 274979
* Fix "not all control paths return a value" warning on MSVCSimon Pilgrim2016-06-271-0/+2
| | | | llvm-svn: 273872
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-22/+27
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* [AMDGPU] Remove exit-on-error in test (PR27761)Diana Picus2016-06-231-1/+3
| | | | | | | | | | | | | | | | | The exit-on-error flag was necessary in order to avoid an assertion when handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize. We can avoid the assertion by creating some dummy nodes. This enables us to remove the exit-on-error flag on the first 2 run lines (SI), but on the third run line (R600) we would run into another assertion when trying to reserve indirect registers. This patch also replaces that assertion with an early exit from the function. Fixes PR27761. Differential Revision: http://reviews.llvm.org/D20852 llvm-svn: 273550
* AMDGPU: Move subtarget specific code out of AMDGPUInstrInfo.cppTom Stellard2016-01-281-209/+0
| | | | | | | | | | | | | | Summary: Also delete all the stub functions that are identical to the implementations in TargetInstrInfo.cpp. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16609 llvm-svn: 259054
* Make a bunch of static arrays const.Craig Topper2015-10-181-1/+1
| | | | llvm-svn: 250642
* Remove redundant TargetFrameLowering::getFrameIndexOffset virtualJames Y Knight2015-08-151-1/+3
| | | | | | | | | | | function. This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148
* Fix broken ArrayRef conversion from r243497.Alex Lorenz2015-07-281-1/+1
| | | | llvm-svn: 243501
* MIR Serialization: Serialize the target index machine operands.Alex Lorenz2015-07-281-0/+11
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497
* Remove TargetInstrInfo::canFoldMemoryOperandSimon Pilgrim2015-07-191-5/+0
| | | | | | | | | | canFoldMemoryOperand is not actually used anywhere in the codebase - all existing users instead call foldMemoryOperand directly when they wish to fold and can correctly deduce what they need from the return value. This patch removes the canFoldMemoryOperand base function and the target implementations; only x86 had a real (bit-rotted) implementation, although AMDGPU had a preparatory stub that had never needed to be completed. Differential Revision: http://reviews.llvm.org/D11331 llvm-svn: 242638
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+369
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-46/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+46
llvm-svn: 160270
OpenPOWER on IntegriCloud