path: root/llvm/docs/AMDGPUUsage.rst
Commit message | Author | Age | Files | Lines
* [docs] NFC: Fix typos in documents | Hans Wennborg | 2020-01-07 | 1 | -1/+1
| "the the" -> "the"
| "an" -> "a"
| Patch by Kazuaki Ishizaki <ishizaki@jp.ibm.com>!
| Differential revision: https://reviews.llvm.org/D72091
* [AMDGPU][MC][DOC] Updated AMD GPU assembler syntax description. | Dmitry Preobrazhensky | 2019-12-25 | 1 | -7/+31
| Summary of changes:
| - added description of GFX9 subtargets:
|   - gfx900;
|   - gfx902;
|   - gfx904;
|   - gfx906;
|   - gfx908;
|   - gfx909.
* [AMDGPU] AMDGPUUsage clarify address space information and other typo and formatting fixes | Tony | 2019-12-12 | 1 | -435/+476
| Summary:
| - Clarify AMDGPU address spaces.
| - Correct path to AMDGPU backend since now in the mono-repo.
| - Fix numerous text style and typo issues.
| - Correct reStructuredText formatting warnings.
| - Made reStructuredText directive usage more consistent.
| - Add references for gfx10 ISA specification.
| Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, jfb, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D71392
* Fix a few doc typos, to cycle bots. | Nico Weber | 2019-12-08 | 1 | -6/+6
|
* [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument | Sameer Sahasrabuddhe | 2019-11-20 | 1 | -0/+10
| Hostcall is a service that allows a kernel to submit requests to the host using shared buffers, and block until a response is received. This will eventually replace the shared buffer currently used for printf, and repurposes the same hidden kernel argument. This change introduces a new ValueKind in the HSA metadata to represent the hostcall buffer.
| Differential Revision: https://reviews.llvm.org/D70038
* [AMDGPU] gfx908 target | Stanislav Mekhanoshin | 2019-07-09 | 1 | -1/+5
| Differential Revision: https://reviews.llvm.org/D64429
| llvm-svn: 365525
* [AMDGPU][MC][DOC] Updated AMD GPU assembler syntax description. | Dmitry Preobrazhensky | 2019-07-08 | 1 | -1/+1
| Corrected a typo.
| llvm-svn: 365353
* [AMDGPU][MC][DOC] Updated AMD GPU assembler syntax description. | Dmitry Preobrazhensky | 2019-07-08 | 1 | -4/+3
| Summary of changes:
| - added description of GFX10;
| - added description of operands sccz, vccz, lds_direct, etc;
| - minor bugfixing and improvements.
| llvm-svn: 365347
* [AMDGPU] Added a new metadata for multi grid sync implicit argument | Yaxun Liu | 2019-07-05 | 1 | -0/+12
| Patch by Christudasan Devadasan.
| Differential Revision: https://reviews.llvm.org/D63886
| llvm-svn: 365217
* AMDGPU/MC: Add .amdgpu_lds directive | Nicolai Haehnle | 2019-06-25 | 1 | -13/+14
| Summary:
| The directive defines a symbol as a group/local memory (LDS) symbol. LDS symbols behave similarly to common symbols for the purposes of ELF, using the processor-specific SHN_AMDGPU_LDS as section index.
| It is the linker and/or runtime loader's job to "instantiate" LDS symbols and resolve relocations that reference them. It is not possible to initialize LDS memory (not even zero-initialize as for .bss).
| We want to be able to link together objects -- starting with relocatable objects, but possibly expanding to shared objects in the future -- that access LDS memory in a flexible way. LDS memory is in an address space that is entirely separate from the address space that contains the program image (code and normal data), so having program segments for it doesn't really make sense.
| Furthermore, we want to be able to compile multiple kernels in a compilation unit which have disjoint use of LDS memory. In that case, we may want to place LDS symbols differently for different kernels to save memory (LDS memory is very limited and physically private to each kernel invocation), so we can't simply place LDS symbols in a .lds section. Hence this solution where LDS symbols always stay undefined.
| Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f
| Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin
| Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D61493
| llvm-svn: 364296
* [AMDGPU] gfx10 documentation update. NFC. | Stanislav Mekhanoshin | 2019-06-13 | 1 | -1270/+2412
| llvm-svn: 363332
* AMDGPU: Remove amdgpu-max-work-group-size attribute | Matt Arsenault | 2019-06-05 | 1 | -2/+0
| This has been deprecated for a long time, and mesa recently switched to amdgpu-flat-work-group-size.
| llvm-svn: 362641
* Try to fix Sphinx bot. | Zachary Turner | 2019-04-05 | 1 | -8/+6
| llvm-svn: 357790
* AMDGPU: Remove dx10-clamp from subtarget features | Matt Arsenault | 2019-03-29 | 1 | -0/+8
| Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and introduce a new amdgpu-dx10-clamp attribute to override this if desired.
| Also introduce a new amdgpu-ieee attribute to match.
| The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without defining the attribute in the generic Attributes.td.
| Eventually the calling convention lowering will need to insert a mode switch somewhere for these.
| llvm-svn: 357302
* [AMDGPU] Add an additional Code Object V3 assembler example | Scott Linder | 2019-03-29 | 1 | -0/+78
| Document the intended use of the `.amdgcn.next_free_{s,v}gpr` in the context of multiple kernels and functions.
| Differential Revision: https://reviews.llvm.org/D59949
| llvm-svn: 357289
* AMDGPU: Make sram-ecc off by default for Vega20 | Konstantin Zhuravlyov | 2019-03-29 | 1 | -2/+0
| Differential Revision: https://reviews.llvm.org/D59718
| llvm-svn: 357247
* [AMDGPU] Clarify Code Object V2/V3 differences in AMDGPUUsage | Scott Linder | 2019-03-28 | 1 | -25/+128
| Ensure Code Object V2 documentation is complete, but always contains a warning and a link to the equivalent Code Object V3 documentation.
| Explicitly indicate that any note records present in a code object that are not documented must be considered deprecated and ignored.
| Differential Revision: https://reviews.llvm.org/D59782
| llvm-svn: 357176
* AMDGPU: Add support for cross address space synchronization scopes | Konstantin Zhuravlyov | 2019-03-25 | 1 | -56/+74
| Differential Revision: https://reviews.llvm.org/D59517
| llvm-svn: 356946
* [AMDGPU] Add an experimental buffer fat pointer address space. | Neil Henning | 2019-03-18 | 1 | -3/+11
| Add an experimental buffer fat pointer address space that is currently unhandled in the backend. This commit reserves address space 7 as a non-integral pointer representing the 160-bit fat pointer (128-bit buffer descriptor + 32-bit offset) that is heavily used in graphics workloads using the AMDGPU backend.
| Differential Revision: https://reviews.llvm.org/D58957
| llvm-svn: 356373
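The 160-bit size quoted in the entry above follows from its two parts: a 128-bit buffer descriptor plus a 32-bit offset. A minimal C++ sketch of that arithmetic, assuming a plain struct stands in for the pointer; the struct and field names are hypothetical and are not types from the AMDGPU backend:

```cpp
#include <cstdint>

// Hypothetical illustration only: the names below are assumptions,
// not types taken from the AMDGPU backend.
struct BufferFatPointer {
  std::uint32_t descriptor[4]; // 128-bit buffer descriptor (4 x 32 bits)
  std::uint32_t offset;        // 32-bit offset into the buffer
};

// 128 + 32 = 160 bits, i.e. 20 bytes with no padding between the fields.
static_assert(sizeof(BufferFatPointer) * 8 == 160,
              "fat pointer sketch should occupy 160 bits");
```

The sketch only checks the size; it says nothing about how the backend actually represents or lowers address space 7.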
* [AMDGPU][MC][CODEOBJECT] Added predefined symbols to access GPU minor and stepping numbers | Dmitry Preobrazhensky | 2019-02-08 | 1 | -2/+18
| Added the following Code Object v3 symbols:
| .amdgcn.gfx_generation_minor
| .amdgcn.gfx_generation_stepping
| Reviewers: artem.tamazov, kzhuravl
| Differential Revision: https://reviews.llvm.org/D57826
| llvm-svn: 353515
* [AMDGPU][MC][DOC] Updated AMD GPU assembler description | Dmitry Preobrazhensky | 2018-12-17 | 1 | -21/+16
| Stage 2: added detailed description of operands
| See bug 36572: https://bugs.llvm.org/show_bug.cgi?id=36572
| llvm-svn: 349368
* [AMDGPU] Update code object metadata format documentation | Scott Linder | 2018-11-15 | 1 | -34/+519
| * Add amdhsa prefix to names to allow other tools to use the metadata without collision.
| * Make names consistent.
| * Simplify structure.
| * Change note record ID.
| * Switch from YAML to MsgPack format.
| * Document metadata assembler directive.
| Patch By: t-tye (Tony Tye)
| Differential Revision: https://reviews.llvm.org/D53445
| llvm-svn: 346992
* AMDGPU/Docs: Add product names for Vega20 | Konstantin Zhuravlyov | 2018-11-07 | 1 | -5/+2
| Differential Revision: https://reviews.llvm.org/D54178
| llvm-svn: 346354
* AMDGPU/Docs: Fix the processor table | Konstantin Zhuravlyov | 2018-11-06 | 1 | -101/+101
| llvm-svn: 346263
* AMDGPU: Add sram-ecc feature | Konstantin Zhuravlyov | 2018-11-05 | 1 | -18/+33
| Differential Revision: https://reviews.llvm.org/D53222
| llvm-svn: 346177
* [AMDGPU] Defined gfx909 Raven Ridge 2 | Tim Renouf | 2018-10-24 | 1 | -0/+6
| Differential Revision: https://reviews.llvm.org/D53418
| Change-Id: Ie3d054f2e956c2768988c0f4c0ffd29a47294eef
| llvm-svn: 345120
* [docs] Turn of `nasm` highlighting for a code block. | Chandler Carruth | 2018-08-06 | 1 | -1/+1
| This appears to produce a warning on the docs build bot. It doesn't reproduce for me, likely because I have a newer (or more full featured) pygments install.
| llvm-svn: 338978
* AMDHSA: Put old assembler docs back | Konstantin Zhuravlyov | 2018-06-22 | 1 | -6/+101
| Until we switch to code object v3 by default.
| Follow up for https://reviews.llvm.org/D47736.
| Differential Revision: https://reviews.llvm.org/D48497
| llvm-svn: 335378
* [AMDGPU] Update assembler for HSA Code Object v3 | Scott Linder | 2018-06-21 | 1 | -112/+272
| Update AMDGPU assembler syntax behind the code-object-v3 feature:
| * Replace/rename most AMDGPU assembler directives/symbols and document them.
| * Provide more diagnostics (e.g. values out of range, missing values, repeated values).
| * Provide path for backwards compatibility, even with underlying descriptor changes.
| Differential Revision: https://reviews.llvm.org/D47736
| llvm-svn: 335281
* AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z | Konstantin Zhuravlyov | 2018-06-21 | 1 | -9/+2
| and everything that comes with it from implementation and v3 header files. Leave definition in v2 header files for backwards compatibility.
| Differential Revision: https://reviews.llvm.org/D48191
| llvm-svn: 335267
* [AMDGPU] Document the AMDGPU LLVM attributes | Tony Tye | 2018-06-14 | 1 | -1/+31
| Differential Revision: https://reviews.llvm.org/D48101
| llvm-svn: 334733
* AMDHSA: Code object v3 updates | Konstantin Zhuravlyov | 2018-06-12 | 1 | -32/+33
| - Do not emit following assembler directives:
|   - .hsa_code_object_version
|   - .hsa_code_object_isa
|   - .amd_amdgpu_isa
|   - .amd_amdgpu_hsa_metadata
|   - .amd_amdgpu_pal_metadata
| - Do not emit .note entries
| - Cleanup and bring in sync kernel descriptor header file
| - Emit kernel descriptor into .rodata with appropriate relocations and alignments
| llvm-svn: 334519
* AMDGPU: Always set COMPUTE_PGM_RSRC2.ENABLE_TRAP_HANDLER to zero for AMDHSA as | Konstantin Zhuravlyov | 2018-05-29 | 1 | -11/+7
| it is set by CP
| Differential Revision: https://reviews.llvm.org/D47392
| llvm-svn: 333451
* [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution. | Tony Tye | 2018-05-16 | 1 | -7/+26
| No longer require the queue pointer to be passed in in fixed SGPRs.
| Differential Revision: https://reviews.llvm.org/D46769
| llvm-svn: 332485
* AMDGPU: Add Vega12 and Vega20 | Matt Arsenault | 2018-04-30 | 1 | -4/+14
| Changes by Matt Arsenault and Konstantin Zhuravlyov
| llvm-svn: 331215
* [AMDGPU] Add gfx902 product names | Tony Tye | 2018-04-14 | 1 | -5/+2
| Differential Revision: https://reviews.llvm.org/D45609
| llvm-svn: 330081
* [AMDGPU] Update relocation record description | Tony Tye | 2018-04-13 | 1 | -4/+14
| Document which relocation records are static and dynamic.
| Differential Revision: https://reviews.llvm.org/D45587
| llvm-svn: 329981
* [NFC] fix trivial typos in documents and comments | Hiroshi Inoue | 2018-04-12 | 1 | -1/+1
| "is is" -> "is", "if if" -> "if", "or or" -> "or"
| llvm-svn: 329878
* Add AMDPAL Code Conventions section to AMD docs | Tim Corringham | 2018-04-04 | 1 | -0/+119
| Summary: This is a first version of the AMDPAL code conventions. Further updates will undoubtedly be required to fully document AMDPAL.
| Subscribers: nhaehnle, llvm-commits
| Differential Revision: https://reviews.llvm.org/D45246
| llvm-svn: 329188
* [AMDGPU] Define code object identification string used in AMDHSA runtimes. | Tony Tye | 2018-03-27 | 1 | -0/+28
| Differential Revision: https://reviews.llvm.org/D44718
| llvm-svn: 328669
* [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU | Tony Tye | 2018-03-23 | 1 | -4/+8
| Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
| Differential Revision: https://reviews.llvm.org/D44697
| llvm-svn: 328351
* [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU | Tony Tye | 2018-03-23 | 1 | -18/+18
| - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
| - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
| Differential Revision: https://reviews.llvm.org/D43736
| llvm-svn: 328349
* [Documentation] Fix markup problem in AMDGPUUsage.rst. | Eugene Zelenko | 2018-03-21 | 1 | -1/+1
| llvm-svn: 328116
* [TableGen] Pass result of std::unique to vector::erase instead of calculating a size and calling resize. | Craig Topper | 2018-03-20 | 1 | -1/+1
| llvm-svn: 328031
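The idiom named in the entry above, passing the iterator returned by `std::unique` straight to `vector::erase`, can be sketched as follows. This is a generic illustration under stated assumptions: `dedupSorted` is a hypothetical helper and is not code from the TableGen sources.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not taken from the TableGen sources.
// Removes duplicate values from a sorted vector in place.
void dedupSorted(std::vector<int> &values) {
  // std::unique only collapses adjacent duplicates, so the input must be sorted.
  assert(std::is_sorted(values.begin(), values.end()));
  // std::unique returns an iterator to the new logical end of the range;
  // erase then drops everything from that point to the physical end.
  values.erase(std::unique(values.begin(), values.end()), values.end());
}
```

Compared with computing the distance from `begin()` to the returned iterator and calling `resize`, this form needs no size arithmetic and also works for element types that are not default-constructible.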
* [AMDGPU][MC][DOC] Updated AMD GPU assembler description | Dmitry Preobrazhensky | 2018-03-12 | 1 | -29/+31
| See bug 36572: https://bugs.llvm.org/show_bug.cgi?id=36572
| Differential Revision: https://reviews.llvm.org/D44020
| Reviewers: artem.tamazov, vpykhtin
| llvm-svn: 327288
* [AMDGPU] Update AMDGOUUsage.rst descriptions | Tony Tye | 2018-03-08 | 1 | -27/+32
| - Improve description of XNACK ELF flag.
| - Rename all uses of wave to wavefront to be consistent.
| Differential Revision: https://reviews.llvm.org/D43983
| llvm-svn: 326989
* [DebugInfo] Support DWARF v5 source code embedding extension | Scott Linder | 2018-02-23 | 1 | -4/+54
| In DWARF v5 the Line Number Program Header is extensible, allowing values with new content types. In this extension a content type is added, DW_LNCT_LLVM_source, which contains the embedded source code of the file.
| Add new optional attribute for !DIFile IR metadata called source which contains source text. Use this to output the source to the DWARF line table of code objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM to support optional source.
| Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output format of llvm-dwarfdump to make room for the new attribute on file_names entries, and support embedded sources for the -source option in llvm-objdump.
| Differential Revision: https://reviews.llvm.org/D42765
| llvm-svn: 325970
* AMDGPU: Bring elf flags in sync with the spec | Konstantin Zhuravlyov | 2018-02-16 | 1 | -41/+44
| - Add MACH flags
| - Add XNACK flag
| - Add reserved flags
| - Minor cleanups in docs
| Differential Revision: https://reviews.llvm.org/D43356
| llvm-svn: 325399
* [AMDGPU] Change constant addr space to 4 | Yaxun Liu | 2018-02-13 | 1 | -20/+10
| Differential Revision: https://reviews.llvm.org/D43170
| llvm-svn: 325030
* Reapply "AMDGPU: Add 32-bit constant address space" | Matt Arsenault | 2018-02-09 | 1 | -0/+1
| This reverts r324494 and reapplies r324487.
| llvm-svn: 324747