summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/NVPTX/wmma.py
Commit message (Collapse)AuthorAgeFilesLines
* [NVPTX] Added llvm.nvvm.mma.m8n8k4.* intrinsicsArtem Belevich2019-10-281-59/+101
| | | | Differential Revision: https://reviews.llvm.org/D69324
* PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32.Artem Belevich2019-04-251-70/+357
| | | | | | | | | | | | All of the new instructions are still handled mostly by tablegen. I've slightly refactored the code to drive intrinsic/instruction generation from a master list of supported variants, so all irregularities have to be implemented in one place only. The test generation script wmma.py has been refactored in a similar way. Differential Revision: https://reviews.llvm.org/D60015 llvm-svn: 359247
* [NVPTX] generate correct MMA instruction mnemonics with PTX63+.Artem Belevich2019-04-251-3/+14
| | | | | | | | | | | PTX 6.3 requires using ".aligned" in the MMA instruction names. In order to generate correct name, now we pass current PTX version to each instruction as an extra constant operand and InstPrinter adjusts its output accordingly. Differential Revision: https://reviews.llvm.org/D59393 llvm-svn: 359246
* Python compat - print statementSerge Guelton2019-01-031-0/+2
| | | | | | | | | Make sure all print statements are compatible with Python 2 and Python3 using the `from __future__ import print_function` statement. Differential Revision: https://reviews.llvm.org/D56249 llvm-svn: 350307
* [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma ↵Artem Belevich2018-04-181-18/+23
| | | | | | | | | | instructions. The new instructions were added added for sm_70+ GPUs in CUDA-9.1. Differential Revision: https://reviews.llvm.org/D45068 llvm-svn: 330296
* [NVPTX] Make tensor shape part of WMMA intrinsic's name.Artem Belevich2018-03-211-39/+44
| | | | | | | | | | This is needed for the upcoming implementation of the new 8x32x16 and 32x8x16 variants of WMMA instructions introduced in CUDA 9.1. Differential Revision: https://reviews.llvm.org/D44719 llvm-svn: 328158
* [NVPTX] Make tensor load/store intrinsics overloaded.Artem Belevich2018-03-201-16/+36
| | | | | | | | | | | | | | | | This way we can support address-space specific variants without explicitly encoding the space in the name of the intrinsic. Less intrinsics to deal with -> less boilerplate. Added a bit of tablegen magic to match/replace an intrinsics with a pointer argument in particular address space with the space-specific instruction variant. Updated tests to use non-default address spaces. Differential Revision: https://reviews.llvm.org/D43268 llvm-svn: 328006
* [NVPTX] Implemented wmma intrinsics and instructions.Artem Belevich2017-10-121-0/+201
WMMA = "Warp Level Matrix Multiply-Accumulate". These are the new instructions introduced in PTX6.0 and available on sm_70 GPUs. Differential Revision: https://reviews.llvm.org/D38645 llvm-svn: 315601
OpenPOWER on IntegriCloud