summaryrefslogtreecommitdiffstats
path: root/mlir/test/Dialect/GPU
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[mlir] Create a gpu.module operation for the GPU Dialect."Benjamin Kramer2020-01-163-7/+8
| | | | | | | This reverts commit 4624a1e8ac8a3f69cc887403b976f538f587744a. Causing problems downstream. (cherry picked from commit 0133cc60e4e230ee2c176c23eff5aa2f4ee17a75)
* [mlir] Create a gpu.module operation for the GPU Dialect.Tres Popp2020-01-143-8/+7
| | | | | | | | | | | | | | | | | Summary: This is based on the use of code constantly checking for an attribute on a model and instead represents the distinct operaion with a different op. Instead, this op can be used to provide better filtering. Reviewers: herhut, mravishankar, antiagainst, rriddle Reviewed By: herhut, antiagainst, rriddle Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72336
* [mlir][GPU] introduce utilities for promotion to workgroup memoryAlex Zinenko2020-01-091-0/+119
| | | | | | | | | | | | | | | | | | | Introduce a set of function that promote a memref argument of a `gpu.func` to workgroup memory using memory attribution. The promotion boils down to additional loops performing the copy from the original argument to the attributed memory in the beginning of the function, and back at the end of the function using all available threads. The loop bounds are specified so as to adapt to any size of the workgroup. These utilities are intended to compose with other existing utilities (loop coalescing and tiling) in cases where the distribution of work across threads is uneven, e.g. copying a 2D memref with only the threads along the "x" dimension. Similarly, specialization of the kernel to specific launch sizes should be implemented as a separate pass combining constant propagation and canonicalization. Introduce a simple attribute-driven pass to test the promotion transformation since we don't have a heuristic at the moment. Differential revision: https://reviews.llvm.org/D71904
* Add gpu.shuffle op.Christian Sigg2019-12-202-0/+19
| | | | | | | | This will allow us to lower most of gpu.all_reduce (when all_reduce doesn't exist in the target dialect) within the GPU dialect, and only do target-specific lowering for the shuffle op. PiperOrigin-RevId: 286548256
* Harden the requirements to memory attribution types in gpu.funcAlex Zinenko2019-12-181-0/+33
| | | | | | | | | | When memory attributions are present in `gpu.func`, require that they are of memref type and live in memoryspaces 3 and 5 for workgroup and private memory attributions, respectively. Adapt the conversion from the GPU dialect to the NVVM dialect to drop the private memory space from attributions as NVVM is able to model them as local `llvm.alloca`s in the default memory space. PiperOrigin-RevId: 286161763
* Plug gpu.func into the GPU lowering pipelinesAlex Zinenko2019-12-163-15/+14
| | | | | | | | | | | | | | This updates the lowering pipelines from the GPU dialect to lower-level dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation instead of a standard function annotated with an attribute. In particular, the kernel outlining is updated to produce gpu.func instead of std.func and the individual conversions are updated to consume gpu.funcs and disallow standard funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved in the generic syntax, but can also be used with the custom syntax on gpu.funcs. The special kind of function for GPU allows one to use additional features such as memory attribution. PiperOrigin-RevId: 285822272
* Introduce Linkage attribute to the LLVM dialectAlex Zinenko2019-12-021-2/+2
| | | | | | | | | | | LLVM IR supports linkage on global objects such as global variables and functions. Introduce the Linkage attribute into the LLVM dialect, backed by an integer storage. Use this attribute on LLVM::GlobalOp and make it mandatory. Implement parsing/printing of the attribute and conversion to LLVM IR. See tensorflow/mlir#277. PiperOrigin-RevId: 283309328
* Introduce gpu.funcAlex Zinenko2019-11-252-0/+71
| | | | | | | | | | | | | | | | | | Introduce a new function-like operation to the GPU dialect to provide a placeholder for the execution semantic description and to add support for GPU memory hierarchy. This aligns with the overall goal of the dialect to expose the common abstraction layer for GPU devices, in particular by providing an MLIR unit of semantics (i.e. an operation) for memory modeling. This proposal has been discussed in the mailing list: https://groups.google.com/a/tensorflow.org/d/msg/mlir/RfXNP7Hklsc/MBNN7KhjAgAJ As decided, the "convergence" aspect of the execution model will be factored out into a new discussion and therefore is not included in this commit. This commit only introduces the operation but does not hook it up with the remaining flow. The intention is to develop the new flow while keeping the old flow operational and do the switch in a simple, separately reversible commit. PiperOrigin-RevId: 282357599
* Don't force newline before function attributesAlex Zinenko2019-11-211-1/+1
| | | | | | | | | | | Due to legacy reasons, a newline character followed by two spaces was always inserted before the attributes of the function Op in pretty form. This breaks formatting when functions are nested in some other operations. Don't print the newline and just put the attributes on the same line, which is also more consistent with module Op. Line breaking aware of indentation can be introduced separately into the parser if deemed useful. PiperOrigin-RevId: 281721793
* Extend kernel outlining to also consider dim worth inlining.Stephan Herhut2019-11-201-1/+1
| | | | PiperOrigin-RevId: 281483447
* Look for SymbolRefAttr in KernelOutlining instead of hard-coding CallOpMLIR Team2019-11-081-4/+14
| | | | | | | | This code should be exercised using the existing kernel outlining unit test, but let me know if I should add a dedicated unit test using a fake call instruction as well. PiperOrigin-RevId: 279436321
* Convert the Canonicalize and CSE passes to generic Operation Passes.River Riddle2019-10-241-1/+1
| | | | | | This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore. PiperOrigin-RevId: 276573038
* Fix minor spelling tweaks (NFC)Kazuaki Ishizaki2019-10-201-1/+1
| | | | | | Closes tensorflow/mlir#175 PiperOrigin-RevId: 275726876
* Use StrEnumAttr for gpu.allreduce op instead of StringAttr to better encode ↵Stephan Herhut2019-10-181-1/+1
| | | | | | constraints. PiperOrigin-RevId: 275448372
* Add gpu.barrier op to synchronize invocations of a local workgroup.Christian Sigg2019-10-181-0/+2
| | | | | | | | Adding gen table for rewrite patterns from GPU to NVVM dialect. Copy missing op documentation from GPUOps.td to GPU.md. PiperOrigin-RevId: 275419588
* Support custom accumulator provided as region to gpu.all_reduce.Christian Sigg2019-10-162-1/+79
| | | | | | | | | | In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as arbitrary code region. Adds a gpu.yield op to specify the result of the accumulation. Also support more types (integers) and accumulations (mul). PiperOrigin-RevId: 275065447
* Use named modules for gpu.launch_funcAlex Zinenko2019-10-083-123/+215
| | | | | | | | | | | | | | | | | | The kernel function called by gpu.launch_func is now placed into an isolated nested module during the outlining stage to simplify separate compilation. Until recently, modules did not have names and could not be referenced. This limitation was circumvented by introducing a stub kernel at the same name at the same nesting level as the module containing the actual kernel. This relation is only effective in one direction: from actual kernel function to its launch_func "caller". Leverage the recently introduced symbol name attributes on modules to refer to a specific nested module from `gpu.launch_func`. This removes the implicit connection between the identically named stub and kernel functions. It also enables support for `gpu.launch_func`s to call different kernels located in the same module. PiperOrigin-RevId: 273491891
* Promote MemRefDescriptor to a pointer to struct when passing function ↵Nicolas Vasilache2019-09-271-7/+11
| | | | | | | | | | | | | boundaries in LLVMLowering. The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues. Solving the ABI problem generally is a very complex problem and most likely involves taking a dependence on clang that we do not want atm. A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement. PiperOrigin-RevId: 271591708
* Add AllReduceOp to GPU dialect with lowering to NVVM.Christian Sigg2019-09-261-0/+3
| | | | | | | | The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup". The implementation is currently limited to sizes that are multiple 32 (warp size) and no larger than 1024. PiperOrigin-RevId: 271290265
* Clone called functions into nested GPU module.Christian Sigg2019-09-241-2/+12
| | | | PiperOrigin-RevId: 270891190
* Outline GPU kernel function into a nested module.Christian Sigg2019-09-231-6/+30
| | | | | | | | Roll forward of commit 5684a12. When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module. PiperOrigin-RevId: 270639748
* Automated rollback of commit 5684a12434f923d03b6870f2aa16226bfb0b38b6George Karpenkov2019-09-191-30/+6
| | | | PiperOrigin-RevId: 270126672
* Outline GPU kernel function into a nested module.MLIR Team2019-09-191-6/+30
| | | | | | When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module. PiperOrigin-RevId: 269987720
* Addressing some late review comments on kernel inlining.Stephan Herhut2019-09-091-1/+3
| | | | | | Just formatting and better lit tests, no functional change. PiperOrigin-RevId: 267942907
* Make GPU kernel outlining test independent of value names.Stephan Herhut2019-09-051-24/+36
| | | | PiperOrigin-RevId: 267323604
* Make GPU kernel outlining inline constants.Stephan Herhut2019-09-041-0/+20
| | | | | | | It is generally beneficial to pass less arguments to a kernel, so cloning constants into the kernel is beneficial. PiperOrigin-RevId: 267139084
* Add verification for dimension attribute on GPUDialect index operations.Stephan Herhut2019-08-281-0/+36
| | | | PiperOrigin-RevId: 266073204
* Move GPU dialect to {lib,include/mlir}/DialectAlex Zinenko2019-07-254-0/+362
Per tacit agreement, individual dialects should now live in lib/Dialect/Name with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name. PiperOrigin-RevId: 259896851
OpenPOWER on IntegriCloud