path: root/mlir/tools/mlir-cuda-runner
Commit message | Author | Age | Files | Lines
* [mlir] Mark the MLIR tools for installation in CMake | Kern Handa | 2020-02-05 | 1 | -1/+1
This binplaces `mlir-translate`, `mlir-cuda-runner`, and `mlir-cpu-runner` when building the CMake install target.
Differential Revision: https://reviews.llvm.org/D73986
(cherry picked from commit b8004b7308b490b93231789cd05f86294a77d663)
* Revert "[mlir] Create a gpu.module operation for the GPU Dialect." | Benjamin Kramer | 2020-01-16 | 1 | -1/+1
This reverts commit 4624a1e8ac8a3f69cc887403b976f538f587744a. It was causing problems downstream.
(cherry picked from commit 0133cc60e4e230ee2c176c23eff5aa2f4ee17a75)
* [mlir] Create a gpu.module operation for the GPU Dialect. | Tres Popp | 2020-01-14 | 1 | -1/+1
Summary: This is based on the use of code constantly checking for an attribute on a model and instead represents the distinct operation with a different op. Instead, this op can be used to provide better filtering.
Reviewers: herhut, mravishankar, antiagainst, rriddle
Reviewed By: herhut, antiagainst, rriddle
Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72336
* Adjust License.txt file to use the LLVM license | Mehdi Amini | 2019-12-23 | 2 | -26/+8
PiperOrigin-RevId: 286906740
* Change CUDA tests to use print_memref. | Christian Sigg | 2019-11-21 | 1 | -13/+0
Swap dimensions in all-reduce-op test.
PiperOrigin-RevId: 281791744
* Make type and rank explicit in mcuMemHostRegister function. | Christian Sigg | 2019-11-19 | 1 | -18/+24
Fix registered size of indirect MemRefType kernel arguments.
PiperOrigin-RevId: 281362940
* Add support for alignment attribute in std.alloc. | Nicolas Vasilache | 2019-11-12 | 1 | -0/+1
This CL adds an extra pointer to the memref descriptor to allow specifying alignment.
In a previous implementation, we used 2 types: `linalg.buffer` and `view` where the buffer type was the unit of allocation/deallocation/alignment and `view` was the unit of indexing.
After multiple discussions it was decided to use a single type, which conflates both, so the memref descriptor now needs to carry both pointers.
This is consistent with the [RFC-Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!searchin/mlir/std.view%7Csort:date/mlir/-wKHANzDNTg/4K6nUAp8AAAJ).
PiperOrigin-RevId: 279959463
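Why a descriptor might need to carry two pointers can be illustrated with a plain C++ sketch. This is not the MLIR lowering itself: the `AlignedBuffer` type, its field names, and the `allocateAligned`/`release` helpers are hypothetical, and the sketch assumes a `malloc`-based allocator and a power-of-two alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical two-pointer buffer (names are illustrative, not MLIR's).
struct AlignedBuffer {
  void *allocated; // pointer returned by malloc; the one to pass to free
  void *aligned;   // first address >= allocated that meets the alignment
};

// Over-allocate by (alignment - 1) bytes, then round the address up.
// Assumes `alignment` is a non-zero power of two.
AlignedBuffer allocateAligned(std::size_t bytes, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  void *raw = std::malloc(bytes + alignment - 1);
  if (!raw)
    return {nullptr, nullptr};
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
  std::uintptr_t mask = static_cast<std::uintptr_t>(alignment - 1);
  std::uintptr_t alignedAddr = (addr + mask) & ~mask;
  return {raw, reinterpret_cast<void *>(alignedAddr)};
}

// Deallocation must use the original pointer, not the aligned one.
void release(const AlignedBuffer &buffer) { std::free(buffer.allocated); }
```

Indexing would go through `aligned` while deallocation goes through `allocated`, which is the split between "unit of indexing" and "unit of allocation/deallocation/alignment" that the commit message describes.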
* GPUToCUDA: attach CUBIN to the nested module rather than to the function | Alex Zinenko | 2019-10-08 | 1 | -12/+13
Originally, we were attaching attributes containing CUBIN blobs to the kernel function called by `gpu.launch_func`. This kernel is now contained in a nested module that is used as a compilation unit. Attach compiled CUBIN blobs to the module rather than to the function since we were compiling the module. This also avoids duplication of the attribute on multiple kernels within the same module.
PiperOrigin-RevId: 273497303
* Fuse GenerateCubinAccessors pass into LaunchFunctToCuda | Alex Zinenko | 2019-10-08 | 1 | -1/+0
Now that the accessor function is a trivial getter of the global variable, it makes less sense to have the getter generation as a separate pass. Move the getter generation into the lowering of `gpu.launch_func` to CUDA calls. This change is mostly code motion, but the process can be simplified further by generating the addressof inplace instead of using a call. This will be done in a follow-up.
PiperOrigin-RevId: 273492517
* NFC: rename Conversion/ControlFlowToCFG to Conversion/LoopToStandard | Alex Zinenko | 2019-10-03 | 1 | -1/+1
This makes the name of the conversion pass more consistent with the naming scheme, since it actually converts from the Loop dialect to the Standard dialect rather than working with arbitrary control flow operations.
PiperOrigin-RevId: 272612112
* Replace spurious `long` stride type by int64_t - NFC | Nicolas Vasilache | 2019-10-02 | 1 | -1/+1
PiperOrigin-RevId: 272425434
* Normalize MemRefType lowering to LLVM as strided MemRef descriptor | Nicolas Vasilache | 2019-09-30 | 1 | -1/+3
This CL finishes the implementation of the lowering part of the [strided memref RFC](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Strided memrefs correspond conceptually to the following templated C++ struct:
```
template <typename Elem, size_t Rank>
struct {
  Elem *ptr;
  int64_t offset;
  int64_t sizes[Rank];
  int64_t strides[Rank];
};
```
The linearization procedure for address calculation for strided memrefs is the same as for linalg views: `base_offset + SUM_i index_i * stride_i`.
The following CL will unify Linalg and Standard by removing !linalg.view in favor of strided memrefs.
PiperOrigin-RevId: 272033399
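The linearization formula `base_offset + SUM_i index_i * stride_i` can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions: the struct name `StridedMemRef` and the helpers `linearIndex` and `elementAt` are hypothetical (the struct in the commit message is unnamed), `Rank` is assumed to be at least 1, and the bounds check is an addition for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical named version of the descriptor from the commit message.
template <typename Elem, std::size_t Rank>
struct StridedMemRef {
  Elem *ptr;
  int64_t offset;
  int64_t sizes[Rank];
  int64_t strides[Rank];
};

// Linear element position: offset + sum over i of indices[i] * strides[i].
template <typename Elem, std::size_t Rank>
int64_t linearIndex(const StridedMemRef<Elem, Rank> &memref,
                    const int64_t (&indices)[Rank]) {
  int64_t linear = memref.offset;
  for (std::size_t i = 0; i < Rank; ++i) {
    assert(indices[i] >= 0 && indices[i] < memref.sizes[i] &&
           "index out of bounds");
    linear += indices[i] * memref.strides[i];
  }
  return linear;
}

// Element access through the descriptor, using the linearized position.
template <typename Elem, std::size_t Rank>
Elem &elementAt(const StridedMemRef<Elem, Rank> &memref,
                const int64_t (&indices)[Rank]) {
  return memref.ptr[linearIndex(memref, indices)];
}
```

For a row-major 2x3 buffer, sizes are `{2, 3}` and strides are `{3, 1}`, so with a zero offset the index `(1, 2)` maps to position `1 * 3 + 2 * 1`.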
* Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering. | Nicolas Vasilache | 2019-09-27 | 1 | -11/+22
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues. Solving the ABI problem generally is a very complex problem and most likely involves taking a dependence on clang that we do not want atm.
A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implements.
PiperOrigin-RevId: 271591708
* Outline GPU kernel function into a nested module. | Christian Sigg | 2019-09-23 | 1 | -35/+7
Roll forward of commit 5684a12.
When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
PiperOrigin-RevId: 270639748
* Automated rollback of commit 5684a12434f923d03b6870f2aa16226bfb0b38b6 | George Karpenkov | 2019-09-19 | 1 | -7/+35
PiperOrigin-RevId: 270126672
* Outline GPU kernel function into a nested module. | MLIR Team | 2019-09-19 | 1 | -35/+7
When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
PiperOrigin-RevId: 269987720
* NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase. | River Riddle | 2019-09-13 | 1 | -1/+1
These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.
PiperOrigin-RevId: 268970259
* Refactor the pass manager to support operations other than FuncOp/ModuleOp. | River Riddle | 2019-09-02 | 1 | -1/+1
This change generalizes the structure of the pass manager to allow arbitrary nesting pass managers for other operations, at any level. The only user visible change to existing code is the fact that a PassManager must now provide an MLIRContext on construction. A new class `OpPassManager` has been added that represents a pass manager on a specific operation type. `PassManager` will remain the top-level entry point into the pipeline, with OpPassManagers being nested underneath. OpPassManagers will still be implicitly nested if the operation type on the pass differs from the pass manager. To explicitly build a pipeline, the 'nest' methods on OpPassManager may be used:

    // Pass manager for the top-level module.
    PassManager pm(ctx);

    // Nest a pipeline operating on FuncOp.
    OpPassManager &fpm = pm.nest<FuncOp>();
    fpm.addPass(...);

    // Nest a pipeline under the FuncOp pipeline that operates on spirv::ModuleOp
    OpPassManager &spvModulePM = pm.nest<spirv::ModuleOp>();

    // Nest a pipeline on FuncOps inside of the spirv::ModuleOp.
    OpPassManager &spvFuncPM = spvModulePM.nest<FuncOp>();

To help accomplish this a new general OperationPass is added that operates on opaque Operations. This pass can be inserted in a pass manager of any type to operate on any operation opaquely. An example of this opaque OperationPass is a VerifierPass, that simply runs the verifier opaquely on the current operation.

    /// Pass to verify an operation and signal failure if necessary.
    class VerifierPass : public OperationPass<VerifierPass> {
      void runOnOperation() override {
        Operation *op = getOperation();
        if (failed(verify(op)))
          signalPassFailure();
        markAllAnalysesPreserved();
      }
    };

PiperOrigin-RevId: 266840344
* Port mlir-cuda-runner to use dialect conversion framework. | Stephan Herhut | 2019-08-28 | 1 | -23/+26
Instead of lowering the program in two steps (Standard->LLVM followed by GPU->NVVM), leading to invalid IR in between, the runner now uses one pattern-based rewrite step to go directly from Standard+GPU to LLVM+NVVM.
PiperOrigin-RevId: 265861934
* NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory. | River Riddle | 2019-08-19 | 1 | -1/+1
PiperOrigin-RevId: 264193915
* Change from llvm::make_unique to std::make_unique | Jacques Pienaar | 2019-08-17 | 1 | -2/+2
Switch to C++14 standard method as llvm::make_unique has been removed (https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next integrates.
PiperOrigin-RevId: 263953918
* NFC: Implement OwningRewritePatternList as a class instead of a using directive. | River Riddle | 2019-08-05 | 1 | -1/+1
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.
PiperOrigin-RevId: 261816316
* Move GPU dialect to {lib,include/mlir}/Dialect | Alex Zinenko | 2019-07-25 | 1 | -2/+2
Per tacit agreement, individual dialects should now live in lib/Dialect/Name with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.
PiperOrigin-RevId: 259896851
* NFC: Expose a ConversionPatternRewriter for use with ConversionPatterns. | River Riddle | 2019-07-19 | 1 | -2/+3
This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions.
PiperOrigin-RevId: 258818122
* Move shared cpu runner library to Support/JitRunner. | Stephan Herhut | 2019-07-16 | 2 | -7/+5
PiperOrigin-RevId: 258347825
* Lower affine control flow to std control flow to LLVM dialect | Nicolas Vasilache | 2019-07-12 | 1 | -0/+1
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM
The conversions mostly consist of splitting concerns between the affine and non-affine worlds from existing conversions. Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process; it can be reintroduced later if needed.
LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.
PiperOrigin-RevId: 257794436
* mcuMemHostRegister: take into account sizeof(float) | Alex Zinenko | 2019-07-12 | 1 | -2/+3
cuMemHostRegister expects the size of registered memory in bytes whereas the memref descriptor in memref_t contains the number of elements. Get the actual size in bytes instead.
PiperOrigin-RevId: 257589116
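The element-count versus byte-count distinction can be sketched in standalone C++. This is only an illustration, not the runner's code: the helpers `numElements` and `sizeInBytes` are hypothetical names, and the sketch assumes a contiguous buffer whose shape is given as an array of non-negative sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of elements described by a shape: the product of its sizes.
template <std::size_t Rank>
int64_t numElements(const int64_t (&sizes)[Rank]) {
  int64_t count = 1;
  for (std::size_t i = 0; i < Rank; ++i) {
    assert(sizes[i] >= 0 && "sizes must be non-negative");
    count *= sizes[i];
  }
  return count;
}

// Size in bytes of a contiguous buffer of Elem with the given shape.
// This is the quantity a byte-based registration call would need,
// as opposed to the raw element count.
template <typename Elem, std::size_t Rank>
std::size_t sizeInBytes(const int64_t (&sizes)[Rank]) {
  return static_cast<std::size_t>(numElements(sizes)) * sizeof(Elem);
}
```

For a one-dimensional buffer of `float` with 16 elements, `sizeInBytes<float>` yields `16 * sizeof(float)`, not 16.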
* NFC: Rename Module to ModuleOp. | River Riddle | 2019-07-10 | 1 | -2/+2
Module is a legacy name that only exists as a typedef of ModuleOp.
PiperOrigin-RevId: 257427248
* NFC: Rename Function to FuncOp. | River Riddle | 2019-07-10 | 1 | -2/+2
PiperOrigin-RevId: 257293379
* Simplify launch_func rewrite pattern in mlir-cuda-runner | Alex Zinenko | 2019-07-05 | 1 | -6/+2
Use the bulk setOperands on the cloned operation, no need to cast or iterate over the operand list.
PiperOrigin-RevId: 256643496
* Add an mlir-cuda-runner tool. | Stephan Herhut | 2019-07-04 | 3 | -0/+339
This tool allows executing MLIR IR snippets written in the GPU dialect on a CUDA-capable GPU. For this to work, a working CUDA install is required and the build has to be configured with MLIR_CUDA_RUNNER_ENABLED set to 1.
PiperOrigin-RevId: 256551415