path: root/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
Commit message (Author, Age, Files, Lines)
* Revert "[mlir] Create a gpu.module operation for the GPU Dialect." (Benjamin Kramer, 2020-01-16, 1 file, -10/+11)
  This reverts commit 4624a1e8ac8a3f69cc887403b976f538f587744a. Causing problems downstream.
  (cherry picked from commit 0133cc60e4e230ee2c176c23eff5aa2f4ee17a75)
* [mlir] Create a gpu.module operation for the GPU Dialect. (Tres Popp, 2020-01-14, 1 file, -11/+10)
  Summary: This is based on the use of code constantly checking for an attribute on a model and instead represents the distinct operation with a different op. Instead, this op can be used to provide better filtering.
  Reviewers: herhut, mravishankar, antiagainst, rriddle
  Reviewed By: herhut, antiagainst, rriddle
  Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72336
* Unbreak the mlir build after 202ab273e6eca134b69882f100c666fcd3affbcf (Benjamin Kramer, 2020-01-13, 1 file, -1/+1)
* [mlir] Added missing GPU lowering ops. (Julian Gross, 2020-01-13, 1 file, -1/+8)
  Summary: This diff adds missing GPU lowering ops to MLIR.
  Reviewers: herhut, pifon2a, ftynse
  Tags: #pre-merge_beta_testing, #llvm
  Differential Revision: https://reviews.llvm.org/D72439
* [mlir] NFC: Remove Value::operator* and Value::operator-> now that Value is properly value-typed. (River Riddle, 2020-01-11, 1 file, -13/+13)
  Summary: These were temporary methods used to simplify the transition.
  Reviewed By: antiagainst
  Differential Revision: https://reviews.llvm.org/D72548
* NFC: Replace ValuePtr with Value and remove it now that Value is value-typed. (River Riddle, 2019-12-23, 1 file, -108/+107)
  ValuePtr was a temporary typedef during the transition to a value-typed Value.
  PiperOrigin-RevId: 286945714
* Adjust License.txt file to use the LLVM license (Mehdi Amini, 2019-12-23, 1 file, -13/+4)
  PiperOrigin-RevId: 286906740
* NFC: Introduce new ValuePtr/ValueRef typedefs to simplify the transition to Value being value-typed. (River Riddle, 2019-12-22, 1 file, -108/+109)
  This is an initial step to refactoring the representation of OpResult as proposed in: https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ
  This change will make it much simpler to incrementally transition all of the existing code to use value-typed semantics.
  PiperOrigin-RevId: 286844725
* Add gpu.shuffle op. (Christian Sigg, 2019-12-20, 1 file, -2/+60)
  This will allow us to lower most of gpu.all_reduce (when all_reduce doesn't exist in the target dialect) within the GPU dialect, and only do target-specific lowering for the shuffle op.
  PiperOrigin-RevId: 286548256
* NFC: Remove unnecessary 'llvm::' prefix from uses of llvm symbols declared in `mlir` namespace. (River Riddle, 2019-12-18, 1 file, -3/+3)
  Aside from being cleaner, this also makes the codebase more consistent.
  PiperOrigin-RevId: 286206974
* Harden the requirements to memory attribution types in gpu.func (Alex Zinenko, 2019-12-18, 1 file, -2/+25)
  When memory attributions are present in `gpu.func`, require that they are of memref type and live in memory spaces 3 and 5 for workgroup and private memory attributions, respectively. Adapt the conversion from the GPU dialect to the NVVM dialect to drop the private memory space from attributions as NVVM is able to model them as local `llvm.alloca`s in the default memory space.
  PiperOrigin-RevId: 286161763
* Plug gpu.func into the GPU lowering pipelines (Alex Zinenko, 2019-12-16, 1 file, -3/+37)
  This updates the lowering pipelines from the GPU dialect to lower-level dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation instead of a standard function annotated with an attribute. In particular, the kernel outlining is updated to produce gpu.func instead of std.func and the individual conversions are updated to consume gpu.funcs and disallow standard funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved in the generic syntax, but can also be used with the custom syntax on gpu.funcs. The special kind of function for GPU allows one to use additional features such as memory attribution.
  PiperOrigin-RevId: 285822272
* Fix maskAndClamp in gpu.all_reduce. (Christian Sigg, 2019-12-13, 1 file, -2/+5)
  The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but it could be anything).
  PiperOrigin-RevId: 285343160
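The role of the clamp can be illustrated with a small host-side model. This is only a sketch in plain C++, not the NVVM lowering itself: `shuffleDown` and `reducePartialWarp` are hypothetical names, and the sketch assumes the documented "down" shuffle behaviour in which a lane reads from `lane + offset` when that source lane does not exceed the clamp, and otherwise keeps its own value with a false predicate. Setting the clamp to the index of the last active lane makes the predicate false exactly where the source lane does not exist.

```cpp
#include <cassert>
#include <vector>

// Result of one modelled shuffle: the value read and the validity predicate.
struct ShuffleResult {
  int value;
  bool valid;
};

// Hypothetical host-side model of a "shuffle down" with a clamp.
// The lane reads from lane + offset if that source lane is <= clamp;
// otherwise it gets its own value back and the predicate is false.
ShuffleResult shuffleDown(const std::vector<int> &lanes, int lane, int offset,
                          int clamp) {
  const int source = lane + offset;
  const bool valid = source <= clamp;
  return {lanes[valid ? source : lane], valid};
}

// Sums the values of a (possibly partial) warp of at most 32 lanes.
// The clamp is the index of the last active lane, so contributions from
// lanes that do not exist are skipped through the predicate.
int reducePartialWarp(std::vector<int> lanes) {
  assert(!lanes.empty() && lanes.size() <= 32);
  const int numLanes = static_cast<int>(lanes.size());
  const int clamp = numLanes - 1;
  for (int offset = 16; offset > 0; offset /= 2) {
    // All lanes shuffle simultaneously, so read from a snapshot.
    const std::vector<int> snapshot = lanes;
    for (int lane = 0; lane < numLanes; ++lane) {
      const ShuffleResult r = shuffleDown(snapshot, lane, offset, clamp);
      if (r.valid)
        lanes[lane] += r.value;
    }
  }
  return lanes[0]; // lane 0 holds the sum of all active lanes
}
```

For example, `reducePartialWarp({1, 2, 3, 4, 5})` uses a clamp of 4, and lane 0 ends up holding the sum of the five active lanes. With a clamp fixed at 31, the predicate in this model would never be false, which matches the problem the commit message describes.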
* Automated rollback of commit f68ac464d818629e0fe10c23b44ac782d64a12d2 (Christian Sigg, 2019-12-12, 1 file, -2/+2)
  PiperOrigin-RevId: 285162061
* Switch from shfl.bfly to shfl.down. (Christian Sigg, 2019-12-12, 1 file, -2/+2)
  Both work for the current use case, but the latter allows implementing prefix sums and is a little easier to understand for partial warps.
  PiperOrigin-RevId: 285145287
* NFC: Fix naming inconsistency: FuncOpLowering -> GPUFuncOpLowering. (Christian Sigg, 2019-12-11, 1 file, -7/+3)
  Remove nested anonymous namespace.
  PiperOrigin-RevId: 284987357
* Add a function to get lowering patterns from GPU to NVVM. (Stephan Herhut, 2019-12-11, 1 file, -13/+18)
  This enables combining the patterns with other patterns into larger lowerings.
  PiperOrigin-RevId: 284979271
* Update the builder API to take ValueRange instead of ArrayRef<Value *> (River Riddle, 2019-12-07, 1 file, -4/+3)
  This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.
  PiperOrigin-RevId: 284360710
* LLVM::GlobalOp: take address space as builder argument (Alex Zinenko, 2019-12-06, 1 file, -7/+2)
  Accept the address space of the global as a builder argument when constructing an LLVM::GlobalOp instance. This decreases the reliance of LLVM::GlobalOp users on the internal name of the attribute used for this purpose. Update several uses of the address space in GPU to NVVM conversion.
  PiperOrigin-RevId: 284233254
* Add conversions of GPU func with memory attributions to LLVM/NVVM (Alex Zinenko, 2019-12-06, 1 file, -1/+144)
  GPU functions use memory attributions, a combination of Op attributes and region arguments, to specify function-wide buffers placed in workgroup or private memory spaces. Introduce a lowering pattern for GPU functions to be converted to LLVM functions taking into account memory attributions. Workgroup attributions get transformed into module-level globals with unique names derived from function names. Private attributions get converted into llvm.allocas inside the function body. In both cases, we inject at the beginning of the function the IR that obtains the raw pointer to the data and populates a MemRef descriptor based on the MemRef type of buffer, making attributions compose with the rest of the MemRef lowering and transparent for use with std.load and std.store. While using raw pointers instead of descriptors might have been more efficient, it is better implemented as a canonicalization or a separate transformation so that non-attribution memrefs could also benefit from it.
  PiperOrigin-RevId: 284208396
* Introduce Linkage attribute to the LLVM dialect (Alex Zinenko, 2019-12-02, 1 file, -2/+2)
  LLVM IR supports linkage on global objects such as global variables and functions. Introduce the Linkage attribute into the LLVM dialect, backed by an integer storage. Use this attribute on LLVM::GlobalOp and make it mandatory. Implement parsing/printing of the attribute and conversion to LLVM IR. See tensorflow/mlir#277.
  PiperOrigin-RevId: 283309328
* Rename CLI flags -lower-gpu-ops-to-*-ops to -convert-gpu-to-* (Alex Zinenko, 2019-11-18, 1 file, -2/+1)
  This makes the flags consistent with the naming scheme used elsewhere in the codebase for dialect conversions.
  PiperOrigin-RevId: 281027517
* Add custom lowering of ExpOp for NVVM and ROCM. (Alexander Belyaev, 2019-10-24, 1 file, -1/+4)
  PiperOrigin-RevId: 276440911
* Unify GPU op definition names with other dialects. (Christian Sigg, 2019-10-21, 1 file, -9/+10)
  Rename GPU op names from gpu_Foo to GPU_FooOp.
  PiperOrigin-RevId: 275882232
* Get active source lane predicate from shuffle instruction. (Christian Sigg, 2019-10-19, 1 file, -18/+20)
  nvvm.shfl.sync.bfly optionally returns a predicate indicating whether the source lane was active. Support for this was added to clang in https://reviews.llvm.org/D68892. Add an optional 'pred' unit attribute to the instruction to return this predicate. Specify this attribute in the partial warp reduction so we don't need to manually compute the predicate.
  PiperOrigin-RevId: 275616564
* Add gpu.barrier op to synchronize invocations of a local workgroup. (Christian Sigg, 2019-10-18, 1 file, -0/+4)
  Adding gen table for rewrite patterns from GPU to NVVM dialect. Copy missing op documentation from GPUOps.td to GPU.md.
  PiperOrigin-RevId: 275419588
* Support custom accumulator provided as region to gpu.all_reduce. (Christian Sigg, 2019-10-16, 1 file, -8/+105)
  In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as an arbitrary code region. Adds a gpu.yield op to specify the result of the accumulation. Also support more types (integers) and accumulations (mul).
  PiperOrigin-RevId: 275065447
* Mark GPU dialect as illegal when lowering to NVVM. (Christian Sigg, 2019-10-10, 1 file, -0/+1)
  PiperOrigin-RevId: 273948293
* Use llvm.func to define functions with wrapped LLVM IR function type (Alex Zinenko, 2019-10-10, 1 file, -2/+0)
  This function-like operation allows one to define functions that have wrapped LLVM IR function type, in particular variadic functions. The operation was added in parallel to the existing lowering flow, this commit only switches the flow to use it. Using a custom function type makes the LLVM IR dialect type system more consistent and avoids complex conversion rules for functions that previously had to use the built-in function type instead of a wrapped LLVM IR dialect type and perform conversions during the analysis.
  PiperOrigin-RevId: 273910855
* Change to doxygen comments. NFC. (Christian Sigg, 2019-10-09, 1 file, -63/+63)
  PiperOrigin-RevId: 273707610
* Support reduction of partial warps. (Christian Sigg, 2019-10-07, 1 file, -88/+159)
  gpu.all_reduce now supports block sizes that are not multiples of 32.
  PiperOrigin-RevId: 273255204
* Moving the GPUIndexIntrinsicOpLowering template to a common location (Deven Desai, 2019-10-04, 1 file, -70/+1)
  The GPUIndexIntrinsicOpLowering template is currently used by the code in both the GPUToNVVM and GPUToROCDL dirs. Moving it to a common location to remove code duplication.
  Closes tensorflow/mlir#163
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/163 from deven-amd:deven-refactor-gpu-index-ops-lowering b8dc2a5f5353df196039b6ff2ad42106028693ed
  PiperOrigin-RevId: 272863297
* Fix typos, NFC. (Christian Sigg, 2019-10-04, 1 file, -1/+1)
  PiperOrigin-RevId: 272851237
* Change all_reduce lowering to support 2D and 3D blocks. (Christian Sigg, 2019-10-01, 1 file, -43/+124)
  Perform second reduce only with first warp. This requires an additional __syncthreads(), but doesn't need special handling when the last warp is small. This simplifies support for block sizes that are not multiples of 32. Supporting partial warp reduce will be done in a separate CL.
  PiperOrigin-RevId: 272168917
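The two-stage scheme described above can be modelled on the host. This is a rough sketch in plain C++, not the generated NVVM code: `blockAllReduce` is a hypothetical name, the per-warp buffer stands in for shared memory, the comment marks where the barrier would sit, and the 1024-thread limit is an assumption taken from the original AllReduceOp commit further down this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side model of a block-wide sum in two stages.
// `threads` holds one value per thread of the block.
int blockAllReduce(const std::vector<int> &threads) {
  constexpr std::size_t kWarpSize = 32;
  assert(!threads.empty() && threads.size() <= 1024);

  // Stage 1: each warp reduces its own lanes. The last warp may be
  // partial when the block size is not a multiple of 32.
  std::vector<int> warpSums; // stands in for a shared-memory buffer
  for (std::size_t begin = 0; begin < threads.size(); begin += kWarpSize) {
    const std::size_t end = std::min(begin + kWarpSize, threads.size());
    const auto first = threads.begin() + static_cast<std::ptrdiff_t>(begin);
    const auto last = threads.begin() + static_cast<std::ptrdiff_t>(end);
    warpSums.push_back(std::accumulate(first, last, 0));
  }

  // A block-wide barrier would go here, so that every per-warp sum is
  // visible before the second stage starts.

  // Stage 2: only the first warp reduces the per-warp sums. With at most
  // 1024 threads there are at most 32 of them, so one warp is enough.
  assert(warpSums.size() <= kWarpSize);
  return std::accumulate(warpSums.begin(), warpSums.end(), 0);
}
```

Calling `blockAllReduce(std::vector<int>(70, 1))` models a block of 70 threads: two full warps and one partial warp of six lanes feed three per-warp sums into the second stage. In a real all-reduce the result would then be made available to every thread, which this sketch omits.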
* Switch comments from GPU dialect terms to CUDA terms (NFC). (Christian Sigg, 2019-09-30, 1 file, -8/+7)
  local workgroup -> block, subgroup -> warp, invocation -> thread.
  PiperOrigin-RevId: 271946342
* Add AllReduceOp to GPU dialect with lowering to NVVM. (Christian Sigg, 2019-09-26, 1 file, -2/+169)
  The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup". The implementation is currently limited to sizes that are multiples of 32 (warp size) and no larger than 1024.
  PiperOrigin-RevId: 271290265
* Outline GPU kernel function into a nested module. (Christian Sigg, 2019-09-23, 1 file, -38/+17)
  Roll forward of commit 5684a12. When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
  PiperOrigin-RevId: 270639748
* Automated rollback of commit 5684a12434f923d03b6870f2aa16226bfb0b38b6 (George Karpenkov, 2019-09-19, 1 file, -17/+38)
  PiperOrigin-RevId: 270126672
* Outline GPU kernel function into a nested module. (MLIR Team, 2019-09-19, 1 file, -38/+17)
  When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
  PiperOrigin-RevId: 269987720
* NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase. (River Riddle, 2019-09-13, 1 file, -1/+1)
  These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.
  PiperOrigin-RevId: 268970259
* Port mlir-cuda-runner to use dialect conversion framework. (Stephan Herhut, 2019-08-28, 1 file, -66/+98)
  Instead of lowering the program in two steps (Standard->LLVM followed by GPU->NVVM), leading to invalid IR in between, the runner now uses one pattern-based rewrite step to go directly from Standard+GPU to LLVM+NVVM.
  PiperOrigin-RevId: 265861934
* NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory. (River Riddle, 2019-08-19, 1 file, -2/+2)
  PiperOrigin-RevId: 264193915
* Change from llvm::make_unique to std::make_unique (Jacques Pienaar, 2019-08-17, 1 file, -1/+1)
  Switch to C++14 standard method as llvm::make_unique has been removed (https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next integrates.
  PiperOrigin-RevId: 263953918
* Express ownership transfer in PassManager API through std::unique_ptr (NFC) (Mehdi Amini, 2019-08-12, 1 file, -2/+2)
  Since raw pointers are always passed around for IR constructs without implying any ownership transfer, it can be error-prone to have implicit ownership transferred the same way. For example this code can seem harmless:
    Pass *pass = ....
    pm.addPass(pass);
    pm.addPass(pass);
    pm.run(module);
  PiperOrigin-RevId: 263053082
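A toy version of the idea, for illustration only: `ToyPass`, `ToyPassManager`, and `buildToyPipeline` are hypothetical stand-ins, not the MLIR classes, and the sketch only assumes that the pass manager takes its passes by `std::unique_ptr`. Under that assumption the transfer of ownership is visible at the call site, and handing the same pass object to the manager twice is no longer something that compiles silently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pass; not the MLIR Pass class.
struct ToyPass {
  virtual ~ToyPass() = default;
};

// Hypothetical stand-in for a pass manager that owns its passes.
class ToyPassManager {
public:
  // Taking std::unique_ptr by value makes the ownership transfer explicit.
  void addPass(std::unique_ptr<ToyPass> pass) {
    passes.push_back(std::move(pass));
  }
  std::size_t size() const { return passes.size(); }

private:
  std::vector<std::unique_ptr<ToyPass>> passes;
};

// Builds a two-pass pipeline and returns the number of passes it holds.
std::size_t buildToyPipeline() {
  ToyPassManager pm;
  std::unique_ptr<ToyPass> pass = std::make_unique<ToyPass>();
  pm.addPass(std::move(pass)); // ownership moves into the manager
  assert(pass == nullptr);     // the caller's handle is now empty
  // pm.addPass(pass);         // would not compile: unique_ptr is not copyable
  pm.addPass(std::make_unique<ToyPass>()); // a second pass needs its own object
  return pm.size();
}
```

Here `buildToyPipeline()` ends with two distinct pass objects owned by the manager, whereas the raw-pointer snippet quoted in the commit message hands the same pointer over twice.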
* Move GPU dialect to {lib,include/mlir}/Dialect (Alex Zinenko, 2019-07-25, 1 file, -1/+1)
  Per tacit agreement, individual dialects should now live in lib/Dialect/Name with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.
  PiperOrigin-RevId: 259896851
* GPUtoNVVM: adjust integer bitwidth when lowering special register ops (Alex Zinenko, 2019-06-25, 1 file, -6/+27)
  GPU dialect operations (launch and launch_func) use `index` type for thread and block index values inside the kernel, for compatibility with affine loops. NVVM dialect operations, following the NVVM intrinsics, use `!llvm.i32` type, which does not necessarily have the same bit width as the lowered `index` type. Optionally sign-extend (indices are signed) or truncate the result of the NVVM dialect operation to the bit width of the lowered `index` type before passing it to other operations. This behavior is consistent with `std.index_cast`. We cannot use the latter since we are targeting LLVM dialect types directly, rather than standard integer types.
  PiperOrigin-RevId: 254980868
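The width adjustment itself is plain integer arithmetic and can be sketched outside MLIR. The helper below is hypothetical (`adjustToIndexWidth` is not a function from this file): it models taking the 32-bit result of a special-register read and sign-extending or truncating it to an assumed `index` width of `indexBits` bits, with the result held in a 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a 32-bit value to a signed integer of
// `indexBits` bits, returned in an int64_t.
std::int64_t adjustToIndexWidth(std::int32_t value, unsigned indexBits) {
  assert(indexBits >= 1 && indexBits <= 64);
  if (indexBits >= 32) {
    // Wider (or equal) index type: sign-extend, since indices are signed.
    return static_cast<std::int64_t>(value);
  }
  // Narrower index type: truncate to the low `indexBits` bits and
  // reinterpret the result as a signed value of that width.
  const std::uint32_t mask = (1u << indexBits) - 1u;
  const std::uint32_t low = static_cast<std::uint32_t>(value) & mask;
  const std::uint32_t signBit = 1u << (indexBits - 1);
  return static_cast<std::int64_t>(low ^ signBit) -
         static_cast<std::int64_t>(signBit);
}
```

For instance, `adjustToIndexWidth(-1, 64)` keeps the value negative after widening, while `adjustToIndexWidth(0x1234, 8)` keeps only the low byte.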
* Fix GPUToNVVM naming: NNVM should have been NVVM (Alex Zinenko, 2019-06-19, 1 file, -1/+1)
  Rename `createLowerGpuOpsToNNVMOpsPass` to `createLowerGpuOpsToNVVMOpsPass`.
  PiperOrigin-RevId: 253801577
* Use llvm::StringSwitch in lowering of GPU ops to NVVM ops. (Stephan Herhut, 2019-06-19, 1 file, -9/+8)
  PiperOrigin-RevId: 253767688
* Add lowering pass from GPU dialect operations to LLVM/NVVM intrinsics. (Stephan Herhut, 2019-06-19, 1 file, -0/+119)
  PiperOrigin-RevId: 253551452