| author | Deven Desai <36858332+deven-amd@users.noreply.github.com> | 2019-09-26 23:49:51 -0700 |
|---|---|---|
| committer | A. Unique TensorFlower <gardener@tensorflow.org> | 2019-09-27 00:22:32 -0700 |
| commit | fee40fef5c37fee2b398d4d6ec28958bf5c0c0f5 (patch) | |
| tree | e87354a12e4a24d998d8d7592e5bdeb06880c573 /mlir/test/Target | |
| parent | 7385d8789560a392971c60426c7d17569551bd32 (diff) | |
[ROCm] Adding ROCDL Dialect.
This commit introduces the ROCDL Dialect (i.e. the ROCDL ops + the code to lower those ROCDL ops to LLVM intrinsics/functions). Think of ROCDL Dialect as analogous to the NVVM Dialect, but for AMD GPUs. This patch contains just the essentials needed to get a simple example up and running. We expect to make further additions to the ROCDL Dialect.
This is the first of 3 commits. The follow-up commits will:
* add a pass that lowers GPU Dialect to ROCDL Dialect
* add a "mlir-rocm-runner" utility
Closes tensorflow/mlir#146
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/146 from deven-amd:deven-rocdl-dialect e78e8005c75a78912631116c78dc844fcc4b0de9
PiperOrigin-RevId: 271511259
Diffstat (limited to 'mlir/test/Target')
| -rw-r--r-- | mlir/test/Target/rocdl.mlir | 35 |
1 files changed, 35 insertions, 0 deletions
diff --git a/mlir/test/Target/rocdl.mlir b/mlir/test/Target/rocdl.mlir
new file mode 100644
index 00000000000..5665b7156e8
--- /dev/null
+++ b/mlir/test/Target/rocdl.mlir
@@ -0,0 +1,35 @@
+// RUN: mlir-translate -mlir-to-rocdlir %s | FileCheck %s
+
+func @rocdl_special_regs() -> !llvm.i32 {
+  // CHECK-LABEL: rocdl_special_regs
+  // CHECK: call i32 @llvm.amdgcn.workitem.id.x()
+  %1 = rocdl.workitem.id.x : !llvm.i32
+  // CHECK: call i32 @llvm.amdgcn.workitem.id.y()
+  %2 = rocdl.workitem.id.y : !llvm.i32
+  // CHECK: call i32 @llvm.amdgcn.workitem.id.z()
+  %3 = rocdl.workitem.id.z : !llvm.i32
+  // CHECK: call i32 @llvm.amdgcn.workgroup.id.x()
+  %4 = rocdl.workgroup.id.x : !llvm.i32
+  // CHECK: call i32 @llvm.amdgcn.workgroup.id.y()
+  %5 = rocdl.workgroup.id.y : !llvm.i32
+  // CHECK: call i32 @llvm.amdgcn.workgroup.id.z()
+  %6 = rocdl.workgroup.id.z : !llvm.i32
+  // CHECK: call i32 @__ockl_get_local_size(i32 0)
+  %7 = rocdl.workgroup.dim.x : !llvm.i32
+  // CHECK: call i32 @__ockl_get_local_size(i32 1)
+  %8 = rocdl.workgroup.dim.y : !llvm.i32
+  // CHECK: call i32 @__ockl_get_local_size(i32 2)
+  %9 = rocdl.workgroup.dim.z : !llvm.i32
+  // CHECK: call i32 @__ockl_get_global_size(i32 0)
+  %10 = rocdl.grid.dim.x : !llvm.i32
+  // CHECK: call i32 @__ockl_get_global_size(i32 1)
+  %11 = rocdl.grid.dim.y : !llvm.i32
+  // CHECK: call i32 @__ockl_get_global_size(i32 2)
+  %12 = rocdl.grid.dim.z : !llvm.i32
+  llvm.return %1 : !llvm.i32
+}
+
+func @kernel_func() attributes {gpu.kernel} {
+  // CHECK-LABEL: amdgpu_kernel void @kernel_func
+  llvm.return
+}
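The CHECK lines in this test follow a regular pattern: the `id` ops become `llvm.amdgcn.*` intrinsic calls, while the `dim` ops become calls to `__ockl_get_local_size` or `__ockl_get_global_size` with the axis index (0, 1, 2 for x, y, z) as the argument. The Python sketch below only restates that pattern as read off the test. The helper name `expected_call` is hypothetical and is not part of MLIR or of this patch, and the sketch says nothing about how the translation is actually implemented.

```python
# Illustrative summary of the op -> LLVM IR call pattern in rocdl.mlir.
# `expected_call` is a hypothetical helper, not an MLIR API.
_AXES = ("x", "y", "z")


def expected_call(op: str) -> str:
    """Return the LLVM IR call that the test's CHECK line expects for `op`."""
    prefix, _, axis = op.rpartition(".")
    if axis not in _AXES:
        raise ValueError(f"unknown axis in op: {op!r}")

    # Work-item and work-group IDs map to AMDGCN intrinsics with no arguments.
    if prefix in ("rocdl.workitem.id", "rocdl.workgroup.id"):
        intrinsic = prefix.replace("rocdl", "llvm.amdgcn", 1)
        return f"call i32 @{intrinsic}.{axis}()"

    # Dimension queries map to function calls taking the axis index.
    index = _AXES.index(axis)
    if prefix == "rocdl.workgroup.dim":
        return f"call i32 @__ockl_get_local_size(i32 {index})"
    if prefix == "rocdl.grid.dim":
        return f"call i32 @__ockl_get_global_size(i32 {index})"

    raise ValueError(f"op not covered by the test: {op!r}")


if __name__ == "__main__":
    for name in ("rocdl.workitem.id.x", "rocdl.workgroup.dim.y", "rocdl.grid.dim.z"):
        print(f"{name:24} -> {expected_call(name)}")
```

Keeping the mapping in one place like this can make it easier to see which ops the test covers when new ROCDL ops are added later.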

