path: root/mlir/lib/Conversion
Commit message | Author | Age | Files | Lines
...
* Moving the GPUIndexIntrinsicOpLowering template to a common location | Deven Desai | 2019-10-04 | 3 | -139/+96
  The GPUIndexIntrinsicOpLowering template is currently used by the code in both the GPUToNVVM and GPUToROCDL dirs. Moving it to a common location to remove code duplication.
  Closes tensorflow/mlir#163
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/163 from deven-amd:deven-refactor-gpu-index-ops-lowering b8dc2a5f5353df196039b6ff2ad42106028693ed
  PiperOrigin-RevId: 272863297
* Fix typos, NFC. | Christian Sigg | 2019-10-04 | 3 | -4/+4
  PiperOrigin-RevId: 272851237
* Add fpext and fptrunc to the Standard dialect and include conversion to LLVM | MLIR Team | 2019-10-03 | 1 | -3/+13
  PiperOrigin-RevId: 272768027
* Add parentheses around boolean operators in assert | Alex Zinenko | 2019-10-03 | 1 | -2/+3
  This removes a warning and is generally a good practice.
  PiperOrigin-RevId: 272613597
* NFC: rename Conversion/ControlFlowToCFG to Conversion/LoopToStandard | Alex Zinenko | 2019-10-03 | 5 | -17/+18
  This makes the name of the conversion pass more consistent with the naming scheme, since it actually converts from the Loop dialect to the Standard dialect rather than working with arbitrary control flow operations.
  PiperOrigin-RevId: 272612112
* Extract MemRefType::getStridesAndOffset as a free function and fix dynamic offset determination. | Nicolas Vasilache | 2019-10-02 | 1 | -7/+7
  This also adds coverage with a missing test, which uncovered a bug in the conditional for testing whether an offset is dynamic or not.
  PiperOrigin-RevId: 272505798
* [ROCm] Adding pass to lower GPU Dialect to ROCDL Dialect. | Deven Desai | 2019-10-02 | 3 | -0/+159
  This is a follow-up to PR tensorflow/mlir#146, which introduced the ROCDL Dialect. This PR introduces a pass to lower GPU Dialect to the ROCDL Dialect. As with the previous PR, this one builds on the work done by @whchung, and addresses most of the review comments in the original PR.
  Closes tensorflow/mlir#154
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/154 from deven-amd:deven-lower-gpu-to-rocdl 809893e08236da5ab6a38e3459692fa04247773d
  PiperOrigin-RevId: 272390729
* Fix and simplify CallOp/CallIndirectOp to LLVM::CallOp conversion | Alex Zinenko | 2019-10-01 | 1 | -38/+4
  A recent ABI compatibility change affected the conversion from standard CallOp/CallIndirectOp to LLVM::CallOp by changing its signature. In order to analyze the signature, the code was looking up the callee symbol in the module. This is incorrect since, during the conversion, the module may contain both the original and the converted function op that have the same symbol name. There is no strict guarantee on which of the two symbols will be found by the lookup. The conversion was not failing because the type legalizer converts the LLVM types to themselves, making the original and the converted function signatures ultimately produce the same type.

  Instead of looking up the function signature to get the list of result types, use the types of the CallOp/CallIndirectOp results, which must match those of the function in valid IR. These types are guaranteed to be the original, unconverted types when converting the operation. Furthermore, this avoids the need to perform a lookup of a symbol name in the module, which may be expensive.

  Finally, propagate attributes as-is from the original op to the converted op since they share the attribute name for the callee of direct calls and the rest of the attributes are not affected by the conversion. This removes the need for additional contortions between direct and indirect calls to extract the name of the optional callee attribute only to insert it back. This also prevents the conversion from unintentionally dropping the other attributes of the op.
  PiperOrigin-RevId: 272218871
* Unify Linalg types by using strided memrefs | Nicolas Vasilache | 2019-10-01 | 1 | -34/+36
  This CL finishes the implementation of the Linalg + Affine type unification of the [strided memref RFC](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). As a consequence, the !linalg.view type, linalg::DimOp, linalg::LoadOp and linalg::StoreOp can now disappear and Linalg can use standard types everywhere.
  PiperOrigin-RevId: 272187165
* Change all_reduce lowering to support 2D and 3D blocks. | Christian Sigg | 2019-10-01 | 1 | -43/+124
  Perform second reduce only with first warp. This requires an additional __syncthreads(), but doesn't need special handling when the last warp is small. This simplifies support for block sizes that are not a multiple of 32. Supporting partial warp reduce will be done in a separate CL.
  PiperOrigin-RevId: 272168917
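Illustration (not part of the commit): the two-stage scheme described above can be simulated on the host. This is only a sketch in plain C++, not the NVVM lowering itself; the names `kWarpSize` and `blockAllReduceAdd` are hypothetical, and the "add" reduction and the 1024-thread limit are taken from the AllReduceOp commit further down this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumed warp size, matching the "32 (warp size)" mentioned in this log.
constexpr std::size_t kWarpSize = 32;

// Host-side simulation: threadValues[i] is the value held by thread i.
int blockAllReduceAdd(const std::vector<int> &threadValues) {
  // With at most 1024 threads there are at most 32 partial sums,
  // so a single warp can hold all of them in the second stage.
  assert(threadValues.size() <= 1024);

  // Stage 1: every warp reduces its own values. The last warp may be
  // smaller than kWarpSize; it simply sums fewer elements.
  std::vector<int> warpSums;
  for (std::size_t begin = 0; begin < threadValues.size(); begin += kWarpSize) {
    std::size_t end = std::min(begin + kWarpSize, threadValues.size());
    auto first = threadValues.begin() + static_cast<std::ptrdiff_t>(begin);
    auto last = threadValues.begin() + static_cast<std::ptrdiff_t>(end);
    warpSums.push_back(std::accumulate(first, last, 0));
  }

  // On a GPU, a block-wide barrier would be needed here so that all
  // partial sums are visible before the second stage reads them.

  // Stage 2: only the first warp reduces the per-warp partial sums.
  return std::accumulate(warpSums.begin(), warpSums.end(), 0);
}
```

For example, a block of 70 threads that each hold the value 1 forms three warps (32, 32 and 6 threads), and the second stage adds the three partial sums to give 70.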
* Normalize MemRefType lowering to LLVM as strided MemRef descriptor | Nicolas Vasilache | 2019-09-30 | 1 | -57/+123
  This CL finishes the implementation of the lowering part of the [strided memref RFC](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).

  Strided memrefs correspond conceptually to the following templated C++ struct:
  ```
  template <typename Elem, size_t Rank>
  struct {
    Elem *ptr;
    int64_t offset;
    int64_t sizes[Rank];
    int64_t strides[Rank];
  };
  ```
  The linearization procedure for address calculation for strided memrefs is the same as for linalg views: `base_offset + SUM_i index_i * stride_i`.

  The following CL will unify Linalg and Standard by removing !linalg.view in favor of strided memrefs.
  PiperOrigin-RevId: 272033399
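Illustration (not part of the commit): a minimal sketch of the address calculation above, assuming a named version of the conceptual struct. The names `StridedMemRef`, `linearize` and `elementAt` are hypothetical and are not MLIR API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Named stand-in for the conceptual descriptor in the commit message.
template <typename Elem, std::size_t Rank>
struct StridedMemRef {
  Elem *ptr;
  int64_t offset;
  int64_t sizes[Rank];
  int64_t strides[Rank];
};

// Computes base_offset + SUM_i index_i * stride_i.
template <typename Elem, std::size_t Rank>
int64_t linearize(const StridedMemRef<Elem, Rank> &memref,
                  const std::array<int64_t, Rank> &indices) {
  int64_t linear = memref.offset;
  for (std::size_t i = 0; i < Rank; ++i) {
    // Each index must stay within the size of its dimension.
    assert(indices[i] >= 0 && indices[i] < memref.sizes[i]);
    linear += indices[i] * memref.strides[i];
  }
  return linear;
}

// Returns a reference to the element addressed by the given indices.
template <typename Elem, std::size_t Rank>
Elem &elementAt(const StridedMemRef<Elem, Rank> &memref,
                const std::array<int64_t, Rank> &indices) {
  return memref.ptr[linearize(memref, indices)];
}
```

For a 3x4 row-major view with offset 1 and strides {4, 1}, the indices (2, 1) map to 1 + 2*4 + 1*1 = 10.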
* Switch comments from GPU dialect terms to CUDA terms (NFC). | Christian Sigg | 2019-09-30 | 1 | -8/+7
  local workgroup -> block, subgroup -> warp, invocation -> thread.
  PiperOrigin-RevId: 271946342
* Add TODO to revisit coupling of CallOp to MemRefType lowering | Nicolas Vasilache | 2019-09-27 | 1 | -0/+3
  PiperOrigin-RevId: 271619132
* Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering. | Nicolas Vasilache | 2019-09-27 | 2 | -15/+199
  The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues. Solving the ABI problem generally is a very complex problem and most likely involves taking a dependence on clang that we do not want atm. A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implements.
  PiperOrigin-RevId: 271591708
* Add AllReduceOp to GPU dialect with lowering to NVVM. | Christian Sigg | 2019-09-26 | 1 | -2/+169
  The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup". The implementation is currently limited to sizes that are multiples of 32 (warp size) and no larger than 1024.
  PiperOrigin-RevId: 271290265
* Introduce splat op + provide its LLVM lowering | Uday Bondhugula | 2019-09-24 | 1 | -17/+39
  - introduce splat op in standard dialect (currently for int/float/index input type, output type can be vector or statically shaped tensor)
  - implement LLVM lowering (when result type is 1-d vector)
  - add constant folding hook for it
  - while on Ops.cpp, fix some stale names
  Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#141
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/141 from bondhugula:splat 48976a6aa0a75be6d91187db6418de989e03eb51
  PiperOrigin-RevId: 270965304
* Normalize lowering of MemRef types | Nicolas Vasilache | 2019-09-24 | 1 | -127/+58
  The RFC for unifying Linalg and Affine compilation passes into an end-to-end flow with a predictable ABI and linkage to external function calls raised the question of why we have variable sized descriptors for memrefs depending on whether they have static or dynamic dimensions (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).

  This CL standardizes the ABI on the rank of the memrefs. The LLVM struct for a memref becomes equivalent to:
  ```
  template <typename Elem, size_t Rank>
  struct {
    Elem *ptr;
    int64_t sizes[Rank];
  };
  ```
  PiperOrigin-RevId: 270947276
* Add convenience methods to set an OpBuilder insertion point after an Operation (NFC) | Mehdi Amini | 2019-09-23 | 1 | -1/+1
  PiperOrigin-RevId: 270727180
* Outline GPU kernel function into a nested module. | Christian Sigg | 2019-09-23 | 3 | -77/+79
  Roll forward of commit 5684a12.

  When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
  PiperOrigin-RevId: 270639748
* Fix a number of Clang-Tidy warnings. | Christian Sigg | 2019-09-23 | 2 | -2/+2
  PiperOrigin-RevId: 270632324
* Add integer sign- and zero-extension and truncation to standard. | Manuel Freiberger | 2019-09-21 | 1 | -2/+18
  This adds sign- and zero-extension and truncation of integer types to the standard dialect. This allows performing integer type conversions without having to go to the LLVM dialect and introduce custom type casts (between standard and LLVM integer types).
  Closes tensorflow/mlir#134
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/134 from ombre5733:sext-zext-trunc-in-std c7657bc84c0ca66b304e53ec03797e09152e4d31
  PiperOrigin-RevId: 270479722
* Automated rollback of commit 5684a12434f923d03b6870f2aa16226bfb0b38b6 | George Karpenkov | 2019-09-19 | 3 | -79/+77
  PiperOrigin-RevId: 270126672
* Outline GPU kernel function into a nested module. | MLIR Team | 2019-09-19 | 3 | -77/+79
  When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
  PiperOrigin-RevId: 269987720
* Unify error messages to start with lower-case. | MLIR Team | 2019-09-18 | 2 | -3/+3
  PiperOrigin-RevId: 269803466
* Error out when kernel function is not found while translating GPU calls. | MLIR Team | 2019-09-16 | 1 | -0/+4
  PiperOrigin-RevId: 269327909
* Drop makePositionAttr and the like in favor of Builder::getI64ArrayAttr | Alex Zinenko | 2019-09-16 | 1 | -14/+5
  The helper functions makePositionAttr() and positionAttr() were originally introduced in the lowering-to-LLVM-dialect pass to construct integer array attributes that are used for static positions in extract/insertelement. Constructing an integer array attribute being fairly common, a utility function Builder::getI64ArrayAttr was later introduced into the Builder API. Drop makePositionAttr and similar homegrown functions and use that API instead.
  PiperOrigin-RevId: 269295836
* Add mechanism to specify extended instruction sets in SPIR-V. | Mahesh Ravishankar | 2019-09-15 | 1 | -20/+5
  Add support for specifying extended instruction sets. The operations in SPIR-V dialect are named as 'spv.<extension-name>.<op-name>'. Use this mechanism to define an 'Exp' operation from GLSL(450) instructions. Later CLs will add support for (de)serialization of these operations, and update the dialect generation scripts to auto-generate the specification using the spec directly.

  Additional changes: Add a type constraint to OpBase.td to check for vectors of specified lengths. This is used to check that the vector types used in SPIR-V dialect are of lengths 2, 3 or 4. Update SPIRVBase.td to use this type constraint for vectors.
  PiperOrigin-RevId: 269234377
* Update SPIR-V symbols and use GLSL450 instead of VulkanKHR | Lei Zhang | 2019-09-13 | 1 | -1/+1
  SPIR-V recently published v1.5, which brings a bunch of symbols into core. So the suffix "KHR"/"EXT"/etc. is removed from the symbols. We use a script to pull information from the spec directly.

  Also changed conversion and tests to use GLSL450 instead of VulkanKHR memory model. GLSL450 is still the main memory model supported by Vulkan shaders and it does not require extra capability to enable.
  PiperOrigin-RevId: 268992661
* NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase. | River Riddle | 2019-09-13 | 10 | -11/+11
  These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.
  PiperOrigin-RevId: 268970259
* Overload LLVM::TerminatorOp::build() for empty operands list. | MLIR Team | 2019-09-09 | 1 | -8/+7
  PiperOrigin-RevId: 268041584
* Retain address space during MLIR > LLVM conversion. | MLIR Team | 2019-09-04 | 1 | -12/+9
  PiperOrigin-RevId: 267206460
* Add folding rule and dialect materialization hook for spv.constant | Lei Zhang | 2019-09-03 | 2 | -5/+2
  This will allow us to use MLIR's folding infrastructure to deduplicate SPIR-V constants.

  This CL also changed isValidSPIRVType in SPIRVDialect to a static method.
  PiperOrigin-RevId: 266984403
* Add missing lowering to CFG in mlir-cpu-runner + related cleanup | Mehdi Amini | 2019-09-01 | 2 | -5/+5
  - the list of passes run by mlir-cpu-runner included -lower-affine and -lower-to-llvm but was missing -lower-to-cfg (because -lower-affine at some point used to lower straight to CFG); add -lower-to-cfg in between. IR with affine ops can now be run by mlir-cpu-runner.
  - update -lower-to-cfg to be consistent with other passes (create*Pass methods were changed to return unique ptrs, but -lower-to-cfg appears to have been missed).
  - mlir-cpu-runner was unable to parse custom form of affine ops - fix link options
  - drop unnecessary run options from test/mlir-cpu-runner/simple.mlir (none of the test cases had loops)
  - -convert-to-llvmir was changed to -lower-to-llvm at some point, but the create pass method name wasn't updated (this pass converts/lowers to LLVM dialect as opposed to LLVM IR). Fix this. (If we prefer "convert", the cmd-line options could be changed to "-convert-to-llvm/cfg" then.)
  Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#115
  PiperOrigin-RevId: 266666909
* Refactor the 'walk' methods for operations. | River Riddle | 2019-08-29 | 1 | -1/+1
  This change refactors and cleans up the implementation of the operation walk methods. After this refactoring, the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:

    op->walk<AffineForOp>([](AffineForOp op) { ... });

  is now accomplished via:

    op->walk([](AffineForOp op) { ... });

  PiperOrigin-RevId: 266209552
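Illustration (not part of the commit): one way a walk function can deduce the operation type from its callback is to inspect the callback's argument type. The sketch below uses toy classes and a hypothetical `FirstArg` trait; it is not how MLIR implements `walk`, and unlike MLIR's value-typed op wrappers it passes ops by reference and filters with `dynamic_cast`.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Toy operation hierarchy standing in for real IR (hypothetical types).
struct Operation {
  virtual ~Operation() = default;
  std::vector<std::unique_ptr<Operation>> children;
};
struct AffineForOp : Operation {};
struct ReturnOp : Operation {};

// Trait that extracts the single argument type of a lambda's operator().
template <typename T>
struct FirstArg : FirstArg<decltype(&T::operator())> {};
template <typename C, typename R, typename A>
struct FirstArg<R (C::*)(A) const> { using type = A; };
template <typename C, typename R, typename A>
struct FirstArg<R (C::*)(A)> { using type = A; };

// Visits the nested operations, then the root, and invokes the callback
// only on operations whose dynamic type matches the callback's argument.
template <typename Fn>
void walk(Operation &root, Fn &&callback) {
  using Arg = typename FirstArg<std::decay_t<Fn>>::type;
  using OpTy = std::remove_cv_t<std::remove_reference_t<Arg>>;
  for (auto &child : root.children)
    walk(*child, callback);
  if (auto *op = dynamic_cast<OpTy *>(&root))
    callback(*op);
}
```

With this sketch, `walk(root, [&](AffineForOp &op) { ... });` visits only the `AffineForOp` instances without naming the type as a template argument. Generic lambdas (with `auto` parameters) are not supported by this simple trait.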
* Port mlir-cuda-runner to use dialect conversion framework. | Stephan Herhut | 2019-08-28 | 1 | -66/+98
  Instead of lowering the program in two steps (Standard->LLVM followed by GPU->NVVM), leading to invalid IR in between, the runner now uses one pattern-based rewrite step to go directly from Standard+GPU to LLVM+NVVM.
  PiperOrigin-RevId: 265861934
* Enhance GPU To SPIR-V conversion to support builtins and load/store ops. | Mahesh Ravishankar | 2019-08-27 | 3 | -22/+171
  To support a conversion of a simple load-compute-store kernel from GPU dialect to SPIR-V dialect, the conversion of operations like "gpu.block_dim" and "gpu.thread_id", which allow threads to get the launch configuration, is needed. In SPIR-V these are specified as global variables with builtin attributes. This CL adds support to specify builtin variables in SPIR-V conversion framework. This is used to convert the relevant operations from GPU dialect to SPIR-V dialect.

  Also add support for conversion of load/store operations in Standard dialect to SPIR-V dialect. To simplify the conversion, add a method to build a spv.AccessChain operation that automatically determines the return type based on the base pointer type and the indices provided.
  PiperOrigin-RevId: 265718525
* Let LLVMOpLowering specify a PatternBenefit - NFC | Nicolas Vasilache | 2019-08-22 | 1 | -3/+3
  Currently the benefit is always set to 1, which limits the ability to do A->B->C lowering.
  PiperOrigin-RevId: 264854146
* NFC: Move AffineOps dialect to the Dialect sub-directory. | River Riddle | 2019-08-20 | 2 | -2/+2
  PiperOrigin-RevId: 264482571
* ConvertLaunchFuncToCudaCalls: use LLVM dialect globals | Alex Zinenko | 2019-08-20 | 2 | -52/+29
  This conversion has been using a stack-allocated array of i8 to store the null-terminated kernel name in order to pass it to the CUDA wrappers expecting a C string because the LLVM dialect was missing support for globals. Now that the support is introduced, use a global instead.

  Refactor global string construction from GenerateCubinAccessors into a common utility function living in the LLVM namespace.
  PiperOrigin-RevId: 264382489
* Add support for LLVM lowering of binary ops on n-D vector types | Nicolas Vasilache | 2019-08-20 | 1 | -27/+124
  This CL allows binary operations on n-D vector types to be lowered to LLVMIR by performing an (n-1)-D extractvalue, 1-D vector operation and an (n-1)-D insertvalue.
  PiperOrigin-RevId: 264339118
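Illustration (not part of the commit): the extract / 1-D operation / insert pattern can be sketched for the 2-D case with nested `std::array` values standing in for an array of 1-D LLVM vectors. The names `Vec1D`, `add1D` and `add2D` are hypothetical; the actual lowering emits LLVM dialect operations rather than C++ loops.

```cpp
#include <array>
#include <cstddef>

// A 1-D vector; in the lowering this would be a single LLVM vector value.
template <typename T, std::size_t N>
using Vec1D = std::array<T, N>;

// Stand-in for one 1-D vector instruction (here: elementwise add).
template <typename T, std::size_t N>
Vec1D<T, N> add1D(const Vec1D<T, N> &lhs, const Vec1D<T, N> &rhs) {
  Vec1D<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = lhs[i] + rhs[i];
  return result;
}

// A 2-D vector is modeled as an array of M 1-D vectors.
template <typename T, std::size_t M, std::size_t N>
std::array<Vec1D<T, N>, M> add2D(const std::array<Vec1D<T, N>, M> &lhs,
                                 const std::array<Vec1D<T, N>, M> &rhs) {
  std::array<Vec1D<T, N>, M> result{};
  for (std::size_t m = 0; m < M; ++m) {
    const Vec1D<T, N> &lhsRow = lhs[m];  // plays the role of extractvalue
    const Vec1D<T, N> &rhsRow = rhs[m];  // plays the role of extractvalue
    result[m] = add1D(lhsRow, rhsRow);   // 1-D op, then insertvalue
  }
  return result;
}
```

Higher ranks would follow the same shape, with one loop level per leading dimension and the innermost dimension always handled by the 1-D operation.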
* Move Linalg and VectorOps dialects to the Dialect subdir - NFC | Nicolas Vasilache | 2019-08-19 | 1 | -1/+1
  PiperOrigin-RevId: 264277760
* NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory. | River Riddle | 2019-08-19 | 9 | -11/+11
  PiperOrigin-RevId: 264193915
* Refactor linalg lowering to LLVM | Nicolas Vasilache | 2019-08-19 | 2 | -11/+9
  The linalg.view type used to be lowered to a struct containing a data pointer, offset, sizes/strides information. This was problematic when passing to external functions due to ABI, struct padding and alignment issues.

  The linalg.view type is now lowered to LLVMIR as a *pointer* to a struct containing the data pointer, offset and sizes/strides. This simplifies the interfacing with external library functions and makes it trivial to add new functions without creating a shim that would go from a value type struct to a pointer type.

  The consequences are that:
  1. lowering explicitly uses llvm.alloca in lieu of llvm.undef and performs the proper llvm.load/llvm.store where relevant.
  2. the shim creation function `getLLVMLibraryCallDefinition` disappears.
  3. views are passed by pointer, scalars are passed by value. In the future, other structs will be passed by pointer (on a per-need basis).
  PiperOrigin-RevId: 264183671
* Change from llvm::make_unique to std::make_unique | Jacques Pienaar | 2019-08-17 | 7 | -13/+13
  Switch to C++14 standard method as llvm::make_unique has been removed (https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next integrates.
  PiperOrigin-RevId: 263953918
* Add spirv::GlobalVariableOp that allows module level definition of variables | Mahesh Ravishankar | 2019-08-17 | 1 | -11/+12
  FuncOps in MLIR use explicit capture. So global variables defined in module scope need to have a symbol name and this should be used to refer to the variable within the function. This deviates from SPIR-V spec, which assigns an SSA value to variables at all scopes that can be used to refer to the variable, which requires SPIR-V functions to allow implicit capture. To handle this, add a new op, spirv::GlobalVariableOp, that can be used to define module scope variables.

  Since instructions need an SSA value, a new spirv::AddressOfOp is added to convert a symbol reference to an SSA value for use with other instructions. This also means the spirv::EntryPointOp instruction needs to change to allow initializers to be specified using symbol reference instead of SSA value.

  The current spirv::VariableOp which returns an SSA value (as defined by SPIR-V spec) can still be used to define function-scope variables.
  PiperOrigin-RevId: 263951109
* Extend vector.outerproduct with an optional 3rd argument | Nicolas Vasilache | 2019-08-16 | 1 | -47/+43
  This CL adds an optional third argument to the vector.outerproduct instruction. When such a third argument is specified, it is added to the result of the outerproduct and is lowered to FMA intrinsic when the lowering supports it.

  In the future, we can add an attribute on the `vector.outerproduct` instruction to modify the operations for which to emit code (e.g. "+/*", "max/+", "min/+", "log/exp" ...).

  This CL additionally performs minor cleanups in the vector lowering and adds tests to improve coverage.

  This has been independently verified to result in proper fma instructions for haswell as follows.

  Input:
  ```
  func @outerproduct_add(%arg0: vector<17xf32>, %arg1: vector<8xf32>, %arg2: vector<17x8xf32>) -> vector<17x8xf32> {
    %2 = vector.outerproduct %arg0, %arg1, %arg2 : vector<17xf32>, vector<8xf32>
    return %2 : vector<17x8xf32>
  }
  ```

  Command:
  ```
  mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
  ```

  Output:
  ```
  outerproduct_add: # @outerproduct_add
  # %bb.0:
  ...
  vmovaps 112(%rbp), %ymm8
  vbroadcastss %xmm0, %ymm0
  ...
  vbroadcastss 64(%rbp), %ymm15
  vfmadd213ps 144(%rbp), %ymm8, %ymm0 # ymm0 = (ymm8 * ymm0) + mem
  ...
  vfmadd213ps 400(%rbp), %ymm8, %ymm9 # ymm9 = (ymm8 * ymm9) + mem
  ...
  ```
  PiperOrigin-RevId: 263743359
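Illustration (not part of the commit): the semantics of the three-argument form can be sketched as a scalar reference computation, result[i][j] = lhs[i] * rhs[j] + acc[i][j], using `std::fma` for the fused multiply-add. The function name `outerProductAcc` is hypothetical, and this is a reference model only, not the vector lowering.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Reference model of an outer product with an accumulator:
// result[i][j] = lhs[i] * rhs[j] + acc[i][j], computed with a fused
// multiply-add so each element is rounded once.
template <std::size_t M, std::size_t N>
std::array<std::array<float, N>, M>
outerProductAcc(const std::array<float, M> &lhs,
                const std::array<float, N> &rhs,
                const std::array<std::array<float, N>, M> &acc) {
  std::array<std::array<float, N>, M> result{};
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      result[i][j] = std::fma(lhs[i], rhs[j], acc[i][j]);
  return result;
}
```

With lhs = {1, 2}, rhs = {3, 4, 5} and an accumulator filled with 1, the element at row 1, column 2 is 2 * 5 + 1 = 11.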
* Simplify the classes that support SPIR-V conversion. | Mahesh Ravishankar | 2019-08-15 | 2 | -31/+35
  Modify the Type converters to have a SPIRVBasicTypeConverter which only handles conversion from standard types to SPIRV types. Rename SPIRVEntryFnConverter to SPIRVTypeConverter. This contains the SPIRVBasicTypeConverter within it.

  Remove SPIRVFnLowering class and have separate utility methods to lower a function as entry function or a non-entry function. The current setup could end with diamond inheritance that is not very friendly to use. For example, you could define the following Op conversion methods that lower from a dialect "Foo", which results in diamond inheritance.

    template<typename OpTy>
    class FooDialect : public SPIRVOpLowering<OpTy> {...};
    class FooFnLowering : public FooDialect, SPIRVFnLowering {...};

  PiperOrigin-RevId: 263597101
* GenerateCubinAccessors: use LLVM dialect constants | Alex Zinenko | 2019-08-13 | 1 | -70/+48
  The GenerateCubinAccessors was generating functions that fill dynamically-allocated memory with the binary constant of a CUBIN attached as a string attribute to the GPU kernel. This approach was taken to circumvent the missing support for global constants in the LLVM dialect (and MLIR in general). Global constants were recently added to the LLVM dialect. Change the GenerateCubinAccessors pass to emit a global constant array of characters and a function that returns a pointer to the first character in the array.
  PiperOrigin-RevId: 263092052
* Express ownership transfer in PassManager API through std::unique_ptr (NFC) | Mehdi Amini | 2019-08-12 | 7 | -19/+23
  Since raw pointers are always passed around for IR constructs without implying any ownership transfer, it can be error-prone to have implicit ownership transferred the same way. For example, this code can seem harmless:

    Pass *pass = ....
    pm.addPass(pass);
    pm.addPass(pass);
    pm.run(module);

  PiperOrigin-RevId: 263053082
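Illustration (not part of the commit): a minimal sketch of how taking `std::unique_ptr` makes the ownership transfer explicit. The toy `Pass`, `IncrementPass` and `PassManager` classes below are hypothetical and use an `int` as a stand-in for a module; they are not the MLIR classes.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy pass interface operating on an int that stands in for a module.
struct Pass {
  virtual ~Pass() = default;
  virtual void run(int &module) = 0;
};

struct IncrementPass : Pass {
  void run(int &module) override { ++module; }
};

class PassManager {
public:
  // Taking the pass by unique_ptr states in the signature that the
  // manager becomes the sole owner of the pass.
  void addPass(std::unique_ptr<Pass> pass) {
    passes.push_back(std::move(pass));
  }

  void run(int &module) {
    for (auto &pass : passes)
      pass->run(module);
  }

  std::size_t size() const { return passes.size(); }

private:
  std::vector<std::unique_ptr<Pass>> passes;
};
```

With this signature, adding the same pass object twice no longer compiles silently: the caller has to write `pm.addPass(std::move(pass))`, and a second `std::move(pass)` visibly hands over an already moved-from (null) pointer. Registering two passes means creating two objects, for example with two `std::make_unique<IncrementPass>()` calls.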
* Add lowering of vector dialect to LLVM dialect. | Nicolas Vasilache | 2019-08-12 | 4 | -10/+235
  This CL is step 3/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.

  This CL adds support for converting MLIR n-D vector types to (n-1)-D arrays of 1-D LLVM vectors and a conversion VectorToLLVM that lowers the `vector.extractelement` and `vector.outerproduct` instructions to the proper mix of `llvm.vectorshuffle`, `llvm.extractelement` and `llvm.mulf`.

  This has been independently verified to produce proper avx2 code.

  Input:
  ```
  func @vec_1d(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<8xf32> {
    %2 = vector.outerproduct %arg0, %arg1 : vector<4xf32>, vector<8xf32>
    %3 = vector.extractelement %2[0 : i32]: vector<4x8xf32>
    return %3 : vector<8xf32>
  }
  ```

  Command:
  ```
  mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
  ```

  Output:
  ```
  vec_1d: # @vec_1d
  # %bb.0:
  vbroadcastss %xmm0, %ymm0
  vmulps %ymm1, %ymm0, %ymm0
  retq
  ```
  PiperOrigin-RevId: 262895929