path: root/mlir/test
Commit message [Author, Date, Files changed, Lines -/+]
...
* Roll-forward initial liveness analysis including test cases. [Alexander Belyaev, 2019-12-11, 3 files, -0/+234]
  Fix the usage of the map size when appending to the map with [].
  PiperOrigin-RevId: 284985916
* Automated rollback of commit 98fbf41044d3364dbaf18db81b9e8d9520d14761 [Alexander Belyaev, 2019-12-11, 3 files, -234/+0]
  PiperOrigin-RevId: 284979684
* [Linalg] Add tiling for IndexedGenericOp with a region. [Alexander Belyaev, 2019-12-11, 1 file, -0/+102]
  PiperOrigin-RevId: 284949355
* Add initial liveness analysis including test cases. [Marcel Koester, 2019-12-11, 3 files, -0/+234]
  Closes tensorflow/mlir#255
  PiperOrigin-RevId: 284935454
* [VectorOps] Add lowering of vector.insert to LLVM IR [Aart Bik, 2019-12-10, 1 file, -0/+53]
  For example, an insert
    %0 = vector.insert %arg0, %arg1[3 : i32] : f32 into vector<4xf32>
  becomes
    %0 = llvm.mlir.constant(3 : i32) : !llvm.i32
    %1 = llvm.insertelement %arg0, %arg1[%0 : !llvm.i32] : !llvm<"<4 x float>">
  A more elaborate example, inserting an element in a higher dimension vector
    %0 = vector.insert %arg0, %arg1[3 : i32, 7 : i32, 15 : i32] : f32 into vector<4x8x16xf32>
  becomes
    %0 = llvm.extractvalue %arg1[3 : i32, 7 : i32] : !llvm<"[4 x [8 x <16 x float>]]">
    %1 = llvm.mlir.constant(15 : i32) : !llvm.i32
    %2 = llvm.insertelement %arg0, %0[%1 : !llvm.i32] : !llvm<"<16 x float>">
    %3 = llvm.insertvalue %2, %arg1[3 : i32, 7 : i32] : !llvm<"[4 x [8 x <16 x float>]]">
  PiperOrigin-RevId: 284882443
* Add VectorOp transform pattern which splits vector TransferReadOps to target vector unroll size. [Andy Davis, 2019-12-10, 2 files, -1/+66]
  PiperOrigin-RevId: 284880592
* More affine expr simplifications for floordiv and mod [Uday Bondhugula, 2019-12-10, 3 files, -4/+10]
  Add one more simplification for floordiv and mod affine expressions. Examples:
    (2*d0 + 1) floordiv 2 is simplified to d0
    (8*d0 + 4*d1 + d2) floordiv 4 is simplified to 2*d0 + d1 + d2 floordiv 4
  Similarly,
    (4*d1 + 1) mod 2 is simplified to 1
    (2*d0 + 8*d1) mod 8 is simplified to 2*d0 mod 8
  Change getLargestKnownDivisor to return int64_t to be consistent and to avoid casting at call sites (since the return value is used in expressions of int64_t/index type).
  Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#202
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/202 from bondhugula:affine b13fcb2f1c00a39ca5434613a02408e085a80e77
  PiperOrigin-RevId: 284866710
* Fold TestLinalgTilePermutePatterns into TestLinalgTransformPatterns - NFC [Nicolas Vasilache, 2019-12-10, 7 files, -200/+95]
  Centralize all patterns that test Linalg transforms in a single pass.
  PiperOrigin-RevId: 284835938
* [Linalg] Add a Linalg iterator permutation transformation [Jose Ignacio Gomez, 2019-12-10, 2 files, -0/+65]
  This patch closes issue tensorflow/mlir#272.
  We add a standalone iterator permutation transformation to Linalg. This transformation composes a permutation map with the maps in the "indexing_maps" attribute. It also permutes "iterator_types" accordingly.
  Change-Id: I7c1e693b8203aeecc595a7c012e738ca1100c857
  Closes tensorflow/mlir#307
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/307 from tetuante:issue272 f7908d58792f4111119721885e247045104f1131
  PiperOrigin-RevId: 284824102
* Uniformize Vector transforms as patterns on the model of Linalg - NFC [Nicolas Vasilache, 2019-12-10, 5 files, -6/+57]
  This reorganizes the vector transformations to be more easily testable as patterns and more easily composable into fused passes in the future.
  PiperOrigin-RevId: 284817474
* [VectorOps] Add a ShuffleOp to the VectorOps dialect [Aart Bik, 2019-12-09, 3 files, -19/+72]
  For example,
    %0 = vector.shuffle %x, %y [3 : i32, 2 : i32, 1 : i32, 0 : i32] : vector<2xf32>, vector<2xf32>
  yields a vector<4xf32> result with a permutation of the elements of %x and %y.
  PiperOrigin-RevId: 284657191
* [VectorOps] Fix off-by-one error in insert/extract validation [Aart Bik, 2019-12-09, 1 file, -0/+14]
  PiperOrigin-RevId: 284652653
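  For illustration, a minimal sketch of the position range the tightened check should enforce (the vector shape here is assumed, not from the commit):
  ```
  %0 = vector.extract %v[3 : i32] : vector<4xf32>   // valid: 3 is the last position of a size-4 dimension
  // a position of 4 : i32 would be rejected as out of range for vector<4xf32>
  ```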
* [spirv] Add CompositeConstruct operation. [Denis Khalikov, 2019-12-09, 2 files, -1/+58]
  Closes tensorflow/mlir#308
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/308 from denis0x0D:sandbox/composite_construct 9ef7180f77f9374bcd05afc4f9e6c1d2d72d02b7
  PiperOrigin-RevId: 284613617
* [spirv] Add spv.IAdd, spv.ISub, and spv.IMul folders [Lei Zhang, 2019-12-09, 1 file, -0/+155]
  The patterns to be folded away can be commonly generated during lowering to SPIR-V.
  PiperOrigin-RevId: 284604855
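  A sketch of the identity patterns such folders typically eliminate; the exact set is an assumption (the commit does not list them), and %zero/%one stand for spv.constant 0/1:
  ```
  %0 = spv.IAdd %x, %zero : i32   // folds to %x
  %1 = spv.IMul %x, %one : i32    // folds to %x
  %2 = spv.ISub %x, %zero : i32   // folds to %x
  ```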
* ODS: Generate named accessors for raw attributes [Jacques Pienaar, 2019-12-09, 2 files, -14/+22]
  Currently, named accessors are generated for attributes returning a consumer-friendly type. But sometimes the attributes are used while transforming an existing op, and then the returned type has to be converted back into an attribute, or the raw `getAttr` needs to be used. Generate raw named accessors for attributes to reference the raw attributes without having to use the string interface, for better compile-time verification.
  This allows calling `blahAttr()` instead of `getAttr("blah")`.
  Raw here refers to returning the underlying storage attribute.
  PiperOrigin-RevId: 284583426
* Add lowering for module with gpu.kernel_module attribute. [Mahesh Ravishankar, 2019-12-09, 1 file, -0/+1]
  The existing GPU to SPIR-V lowering created a spv.module for every function with the gpu.kernel attribute. A better approach is to lower the module that the function lives in (which has the attribute gpu.kernel_module) to a spv.module operation. This better captures the host-device separation modeled by the GPU dialect and simplifies the lowering as well.
  PiperOrigin-RevId: 284574688
* Unify vector op unrolling transformation. [Andy Davis, 2019-12-09, 1 file, -29/+51]
  Unifies the vector op unrolling transformation by using the same unrolling implementation for contraction and elementwise operations.
  Removes fakefork/join operations, which are no longer needed now that we have the InsertStridedSlice operation.
  PiperOrigin-RevId: 284570784
* Minor spelling tweaks [Kazuaki Ishizaki, 2019-12-09, 7 files, -9/+9]
  Closes tensorflow/mlir#304
  PiperOrigin-RevId: 284568358
* [StructuredOps][Linalg] Add a primitive pattern to rewrite the linalg.generic form of matmul to vector form. [Nicolas Vasilache, 2019-12-09, 2 files, -0/+40]
  This CL uses the newly expanded matcher support to easily detect when a linalg.generic has a multiply-accumulate body. A linalg.generic with such a body is rewritten as a vector contraction.
  This CL additionally limits the rewrite to the case of matrix multiplication on contiguous and statically shaped memrefs for now. Before expanding further, we should harden the infrastructure for expressing custom ops with the structured ops abstraction.
  PiperOrigin-RevId: 284566659
* Add RegionRange for when we need to abstract over different region iteration [Jacques Pienaar, 2019-12-09, 1 file, -1/+1]
  Follows ValueRange in representing a generic abstraction over the different ways to represent a range of Regions. This wrapper is not as complete as ValueRange and only considers the current cases of interest: MutableArrayRef<Region> and ArrayRef<std::unique_ptr<Region>>, as occurs during op construction vs. op region querying.
  Note: ArrayRef<std::unique_ptr<Region>> allows for unset regions, so this range returns a pointer to a Region instead of a Region.
  PiperOrigin-RevId: 284563229
* Post-submit cleanups in RecursiveMatchers [Nicolas Vasilache, 2019-12-09, 2 files, -28/+44]
  This CL addresses leftover cleanups and adds a test mixing RecursiveMatchers and m_Constant that captures properly.
  PiperOrigin-RevId: 284551567
* Add a layer of recursive matchers that compose. [Nicolas Vasilache, 2019-12-08, 3 files, -0/+183]
  This CL adds support for building matchers recursively. The following matchers are provided:
    1. `m_any()` can match any value
    2. `m_val(Value *)` binds to a value and must match it
    3. `RecursivePatternMatcher<OpType, Matchers...>` is an n-arity pattern that matches `OpType` and whose operands must be matched exactly by `Matchers...`.
  This allows building expression templates for patterns, declaratively, in a very natural fashion. For example, pattern `p9` defined as follows:
  ```
  auto mul_of_muladd = m_Op<MulFOp>(m_Op<MulFOp>(), m_Op<AddFOp>());
  auto mul_of_anyadd = m_Op<MulFOp>(m_any(), m_Op<AddFOp>());
  auto p9 = m_Op<MulFOp>(m_Op<MulFOp>(mul_of_muladd, m_Op<MulFOp>()),
                         m_Op<MulFOp>(mul_of_anyadd, mul_of_anyadd));
  ```
  successfully matches `%6` in:
  ```
  %0 = addf %a, %b: f32
  %1 = addf %a, %c: f32 // matched
  %2 = addf %c, %b: f32
  %3 = mulf %a, %2: f32 // matched
  %4 = mulf %3, %1: f32 // matched
  %5 = mulf %4, %4: f32 // matched
  %6 = mulf %5, %5: f32 // matched
  ```
  Note that 0-ary matchers can be used as leaves in place of n-ary matchers. This alleviates the need to pass explicit `m_any()` leaves.
  In the future, we may add extra patterns to specify that operands may be matched in any order.
  PiperOrigin-RevId: 284469446
* Update the builder API to take ValueRange instead of ArrayRef<Value *> [River Riddle, 2019-12-07, 5 files, -11/+10]
  This allows users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.
  PiperOrigin-RevId: 284360710
* Add a flag to the IRPrinter instrumentation to only print after a pass if there is a change to the IR. [River Riddle, 2019-12-06, 1 file, -0/+8]
  This adds an additional filtering mode for printing after a pass that checks to see if the pass actually changed the IR before printing it. This "change" detection is implemented using a SHA1 hash of the current operation and its children.
  PiperOrigin-RevId: 284291089
* Change inferReturnTypes to return LogicalResult and values [Jacques Pienaar, 2019-12-06, 3 files, -13/+15]
  Previously the error case was signaled with a sentinel value, which was bad. Also make the one `build` invoke the other `build` to reuse verification there. And follow up on the suggestion to use formatv, which I missed during the previous review.
  PiperOrigin-RevId: 284265762
* [VecOps] Rename vector.[insert|extract]element to just vector.[insert|extract] [Aart Bik, 2019-12-06, 3 files, -40/+40]
  Since these operations lower to [insert|extract][element|value] at the LLVM dialect level, neither "element" nor "value" would correctly reflect the meaning.
  PiperOrigin-RevId: 284240727
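  A before/after sketch of the rename (operands assumed):
  ```
  // Before:
  %0 = vector.extractelement %v[3 : i32] : vector<4xf32>
  // After:
  %0 = vector.extract %v[3 : i32] : vector<4xf32>
  ```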
* [VectorOps] Add lowering of vector.broadcast to LLVM IR [Aart Bik, 2019-12-06, 2 files, -3/+211]
  For example, a scalar broadcast
    %0 = vector.broadcast %x : f32 to vector<2xf32>
    return %0 : vector<2xf32>
  which expands scalar x into the vector [x, x], lowers to the following LLVM IR dialect code, implementing the duplication over the leading dimension:
    %0 = llvm.mlir.undef : !llvm<"<2 x float>">
    %1 = llvm.mlir.constant(0 : index) : !llvm.i64
    %2 = llvm.insertelement %x, %0[%1 : !llvm.i64] : !llvm<"<2 x float>">
    %3 = llvm.shufflevector %2, %0 [0 : i32, 0 : i32] : !llvm<"<2 x float>">, !llvm<"<2 x float>">
    return %3 : vector<2xf32>
  In the trailing dimensions, the operand is simply "passed through", unless a more elaborate "stretch" is required. For example
    %0 = vector.broadcast %arg0 : vector<1xf32> to vector<4xf32>
    return %0 : vector<4xf32>
  becomes
    %0 = llvm.mlir.undef : !llvm<"<4 x float>">
    %1 = llvm.mlir.constant(0 : index) : !llvm.i64
    %2 = llvm.extractelement %arg0[%1 : !llvm.i64] : !llvm<"<1 x float>">
    %3 = llvm.mlir.constant(0 : index) : !llvm.i64
    %4 = llvm.insertelement %2, %0[%3 : !llvm.i64] : !llvm<"<4 x float>">
    %5 = llvm.shufflevector %4, %0 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : !llvm<"<4 x float>">, !llvm<"<4 x float>">
    llvm.return %5 : !llvm<"<4 x float>">
  PiperOrigin-RevId: 284219926
* Generate builder for ops that use InferTypeOpInterface trait in ODS [Jacques Pienaar, 2019-12-06, 2 files, -5/+27]
  For ops with the infer type op interface defined, generate a build() variant that calls the inference method. This is an intermediate step to removing the special casing of SameOperandsAndResultType & FirstAttrDerivedResultType. After that would be generating the inference code, with the initial focus on shaped container types. In between I plan to refactor these a bit to reuse generated paths. The intention would not be to add the type inference trait in multiple places, but rather to take advantage of the current modelling in ODS where possible to emit it instead.
  Switch the `inferReturnTypes` method to be static.
  Skipping ops with regions here, as I don't like the Region vs unique_ptr<Region> difference at the moment and I want the infer return type trait to be useful for verification too. So instead, just skip them for now to avoid churn.
  PiperOrigin-RevId: 284217913
* Add conversions of GPU func with memory attributions to LLVM/NVVM [Alex Zinenko, 2019-12-06, 1 file, -0/+145]
  GPU functions use memory attributions, a combination of Op attributes and region arguments, to specify function-wide buffers placed in workgroup or private memory spaces. Introduce a lowering pattern for GPU functions to be converted to LLVM functions taking into account memory attributions. Workgroup attributions get transformed into module-level globals with unique names derived from function names. Private attributions get converted into llvm.allocas inside the function body. In both cases, we inject at the beginning of the function the IR that obtains the raw pointer to the data and populates a MemRef descriptor based on the MemRef type of buffer, making attributions compose with the rest of the MemRef lowering and transparent for use with std.load and std.store.
  While using raw pointers instead of descriptors might have been more efficient, it is better implemented as a canonicalization or a separate transformation so that non-attribution memrefs could also benefit from it.
  PiperOrigin-RevId: 284208396
* Use regex to fix failure when stats are disabled. [River Riddle, 2019-12-06, 1 file, -3/+3]
  It would be nice if we could detect if stats were enabled or not and use 'Requires', but this isn't possible to do at configure time.
  Fixes tensorflow/mlir#296
  PiperOrigin-RevId: 284200271
* Unroll vector masks along with their associated vector arguments. [Andy Davis, 2019-12-06, 3 files, -26/+36]
  Updates vector ContractionOp to use proper vector masks (produced by CreateMaskOp/ConstantMaskOp).
  Leverages the following canonicalizations in the unrolling unit test: CreateMaskOp -> ConstantMaskOp, StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp.
  Removes IndexTupleOp (no longer needed now that we have vector mask ops).
  Updates all unit tests.
  PiperOrigin-RevId: 284182168
* DimOp folding for alloc/view dynamic dimensions [Uday Bondhugula, 2019-12-06, 3 files, -29/+74]
  Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#253
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/253 from bondhugula:dimop a4b464f24ae63fd259114558d87e11b8ee4dae86
  PiperOrigin-RevId: 284169689
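  A minimal sketch of the kind of fold this enables (names assumed): the dynamic extent of an alloc'ed memref is recovered from the alloc's operand:
  ```
  %a = alloc(%n) : memref<?x8xf32>
  %d = dim %a, 0 : memref<?x8xf32>   // folds to %n
  ```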
* minor spelling tweaks [Kazuaki Ishizaki, 2019-12-06, 1 file, -2/+2]
  Closes tensorflow/mlir#290
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/290 from kiszk:spelling_tweaks_201912 9d9afd16a723dd65754a04698b3976f150a6054a
  PiperOrigin-RevId: 284169681
* LLVM::AddressOfOp: properly take into account the address space [Alex Zinenko, 2019-12-06, 1 file, -0/+16]
  The AddressOf operation in the LLVM dialect returns a pointer to a global variable. The latter may be in a non-default address space, as indicated by the "addr_space" attribute. Check that the address space of the pointer returned by AddressOfOp matches that of the referenced GlobalOp. Update the AddressOfOp builder to respect this constraint.
  PiperOrigin-RevId: 284138860
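  A sketch of the constraint; the global and the exact attribute/type spellings here are assumptions, not taken from the commit:
  ```
  llvm.mlir.global internal @g(42 : i32) {addr_space = 3 : i32} : !llvm.i32
  llvm.func @f() {
    // The result pointer type must carry the global's address space:
    %0 = llvm.mlir.addressof @g : !llvm<"i32 addrspace(3)*">
    llvm.return
  }
  ```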
* Add include path to the TestDialect to fix broken build. [River Riddle, 2019-12-05, 1 file, -0/+2]
  PiperOrigin-RevId: 284067891
* [Linalg] Add permutation information to tiling [Jose Ignacio Gomez, 2019-12-05, 5 files, -0/+197]
  This patch closes issue tensorflow/mlir#271. It adds an optional permutation map to declarative tiling transformations. The map is expressed as a list of integers.
  Closes tensorflow/mlir#288
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/288 from tetuante:issue271 2df2938d6a1f01b3bc404ded08dea2dd1e10b588
  PiperOrigin-RevId: 284064151
* Add UnrankedMemRef Type [nmostafa, 2019-12-05, 9 files, -19/+159]
  Closes tensorflow/mlir#261
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/261 from nmostafa:nmostafa/unranked 96b6e918f6ed64496f7573b2db33c0b02658ca45
  PiperOrigin-RevId: 284037040
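  A minimal sketch of the type in use, assuming memref_cast accepts the unranked form (the function name is illustrative): a ranked memref is cast to memref<*xf32>, erasing the shape:
  ```
  func @forward(%arg0: memref<4x4xf32>) {
    %0 = memref_cast %arg0 : memref<4x4xf32> to memref<*xf32>
    return
  }
  ```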
* [spirv] Add CompositeInsertOp operation [Denis Khalikov, 2019-12-05, 3 files, -144/+188]
  A CompositeInsert operation makes a copy of a composite object, while modifying one part of it.
  Closes tensorflow/mlir#292
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/292 from denis0x0D:sandbox/composite_insert 2200962b9057bda53cd2f2866b461e2797196380
  PiperOrigin-RevId: 284036551
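  A sketch of the op (operand names and composite type assumed):
  ```
  // Copy %arr, replacing element 1 with %val:
  %0 = spv.CompositeInsert %val, %arr[1 : i32] : f32 into !spv.array<4 x f32>
  ```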
* Add support for instance specific pass statistics. [River Riddle, 2019-12-05, 2 files, -0/+40]
  Statistics are a way to keep track of what the compiler is doing and how effective various optimizations are. It is useful to see what optimizations are contributing to making a particular program run faster. Pass-instance specific statistics take this even further, as you can see the effect of placing a particular pass at specific places within the pass pipeline, e.g. they could help answer questions like "what happens if I run CSE again here?".
  Statistics can be added to a pass by simply adding members of type 'Pass::Statistic'. This class takes as constructor arguments: the parent pass pointer, a name, and a description. Statistics can be dumped by the pass manager in a similar manner to how pass timing information is dumped, i.e. via PassManager::enableStatistics programmatically, or via the -pass-statistics and -pass-statistics-display command line pass manager options. Below is an example:
    struct MyPass : public OperationPass<MyPass> {
      Statistic testStat{this, "testStat", "A test statistic"};
      void runOnOperation() {
        ...
        ++testStat;
        ...
      }
    };
    $ mlir-opt -pass-pipeline='func(my-pass,my-pass)' foo.mlir -pass-statistics
  Pipeline Display:
    ===-------------------------------------------------------------------------===
                           ... Pass statistics report ...
    ===-------------------------------------------------------------------------===
    'func' Pipeline
      MyPass
        (S) 15 testStat - A test statistic
      MyPass
        (S)  6 testStat - A test statistic
  List Display:
    ===-------------------------------------------------------------------------===
                           ... Pass statistics report ...
    ===-------------------------------------------------------------------------===
    MyPass
      (S) 21 testStat - A test statistic
  PiperOrigin-RevId: 284022014
* Allow specification of the workgroup size for GPUToSPIRV lowering. [Mahesh Ravishankar, 2019-12-05, 1 file, -1/+2]
  The SPIR-V/Vulkan spec requires the workgroup size to be specified with the spv.ExecutionMode operation. This was hard-wired to a particular value. It is now configurable by clients of the pass or of the patterns that implement the lowering from GPU to SPIR-V.
  PiperOrigin-RevId: 284017482
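  A sketch of the op the lowering now emits with a client-chosen size (the symbol and the sizes are made up):
  ```
  spv.ExecutionMode @kernel "LocalSize", 32, 4, 1
  ```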
* Add spv.AtomicCompareExchangeWeak [Lei Zhang, 2019-12-05, 2 files, -0/+45]
  PiperOrigin-RevId: 283997917
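  A sketch of the op's expected form; the operands, scope, and memory semantics here are illustrative assumptions:
  ```
  %0 = spv.AtomicCompareExchangeWeak "Workgroup" "AcquireRelease" "Acquire" %ptr, %value, %comparator : !spv.ptr<i32, Workgroup>
  ```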
* [spirv] Fix nested loop (de)serialization [Lei Zhang, 2019-12-05, 1 file, -0/+83]
  For serialization, when we have nested ops, the inner loop will create multiple SPIR-V blocks. If the outer loop has block arguments (which correspond to OpPhi instructions), we defer the handling of the OpPhi's parent block until we have serialized all blocks and can then fix it up with the result <id>. These two cases happening together were generating an invalid SPIR-V blob, because we previously assumed the parent block to be the block containing the terminator. That is not true anymore when the block contains structured control flow ops; if that happens, it should be fixed to use the structured control flow op's merge block.
  For deserialization, we record a map from header blocks to their corresponding merge and continue blocks during the initial deserialization and then use the info to construct spv.selection/spv.loop. The existing implementation would also fall apart when we have nested loops: we clone all blocks for the outer loop, including the ones for the inner loop, into the spv.loop's region, so the map of header blocks' merge info needs to be updated; otherwise we are operating on already-deleted blocks.
  PiperOrigin-RevId: 283949230
* Move ModuleManager functionality into mlir::SymbolTable. [Tres Popp, 2019-12-05, 5 files, -2/+39]
  Note for broken code, the following transformations occurred:
    ModuleManager::insert(Block::iterator, Operation*) -> SymbolTable::insert(Operation*, Block::iterator)
    ModuleManager::lookupSymbol -> SymbolTable::lookup
    ModuleManager::getModule() -> SymbolTable::getOp()
    ModuleManager::getContext() -> SymbolTable::getOp()->getContext()
    ModuleManager::* -> SymbolTable::*
  PiperOrigin-RevId: 283944635
* Add a CL option to Standard to LLVM lowering to use alloca instead of malloc/free. [Nicolas Vasilache, 2019-12-04, 1 file, -0/+6]
  In the future, a more configurable malloc and free interface should be used and exposed via extra parameters to `createLowerToLLVMPass`. Until requirements are gathered, a simple CL flag allows generating code that runs successfully on hardware that cannot use the stdlib.
  PiperOrigin-RevId: 283833424
* Add canonicalization patterns for vector CreateMaskOp and StridedSliceOp to be used in the unroll vector op transformation. [Andy Davis, 2019-12-04, 3 files, -0/+129]
  Adds a ConstantMaskOp to the vector ops dialect.
  Adds the following canonicalization patterns:
    CreateMaskOp -> ConstantMaskOp
    StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp
  PiperOrigin-RevId: 283816752
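  A before/after sketch of the first pattern (constants assumed): once the mask sizes are constants, the dynamic op can become a static one.
  ```
  // Before canonicalization:
  %c2 = constant 2 : index
  %c3 = constant 3 : index
  %0 = vector.create_mask %c3, %c2 : vector<4x3xi1>
  // After canonicalization:
  %0 = vector.constant_mask [3, 2] : vector<4x3xi1>
  ```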
* Drop MaterializeVectorTransfers in favor of simpler declarative unrolling [Nicolas Vasilache, 2019-12-04, 5 files, -387/+25]
  Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.
  PiperOrigin-RevId: 283806880
* Print out large elementsattr's such that they are parseable. [Sean Silva, 2019-12-04, 1 file, -2/+10]
  I found that, when running crash reproducers, the elided elementsattr's would prevent parsing the IR repro, and I found myself manually replacing the "..." with some valid IR. With this change, we now print elided attrs as `opaque<"", "0xDEADBEEF">` to clearly delineate them as being elided while still being parseable.
  PiperOrigin-RevId: 283781806
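  A sketch of what the new output looks like for a large constant (the shape and surrounding op are made up; the opaque payload is the marker quoted in the commit):
  ```
  %cst = constant opaque<"", "0xDEADBEEF"> : tensor<512x512xf32>
  ```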
* [spirv] Adding sqrt op in the GLSL extension. [Scott Todd, 2019-12-04, 2 files, -1/+39]
  PiperOrigin-RevId: 283769736
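  The op follows the other GLSL extended-instruction ops, e.g. (operands assumed):
  ```
  %0 = spv.GLSL.Sqrt %x : f32
  %1 = spv.GLSL.Sqrt %v : vector<4xf32>
  ```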
* Loop coalescing: fix pointer chasing in use-chain traversal [Alex Zinenko, 2019-12-04, 1 file, -0/+32]
  In the replaceAllUsesExcept utility function called from loop coalescing, the iteration over the use-chain was incorrect. The use list nodes (IROperands) have next/prev links, and bluntly resetting the use would make the loop continue on uses of the value that was replaced instead of the original one. As a result, it could miss the existing uses and update the wrong ones. Make sure we increment the iterator before updating the use in the loop body.
  Reported-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#291.
  PiperOrigin-RevId: 283754195
* Added new FAbs, FCeil, Cos, Neg, Sign, Tanh operations. [Julian Gross, 2019-12-04, 1 file, -1/+76]
  Closes tensorflow/mlir#251
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/251 from dfki-jugr:new_ops 0398997bf9953016898f873068e22916a062eb2b
  PiperOrigin-RevId: 283750699