| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- turn canonicalizeMapAndOperands into a template that works on both
sets and maps, and use it to introduce a utility to canonicalize an
affine integer set and its operands
- add pattern to canonicalize affine if op's.
- rename IntegerSet::getNumOperands -> IntegerSet::getNumInputs to be
consistent with AffineMap
- add missing accessors for IntegerSet
Doesn't need extensive testing since canonicalizeSetAndOperands just
reuses canonicalizeMapAndOperands' logic, and the latter is tested on
affine.apply map + operands; the new method works the same way on an
integer set + operands of an affine if op for example.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#112
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/112 from bondhugula:set-canonicalize eff72f23250b96fa7d9f5caff3877440f5de2cec
PiperOrigin-RevId: 267532876
|
|
|
|
|
|
| |
This commit defines an initial implementation of the DialectInlinerInterface for the AffineOps dialect. This change allows for affine operations to be inlined into any region that is not an affine region. Inlining into affine regions requires special handling for dimension/symbol identifiers that will be added in followups.
PiperOrigin-RevId: 267467078
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPIR-V can explicitly declare structured control-flow constructs using merge
instructions. These explicitly declare a header block before the control
flow diverges and a merge block where control flow subsequently converges.
These blocks delimit constructs that must nest, and can only be entered
and exited in structured ways.
Instead of having a `spv.LoopMerge` op to directly model loop merge
instruction for indicating the merge and continue target, we use regions
to delimit the boundary of the loop: the merge target is the next op
following the `spv.loop` op and the continue target is the block that
has a back-edge pointing to the entry block inside the `spv.loop`'s region.
This way it's easier to discover all blocks belonging to a construct and
it plays nicer with the MLIR system.
Updated the SPIR-V.md doc.
PiperOrigin-RevId: 267431010
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This defines a set of initial utilities for inlining a region(or a FuncOp), and defines a simple inliner pass for testing purposes.
A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just premilinary and should be changed/added to/modified as necessary:
* isLegalToInline
- Determine if a region can be inlined into one of this dialect, *or* if an operation of this dialect can be inlined into a given region.
* shouldAnalyzeRecursively
- Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas.
* handleTerminator
- Process a terminator that has been inlined.
This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary.
PiperOrigin-RevId: 267426759
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The refactoring of ExecutionEngine dropped the usage of the irTransform function used to pass -O3 and other options to LLVM. As a consequence, the proper optimizations do not kick in in LLMV-land.
This CL makes use of the transform function and allows producing avx512 instructions, on an internal example, when using:
`mlir-cpu-runner -dump-object-file=1 -object-filename=foo.o` combined with `objdump -D foo.o`.
Assembly produced resembles:
```
2b2e: 62 72 7d 48 18 04 0e vbroadcastss (%rsi,%rcx,1),%zmm8
2b35: 62 71 7c 48 28 ce vmovaps %zmm6,%zmm9
2b3b: 62 72 3d 48 a8 c9 vfmadd213ps %zmm1,%zmm8,%zmm9
2b41: 62 f1 7c 48 28 cf vmovaps %zmm7,%zmm1
2b47: 62 f2 3d 48 a8 c8 vfmadd213ps %zmm0,%zmm8,%zmm1
2b4d: 62 f2 7d 48 18 44 0e vbroadcastss 0x4(%rsi,%rcx,1),%zmm0
2b54: 01
2b55: 62 71 7c 48 28 c6 vmovaps %zmm6,%zmm8
2b5b: 62 72 7d 48 a8 c3 vfmadd213ps %zmm3,%zmm0,%zmm8
2b61: 62 f1 7c 48 28 df vmovaps %zmm7,%zmm3
2b67: 62 f2 7d 48 a8 da vfmadd213ps %zmm2,%zmm0,%zmm3
2b6d: 62 f2 7d 48 18 44 0e vbroadcastss 0x8(%rsi,%rcx,1),%zmm0
2b74: 02
2b75: 62 f2 7d 48 a8 f5 vfmadd213ps %zmm5,%zmm0,%zmm6
2b7b: 62 f2 7d 48 a8 fc vfmadd213ps %zmm4,%zmm0,%zmm7
```
etc.
Fixes tensorflow/mlir#120
PiperOrigin-RevId: 267281097
|
|
|
|
| |
PiperOrigin-RevId: 267206460
|
|
|
|
|
|
| |
This function is only called from the verifier.
PiperOrigin-RevId: 267145495
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for replaceMemRefUsesWith / pipeline-data-transfer
- address remaining comments from PR tensorflow/mlir#87 for better test coverage for
pipeline-data-transfer/replaceAllMemRefUsesWith
- remove dead tag allocs the same way they are removed for the replaced buffers
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#106
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa
PiperOrigin-RevId: 267144774
|
|
|
|
|
|
|
| |
It is generally beneficial to pass less arguments to a kernel, so cloning constants
into the kernel is beneficial.
PiperOrigin-RevId: 267139084
|
|
|
|
| |
PiperOrigin-RevId: 267121729
|
|
|
|
|
|
| |
This CL adds support for proper cloning of Linalg ops that have regions (i.e. the generic linalg op). This is used to properly implement tiling and fusion for such ops. Adequate tests are added.
PiperOrigin-RevId: 267027176
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- introduce utility to convert memrefs with non-identity layout maps to
ones with identity layout maps: convert the type and rewrite/remap all
its uses
- add this utility to -simplify-affine-structures pass for testing
purposes
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#104
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d
PiperOrigin-RevId: 266985317
|
|
|
|
|
|
|
|
|
| |
This will allow us to use MLIR's folding infrastructure to deduplicate
SPIR-V constants.
This CL also changed isValidSPIRVType in SPIRVDialect to a static method.
PiperOrigin-RevId: 266984403
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- the [begin, end) range identified for copying could end in between the
block, which makes hoisting invalid in some cases. Change the range
identification to always end with end of block.
- add test case to exercise these (with fast mem capacity set to minimal so
that single element memref buffers are generated at the innermost loop)
- the location of begin/end of the block range for data copying was
being confused with the insert points for copy in and copy out code.
In cases, where we choose to hoist transfers, these are separate.
- when copy loops are single iteration ones, promote their bodies at
the end of the pass.
- change default fast mem space to 1 (setting it to zero made it
generate DMA op's that won't verify in the default case - since the
DMA ops have a check for src/dest memref spaces being different).
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Co-Authored-By: Mehdi Amini <joker.eph@gmail.com>
Closes tensorflow/mlir#88
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42
PiperOrigin-RevId: 266980911
|
|
|
|
|
|
|
|
| |
The assert assumed that the escaped character could not appear at the end of the string.
Fixes tensorflow/mlir#117
PiperOrigin-RevId: 266975471
|
|
|
|
|
|
|
|
|
|
| |
Remove unused variables and attributes from BaseViewConversionHelper
on mlir/lib/Dialect/Linalg/Transforms/LowerToLLVMDialect.cpp
Closes tensorflow/mlir#116
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/116 from alexst07:fix-warnings 5f638e4677492cf71a9cc040eeb6b57427d32e06
PiperOrigin-RevId: 266972082
|
|
|
|
|
|
|
|
|
|
| |
Some of the operations in the LLVM dialect are required to model the LLVM IR in
MLIR, for example "constant" operations are needed to declare a constant value
since MLIR, unlike LLVM, does not support immediate values as operands. To
avoid confusion with actual LLVM operations, we prefix such axuiliary
operations with "mlir.".
PiperOrigin-RevId: 266942838
|
|
|
|
| |
PiperOrigin-RevId: 266863802
|
|
|
|
|
|
| |
The SelectOp models the semantics of OpSelect from SPIR-V spec.
PiperOrigin-RevId: 266849559
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change generalizes the structure of the pass manager to allow arbitrary nesting pass managers for other operations, at any level. The only user visible change to existing code is the fact that a PassManager must now provide an MLIRContext on construction. A new class `OpPassManager` has been added that represents a pass manager on a specific operation type. `PassManager` will remain the top-level entry point into the pipeline, with OpPassManagers being nested underneath. OpPassManagers will still be implicitly nested if the operation type on the pass differs from the pass manager. To explicitly build a pipeline, the 'nest' methods on OpPassManager may be used:
// Pass manager for the top-level module.
PassManager pm(ctx);
// Nest a pipeline operating on FuncOp.
OpPassManager &fpm = pm.nest<FuncOp>();
fpm.addPass(...);
// Nest a pipeline under the FuncOp pipeline that operates on spirv::ModuleOp
OpPassManager &spvModulePM = pm.nest<spirv::ModuleOp>();
// Nest a pipeline on FuncOps inside of the spirv::ModuleOp.
OpPassManager &spvFuncPM = spvModulePM.nest<FuncOp>();
To help accomplish this a new general OperationPass is added that operates on opaque Operations. This pass can be inserted in a pass manager of any type to operate on any operation opaquely. An example of this opaque OperationPass is a VerifierPass, that simply runs the verifier opaquely on the current operation.
/// Pass to verify an operation and signal failure if necessary.
class VerifierPass : public OperationPass<VerifierPass> {
void runOnOperation() override {
Operation *op = getOperation();
if (failed(verify(op)))
signalPassFailure();
markAllAnalysesPreserved();
}
};
PiperOrigin-RevId: 266840344
|
|
|
|
|
|
| |
This interface will allow for providing hooks to interrop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region, this allows for materializing into a lower level ancestor region if it is more profitable/correct.
PiperOrigin-RevId: 266702972
|
|
|
|
|
|
|
|
|
| |
pointer (NFC)
This is a convenient utility around the existing `getUsedValuesDefinedAbove()`
that take two regions.
PiperOrigin-RevId: 266686854
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- the list of passes run by mlir-cpu-runner included -lower-affine and
-lower-to-llvm but was missing -lower-to-cfg (because -lower-affine at
some point used to lower straight to CFG); add -lower-to-cfg in
between. IR with affine ops can now be run by mlir-cpu-runner.
- update -lower-to-cfg to be consistent with other passes (create*Pass methods
were changed to return unique ptrs, but -lower-to-cfg appears to have been
missed).
- mlir-cpu-runner was unable to parse custom form of affine op's - fix
link options
- drop unnecessary run options from test/mlir-cpu-runner/simple.mlir
(none of the test cases had loops)
- -convert-to-llvmir was changed to -lower-to-llvm at some point, but the
create pass method name wasn't updated (this pass converts/lowers to LLVM
dialect as opposed to LLVM IR). Fix this.
(If we prefer "convert", the cmd-line options could be changed to
"-convert-to-llvm/cfg" then.)
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#115
PiperOrigin-RevId: 266666909
|
|
|
|
|
|
| |
AffineForOp themselves are pure and can be removed if there are no internal operations.
PiperOrigin-RevId: 266481293
|
|
|
|
|
|
| |
This pass class generalizes the current functionality between FunctionPass and ModulePass, and allows for operating on any operation type. The pass manager currently only supports OpPasses operating on FuncOp and ModuleOp, but this restriction will be relaxed in follow-up changes. A utility class OpPassBase<OpT> allows for generically referring to operation specific passes: e.g. FunctionPassBase == OpPassBase<FuncOp>.
PiperOrigin-RevId: 266442239
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces the bits to be able to dump JIT-compile
objects to external files by passing an object cache to OrcJit.
The new functionality is tested in mlir-cpu-runner under the flag
`dump-object-file`.
Closes tensorflow/mlir#95
PiperOrigin-RevId: 266439265
|
|
|
|
|
|
| |
Similar to enum, added a generator for structured data. This provide Dictionary that stores a fixed set of values and guarantees the values are valid. It is intended to store a fixed number of values by a given name.
PiperOrigin-RevId: 266437460
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is done by providing a walk callback that returns a WalkResult. This result is either `advance` or `interrupt`. `advance` means that the walk should continue, whereas `interrupt` signals that the walk should stop immediately. An example is shown below:
auto result = op->walk([](Operation *op) {
if (some_invariant)
return WalkResult::interrupt();
return WalkResult::advance();
});
if (result.wasInterrupted())
...;
PiperOrigin-RevId: 266436700
|
|
|
|
|
|
|
|
| |
This CL just covers the op definition, its parsing, printing,
and verification. (De)serialization is to be implemented
in a subsequent CL.
PiperOrigin-RevId: 266431077
|
|
|
|
|
|
| |
This avoids potential memory leaks from misuse of the API.
PiperOrigin-RevId: 266305750
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change refactors and cleans up the implementation of the operation walk methods. After this refactoring is that the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:
op->walk<AffineForOp>([](AffineForOp op) { ... });
is now accomplished via:
op->walk([](AffineForOp op) { ... });
PiperOrigin-RevId: 266209552
|
|
|
|
| |
PiperOrigin-RevId: 266198057
|
|
|
|
|
|
| |
We should consider both signed and narrow_range cases.
PiperOrigin-RevId: 266167366
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- extend canonicalizeMapAndOperands to propagate constant operands into
the map's expressions (and thus drop those operands).
- canonicalizeMapAndOperands previously only dropped duplicate and
unused operands; however, operands that were constants were
retained.
This change makes IR maps/expressions generated by various
utilities/passes even simpler; also makes some of the test checks more
accurate and simpler -- for eg., 0' instead of symbol(%{{.*}}).
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#107
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/107 from bondhugula:canonicalize-maps c889a51486d14fbf7db489f224f881e7e1ff7d72
PiperOrigin-RevId: 266085289
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- fix operand mapping while cloning sub-blocks to jam - was incorrect
for imperfect nests where def/use was across sub-blocks
- strengthen/generalize the first test case to cover the previously
missed scenario
- clean up the other cases while on this.
Previously, unroll-jamming the following nest
```
affine.for %arg0 = 0 to 2048 {
%0 = alloc() : memref<512x10xf32>
affine.for %arg1 = 0 to 10 {
%1 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
}
dealloc %0 : memref<512x10xf32>
}
```
would yield
```
%0 = alloc() : memref<512x10xf32>
%1 = affine.apply #map0(%arg0)
%2 = alloc() : memref<512x10xf32>
affine.for %arg1 = 0 to 10 {
%4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
%5 = affine.apply #map0(%arg0)
%6 = affine.load %0[%5, %arg1] : memref<512x10xf32>
}
dealloc %0 : memref<512x10xf32>
%3 = affine.apply #map0(%arg0)
dealloc %0 : memref<512x10xf32>
```
instead of
```
module {
affine.for %arg0 = 0 to 2048 step 2 {
%0 = alloc() : memref<512x10xf32>
%1 = affine.apply #map0(%arg0)
%2 = alloc() : memref<512x10xf32>
affine.for %arg1 = 0 to 10 {
%4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
%5 = affine.apply #map0(%arg0)
%6 = affine.load %2[%5, %arg1] : memref<512x10xf32>
}
dealloc %0 : memref<512x10xf32>
%3 = affine.apply #map0(%arg0)
dealloc %2 : memref<512x10xf32>
}
```
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#98
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/98 from bondhugula:ujam ddbc853f69b5608b3e8ff9b5ac1f6a5a0bb315a4
PiperOrigin-RevId: 266073460
|
|
|
|
| |
PiperOrigin-RevId: 266073204
|
|
|
|
| |
PiperOrigin-RevId: 266022088
|
|
|
|
|
|
|
|
| |
nesting.
The pass manager is moving towards being able to run on operations at arbitrary nesting. An operation may have both parent and child operations, and the AnalysisManager must be able to handle this generalization. The AnalysisManager class now contains generic 'getCachedParentAnalysis' and 'getChildAnalysis/getCachedChildAnalysis' functions to query analyses on parent/child operations. This removes the hard coded nesting relationship between Module/Function.
PiperOrigin-RevId: 266003636
|
|
|
|
|
|
|
|
|
|
|
| |
Tweak to the pretty type parser to recognize that `->` is a special token that
shouldn't be split into two characters. This change allows dialect
types to wrap function types as in `!my.ptr_type<(i32) -> i32>`.
Closes tensorflow/mlir#105
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/105 from schweitzpgi:parse-arrow 8b2d768053f419daae5a1a864121a44c4319acbe
PiperOrigin-RevId: 265986240
|
|
|
|
|
|
| |
This change adds definitions, parsing and verification for both ops.
PiperOrigin-RevId: 265954051
|
|
|
|
|
|
|
|
|
| |
Instead of lowering the program in two steps (Standard->LLVM followed
by GPU->NVVM), leading to invalid IR inbetween, the runner now uses
one pattern based rewrite step to go directly from Standard+GPU to
LLVM+NVVM.
PiperOrigin-RevId: 265861934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor replaceAllMemRefUsesWith to split it into two methods: the new
method does the replacement on a single op, and is used by the existing
one.
- make the methods return LogicalResult instead of bool
- Earlier, when replacement failed (due to non-deferencing uses of the
memref), the set of ops that had already been processed would have
been replaced leaving the IR in an inconsistent state. Now, a
pass is made over all ops to first check for non-deferencing
uses, and then replacement is performed. No test cases were affected
because all clients of this method were first checking for
non-deferencing uses before calling this method (for other reasons).
This isn't true for a use case in another upcoming PR (scalar
replacement); clients can now bail out with consistent IR on failure
of replaceAllMemRefUsesWith. Add test case.
- multiple deferencing uses of the same memref in a single op is
possible (we have no such use cases/scenarios), and this has always
remained unsupported. Add an assertion for this.
- minor fix to another test pipeline-data-transfer case.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#87
PiperOrigin-RevId: 265808183
|
|
|
|
|
|
| |
block-wide reduce.
PiperOrigin-RevId: 265720077
|
|
|
|
|
|
|
|
| |
Each basic block in SPIR-V must start with an OpLabel instruction.
We don't support control flow yet, so this CL just makes sure that
the entry block follows this rule and is valid.
PiperOrigin-RevId: 265718841
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support a conversion of a simple load-compute-store kernel from GPU
dialect to SPIR-V dialect, the conversion of operations like
"gpu.block_dim", "gpu.thread_id" which allow threads to get the launch
conversion is needed. In SPIR-V these are specified as global
variables with builin attributes. This CL adds support to specify
builtin variables in SPIR-V conversion framework. This is used to
convert the relevant operations from GPU dialect to SPIR-V dialect.
Also add support for conversion of load/store operation in Standard
dialect to SPIR-V dialect.
To simplify the conversion add a method to build a spv.AccessChain
operation that automatically determines the return type based on the
base pointer type and the indices provided.
PiperOrigin-RevId: 265718525
|
|
|
|
|
|
|
|
| |
Add Block decoration for top-level spv.struct.
Closes tensorflow/mlir#102
PiperOrigin-RevId: 265716241
|
|
|
|
|
|
| |
The context can easily be recovered from the Location in these situations.
PiperOrigin-RevId: 265578574
|
|
|
|
|
|
| |
The context can be recovered by other means in these methods and doesn't need to be passed explicitly.
PiperOrigin-RevId: 265532956
|
|
|
|
|
|
| |
This fixes a bug when folding ops with inner ops and inner ops are still being visited.
PiperOrigin-RevId: 265475780
|
|
|
|
| |
PiperOrigin-RevId: 265190168
|