| Commit message | Author | Age | Files | Lines |
| |
Add support for translating the recently introduced llvm.global operations to
global variables in the LLVM IR proper.
PiperOrigin-RevId: 262564700
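As a rough sketch (op syntax approximated for this point in time; linkage details were not yet modeled), the translation pairs an MLIR global with an LLVM IR global:
```
// MLIR LLVM dialect input (approximate syntax):
llvm.global constant @cst(42 : i32) : !llvm.i32
// expected LLVM IR output, roughly:
//   @cst = constant i32 42
```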
| |
This CL introduces the ability to generate the external library name for Linalg operations.
The problem is that neither MLIR nor C supports overloading, and we want a simplified form of name mangling that is still reasonable to read.
This CL derives the name of the external call that Linalg expects from the operation name and the types of its arguments.
The interface library names are updated and new use cases are added for FillOp.
PiperOrigin-RevId: 262556833
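As a hypothetical illustration of the scheme (op and name shapes approximated, not taken from the CL), a linalg.dot on two 1-D views and a 0-D view:
```
linalg.dot(%A, %B, %C) : !linalg.view<?xf32>, !linalg.view<?xf32>, !linalg.view<f32>
// would map to an external library function with a mangled name such as:
//   linalg_dot_viewxf32_viewxf32_viewf32
```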
| |
This CL adds the ability for linalg.view to act as a bitcast operation.
This will be used when promoting views into faster memory and casting to vector types.
In the process, linalg.view is moved to ODS.
PiperOrigin-RevId: 262556246
| |
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.outerproduct operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
PiperOrigin-RevId: 262552027
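A minimal sketch of the new op, assuming %0 and %1 are 1-D vectors of the shown types:
```
// Outer product of a vector<4xf32> and a vector<8xf32>, yielding a vector<4x8xf32>.
%2 = vector.outerproduct %0, %1 : vector<4xf32>, vector<8xf32>
```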
| |
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
PiperOrigin-RevId: 262545089
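A minimal sketch (index handling approximated), mirroring LLVM's extractelement:
```
// Extract the element at a dynamic index from a 1-D vector.
%c0 = constant 0 : i32
%1 = vector.extractelement %0[%c0 : i32] : vector<16xf32>
```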
| |
This CL is step 1/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the 3 instructions `llvm.extractelement`, `llvm.insertelement` and `llvm.shufflevector` as documented in the LLVM LangRef "Vector Instructions" section.
The "Experimental Vector Reduction Intrinsics" are left out for now and can be added in the future on a per-need basis.
Appropriate roundtrip and LLVM Target tests are added.
PiperOrigin-RevId: 262542095
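A rough sketch of the three instructions (pretty syntax and operand order approximated, using this era's wrapped LLVM dialect types):
```
// Dynamic extract/insert, then a shuffle with a constant mask.
%e = llvm.extractelement %v[%i : !llvm.i32] : !llvm<"<4 x float>">
%w = llvm.insertelement %x, %v[%i : !llvm.i32] : !llvm<"<4 x float>">
%s = llvm.shufflevector %v, %w [0 : i32, 1 : i32, 4 : i32, 5 : i32] : !llvm<"<4 x float>">, !llvm<"<4 x float>">
```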
| |
Introduce an operation that defines global constants and variables in the LLVM
dialect, to reflect the corresponding LLVM IR capability. This operation is
expected to live in the top-level module and behaves similarly to
llvm.constant. It currently does not model many of the attributes supported by
the LLVM IR for global values (memory space, alignment, thread-local, linkage)
and will be extended as the relevant use cases appear.
PiperOrigin-RevId: 262539445
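A sketch of the two forms at module scope (syntax approximated; the op predates its later renaming):
```
// A mutable global and a constant.
llvm.global @counter(0 : i32) : !llvm.i32
llvm.global constant @cst(42 : i32) : !llvm.i32
```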
| |
This adds support for fcmp to the LLVM dialect and adds any necessary lowerings, as well as support for EDSCs.
Closes tensorflow/mlir#69
PiperOrigin-RevId: 262475255
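A minimal sketch of the new comparison; the predicate spelling follows LLVM IR's fcmp:
```
// Ordered less-than comparison of two floats, yielding an i1.
%2 = llvm.fcmp "olt" %0, %1 : !llvm.float
```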
| |
LLVM function type has first-class support for variadic functions. In the
current lowering pipeline, it is emulated using an attribute on functions of
standard function type. In LLVMFuncOp that has LLVM function type, this can be
modeled directly. Introduce parsing support for variadic arguments to the
function and use it to support variadic function declarations in LLVMFuncOp.
Function definitions are currently not supported as that would require modeling
va_start/va_end LLVM intrinsics in the dialect and we don't yet have a
consistent story for LLVM intrinsics.
PiperOrigin-RevId: 262372651
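For example, a variadic declaration can now be expressed directly (sketch, era syntax approximated):
```
// Declaration only: variadic definitions are not yet supported.
llvm.func @printf(!llvm<"i8*">, ...) -> !llvm.i32
```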
| |
Now that modules are also operations, nothing prevents one from defining SSA
values in the module. Doing so in an implicit top-level module, i.e. outside
of a `module` operation, was leading to a crash because the implicit module was
not associated with an SSA name scope. Create a name scope before parsing the
top-level module to fix this.
PiperOrigin-RevId: 262366891
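A minimal sketch of input exercising this path, with an SSA value defined in the implicit top-level module (no explicit `module` wrapper):
```
// Previously crashed the parser; now covered by the top-level name scope.
%0 = constant 42 : i32
```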
| |
This CL introduces canonicalization patterns for linalg.dim.
This allows the dimensions of chains of view, slice and subview operations to be simplified.
Down the line, when mixed with cse, this also allows better composition of linalg tiling and fusion by tracking operations that give the same result (not in this CL).
PiperOrigin-RevId: 262365865
| |
http://llvm.org/docs/LangRef.html#unreachable-instruction
Closes tensorflow/mlir#64
PiperOrigin-RevId: 262301557
| |
affine.load/affine.store/std.load/std.store
Verification complained when using zero-dimensional memrefs in
affine.load, affine.store, std.load and std.store. This PR extends
verification so that those memrefs can be used.
Closes tensorflow/mlir#58
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/58 from dcaballe:dcaballe/zero-dim 49bcdcd45c52c48beca776431328e5ce551dfa9e
PiperOrigin-RevId: 262164916
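A minimal sketch of the now-accepted form, using an empty index list for a zero-dimensional memref:
```
%A = alloc() : memref<f32>
%0 = affine.load %A[] : memref<f32>
affine.store %0, %A[] : memref<f32>
```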
| |
PiperOrigin-RevId: 261962104
| |
via GreedyPatternRewriteDriver::replaceOp.
This fixes a bug where ops inside the parent op are visited even though the parent op has been removed.
PiperOrigin-RevId: 261953580
| |
PiperOrigin-RevId: 261944712
| |
This CL extends the Linalg GenericOp with an alternative way of specifying the body of the computation based on a single block region. The "fun" attribute becomes optional.
Either a SymbolRef "fun" attribute or a single block region must be specified to describe the side-effect-free computation. Upon lowering to loops, the new region body is inlined in the innermost loop.
The parser, verifier and pretty printer are extended.
Appropriate roundtrip, negative and lowering to loop tests are added.
PiperOrigin-RevId: 261895568
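A rough sketch of the region-based form (syntax, including the terminator, approximated; compare the `fun`-attribute form shown in the linalg.generic commit further down this log):
```
// Single-block region describing the scalar computation; "fun" is now optional.
linalg.generic #matmul_trait %A, %B, %C {
  ^bb0(%a: f32, %b: f32, %c: f32):
    %d = mulf %a, %b : f32
    %e = addf %c, %d : f32
    linalg.yield %e : f32
} : !linalg.view<?x?xf32>, !linalg.view<?x?xf32>, !linalg.view<?x?xf32>
```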
| |
This CL modifies the LowerLinalgToLoopsPass to use RewritePattern.
This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL.
PiperOrigin-RevId: 261894120
| |
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.
PiperOrigin-RevId: 261816316
| |
This op is not useful.
PiperOrigin-RevId: 261665736
| |
This trait provides the ensureTerminator() utility function and
the checks to make sure a spv.module is indeed terminated with
spv._module_end.
PiperOrigin-RevId: 261664153
| |
Similar to all LLVM dialect operations, llvm.func needs custom
syntax. Use the generic FunctionLike printer and parser to implement it.
PiperOrigin-RevId: 261641755
| |
The LLVM IR printer was changed in LLVM r367755: it now
prints value numbers for unnamed function arguments.
Closes tensorflow/mlir#67
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/67 from denis0x0D:sandbox/fix_mlir_translate ae46844e66f34a02e0cf86782ddadc5bce58b30d
PiperOrigin-RevId: 261640048
| |
This CL adds a new NonNegativeIntAttrBase class and two instantiations,
one for I32 and the other for I64.
PiperOrigin-RevId: 261513292
| |
This CL introduces a linalg.generic op to represent generic tensor contraction operations on views.
A linalg.generic operation requires a number of attributes that are sufficient to emit the computation in scalar form as well as to compute the appropriate subviews to enable tiling and fusion.
These attributes are very similar to the attributes of existing operations such as linalg.matmul, and existing operations can be implemented with the generic form.
In the future, most existing operations could be implemented using the generic form.
This CL starts by splitting out most of the functionality of the linalg::NInputsAndOutputs trait into a ViewTrait that queries the per-instance properties of the op. This allows using the attribute information.
This exposes a verifier-ordering issue where ViewTrait::verify uses attributes but the verifiers for those attributes have not yet run. The desired behavior would be for the verifiers of the attributes specified in the builder to execute first, but that is not the case at the moment. As a consequence, to emit proper error messages and avoid crashing, some of the
linalg.generic methods are defensive, as follows:
```
unsigned getNumInputs() {
  // This is redundant with the `n_views` attribute verifier, but the ordering
  // of verifiers may exhibit cases where we crash instead of emitting an
  // error message.
  if (!getAttr("n_views") || n_views().getValue().size() != 2)
    return 0;
  ...
}
```
In pretty-printed form, the specific attributes required for linalg.generic are factored out into an independent dictionary named "_". When parsing, its content is flattened and the "_" name is dropped. This allows using attribute aliasing to reduce boilerplate at each linalg.generic invocation while still benefiting from the Tablegen'd verifier for each named attribute in the dictionary.
For instance, implementing linalg.matmul in terms of linalg.generic resembles:
```
func @mac(%a: f32, %b: f32, %c: f32) -> f32 {
%d = mulf %a, %b: f32
%e = addf %c, %d: f32
return %e: f32
}
#matmul_accesses = [
(m, n, k) -> (m, k),
(m, n, k) -> (k, n),
(m, n, k) -> (m, n)
]
#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",
fun = @mac,
indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",
n_views = [2, 1],
n_loop_types = [2, 1, 0]
}
```
And can be used in multiple places as:
```
linalg.generic #matmul_trait %A, %B, %C [other-attributes] :
!linalg.view<?x?xf32>, !linalg.view<?x?xf32>, !linalg.view<?x?xf32>
```
In the future it would be great to have a mechanism to alias / register a new
linalg.op as a pair of linalg.generic, #trait.
Also note that one could theoretically specify only the `doc` string and parse all the attributes from it.
PiperOrigin-RevId: 261338740
| |
Add StdIndexedValue to EDSC helper so that we can use it
to generate std.load and std.store in EDSC.
Closes tensorflow/mlir#59
PiperOrigin-RevId: 261324965
| |
Explicit copying to contiguous buffers is a standard technique to avoid
conflict misses and TLB misses, and improve hardware prefetching
performance. When done in conjunction with cache tiling, it nearly
eliminates all cache conflict and TLB misses, and a single hardware
prefetch stream is needed per data tile.
- generalize/extend DMA generation pass (renamed data copying pass) to
perform either point-wise explicit copies to fast memory buffers or
DMAs (depending on a cmd line option). All logic is the same as
erstwhile -dma-generate.
- -affine-dma-generate is now renamed -affine-data-copy; when the -dma flag is
provided, DMAs are generated; otherwise, explicit (point-wise) copy loops are
generated by default.
- point-wise copying could be used for CPUs (or GPUs); some indicative
performance numbers with a "C" version of the MLIR code, compiled with
and without this optimization (about a 2x improvement here).
With a matmul on 4096^2 matrices on a single core of an Intel Skylake
i7-8700K with clang 8.0.0:
clang -O3: 518s
clang -O3 with MLIR tiling (128x128): 24.5s
clang -O3 with MLIR tiling + data copying: 12.4s
(code equivalent to test/Transforms/data-copy.mlir func @matmul)
- fix some misleading comments.
- change default fast-mem space to 0 (more intuitive now with the
default copy generation using point-wise copies instead of DMAs)
On a simple 3-d matmul loop nest, code generated with -affine-data-copy:
```
affine.for %arg3 = 0 to 4096 step 128 {
affine.for %arg4 = 0 to 4096 step 128 {
%0 = affine.apply #map0(%arg3, %arg4)
%1 = affine.apply #map1(%arg3, %arg4)
%2 = alloc() : memref<128x128xf32, 2>
// Copy-in Out matrix.
affine.for %arg5 = 0 to 128 {
%5 = affine.apply #map2(%arg3, %arg5)
affine.for %arg6 = 0 to 128 {
%6 = affine.apply #map2(%arg4, %arg6)
%7 = load %arg2[%5, %6] : memref<4096x4096xf32>
affine.store %7, %2[%arg5, %arg6] : memref<128x128xf32, 2>
}
}
affine.for %arg5 = 0 to 4096 step 128 {
%5 = affine.apply #map0(%arg3, %arg5)
%6 = affine.apply #map1(%arg3, %arg5)
%7 = alloc() : memref<128x128xf32, 2>
// Copy-in LHS.
affine.for %arg6 = 0 to 128 {
%11 = affine.apply #map2(%arg3, %arg6)
affine.for %arg7 = 0 to 128 {
%12 = affine.apply #map2(%arg5, %arg7)
%13 = load %arg0[%11, %12] : memref<4096x4096xf32>
affine.store %13, %7[%arg6, %arg7] : memref<128x128xf32, 2>
}
}
%8 = affine.apply #map0(%arg5, %arg4)
%9 = affine.apply #map1(%arg5, %arg4)
%10 = alloc() : memref<128x128xf32, 2>
// Copy-in RHS.
affine.for %arg6 = 0 to 128 {
%11 = affine.apply #map2(%arg5, %arg6)
affine.for %arg7 = 0 to 128 {
%12 = affine.apply #map2(%arg4, %arg7)
%13 = load %arg1[%11, %12] : memref<4096x4096xf32>
affine.store %13, %10[%arg6, %arg7] : memref<128x128xf32, 2>
}
}
// Compute.
affine.for %arg6 = #map7(%arg3) to #map8(%arg3) {
affine.for %arg7 = #map7(%arg4) to #map8(%arg4) {
affine.for %arg8 = #map7(%arg5) to #map8(%arg5) {
%11 = affine.load %7[-%arg3 + %arg6, -%arg5 + %arg8] : memref<128x128xf32, 2>
%12 = affine.load %10[-%arg5 + %arg8, -%arg4 + %arg7] : memref<128x128xf32, 2>
%13 = affine.load %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
%14 = mulf %11, %12 : f32
%15 = addf %13, %14 : f32
affine.store %15, %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
}
}
}
dealloc %10 : memref<128x128xf32, 2>
dealloc %7 : memref<128x128xf32, 2>
}
%3 = affine.apply #map0(%arg3, %arg4)
%4 = affine.apply #map1(%arg3, %arg4)
// Copy out result matrix.
affine.for %arg5 = 0 to 128 {
%5 = affine.apply #map2(%arg3, %arg5)
affine.for %arg6 = 0 to 128 {
%6 = affine.apply #map2(%arg4, %arg6)
%7 = affine.load %2[%arg5, %arg6] : memref<128x128xf32, 2>
store %7, %arg2[%5, %6] : memref<4096x4096xf32>
}
}
dealloc %2 : memref<128x128xf32, 2>
}
}
```
With -affine-data-copy -dma:
```
affine.for %arg3 = 0 to 4096 step 128 {
%0 = affine.apply #map3(%arg3)
%1 = alloc() : memref<128xf32, 2>
%2 = alloc() : memref<1xi32>
affine.dma_start %arg2[%arg3], %1[%c0], %2[%c0], %c128_0 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
affine.dma_wait %2[%c0], %c128_0 : memref<1xi32>
%3 = alloc() : memref<1xi32>
affine.for %arg4 = 0 to 4096 step 128 {
%5 = affine.apply #map0(%arg3, %arg4)
%6 = affine.apply #map1(%arg3, %arg4)
%7 = alloc() : memref<128x128xf32, 2>
%8 = alloc() : memref<1xi32>
affine.dma_start %arg0[%arg3, %arg4], %7[%c0, %c0], %8[%c0], %c16384, %c4096, %c128_2 : memref<4096x4096xf32>, memref<128x128xf32, 2>, memref<1xi32>
affine.dma_wait %8[%c0], %c16384 : memref<1xi32>
%9 = affine.apply #map3(%arg4)
%10 = alloc() : memref<128xf32, 2>
%11 = alloc() : memref<1xi32>
affine.dma_start %arg1[%arg4], %10[%c0], %11[%c0], %c128_1 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
affine.dma_wait %11[%c0], %c128_1 : memref<1xi32>
affine.for %arg5 = #map3(%arg3) to #map5(%arg3) {
affine.for %arg6 = #map3(%arg4) to #map5(%arg4) {
%12 = affine.load %7[-%arg3 + %arg5, -%arg4 + %arg6] : memref<128x128xf32, 2>
%13 = affine.load %10[-%arg4 + %arg6] : memref<128xf32, 2>
%14 = affine.load %1[-%arg3 + %arg5] : memref<128xf32, 2>
%15 = mulf %12, %13 : f32
%16 = addf %14, %15 : f32
affine.store %16, %1[-%arg3 + %arg5] : memref<128xf32, 2>
}
}
dealloc %11 : memref<1xi32>
dealloc %10 : memref<128xf32, 2>
dealloc %8 : memref<1xi32>
dealloc %7 : memref<128x128xf32, 2>
}
%4 = affine.apply #map3(%arg3)
affine.dma_start %1[%c0], %arg2[%arg3], %3[%c0], %c128 : memref<128xf32, 2>, memref<4096xf32>, memref<1xi32>
affine.dma_wait %3[%c0], %c128 : memref<1xi32>
dealloc %3 : memref<1xi32>
dealloc %2 : memref<1xi32>
dealloc %1 : memref<128xf32, 2>
}
```
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#50
PiperOrigin-RevId: 261221903
| |
This CL extends the existing spv.constant op to also support
specialization constants by adding an extra unit attribute
on it.
PiperOrigin-RevId: 261194869
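A minimal sketch of the two pretty forms (keyword placement approximated):
```
// Regular constant vs. specialization constant.
%0 = spv.constant 1.0 : f32
%1 = spv.constant spec 1.0 : f32
```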
| |
Add binary logical operations from spec section 3.32.15:
OpIEqual, OpINotEqual, OpUGreaterThan, OpSGreaterThan,
OpUGreaterThanEqual, OpSGreaterThanEqual, OpULessThan, OpSLessThan,
OpULessThanEqual, OpSLessThanEqual.
Closes tensorflow/mlir#61
PiperOrigin-RevId: 261181281
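A sketch of two of the new ops in pretty form (syntax approximated):
```
// Integer comparisons yielding i1 results.
%2 = spv.IEqual %0, %1 : i32
%3 = spv.SLessThan %0, %1 : i32
```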
| |
verifyUnusedValue is a bit strange given that it is specified in a
result pattern but used to generate match statements. Now that we are
able to support multi-result ops better, we can retire it and replace
it with a HasNoUseOf constraint. This reduces the number of mechanisms.
PiperOrigin-RevId: 261166863
| |
PiperOrigin-RevId: 261045611
| |
We allow generating more ops than are needed for replacing
the matched root op. Only the last N static values generated are
used as replacement; the others serve as auxiliary ops/values for
building the replacement.
With the introduction of multi-result op support, an op, if used
as a whole, may be used to replace multiple static values of
the matched root op. We need to consider this when calculating
the result range a generated op is to replace.
For example, we can have the following pattern:
```tblgen
def : Pattern<(ThreeResultOp ...),
[(OneResultOp ...), (OneResultOp ...), (OneResultOp ...)]>;
// Two ops to replace all three results
def : Pattern<(ThreeResultOp ...),
[(TwoResultOp ...), (OneResultOp ...)]>;
// One op to replace all three results
def : Pat<(ThreeResultOp ...), (ThreeResultOp ...)>;
def : Pattern<(ThreeResultOp ...),
[(AuxiliaryOp ...), (ThreeResultOp ...)]>;
```
PiperOrigin-RevId: 261017235
| |
Previously we used a single method with lots of branches to
generate multiple builders. This made the method difficult
to follow and modify. This CL splits the method into multiple
dedicated ones, by extracting common logic into helper methods
while leaving logic specific to each builder in their own
methods.
PiperOrigin-RevId: 261011082
| |
During serialization, the operand number must be used to get the
values associated with an operand. Using the argument number in the Op
specification was wrong since some of the elements in the arguments
list might be attributes on the operation. This resulted in a segfault
during serialization.
Add a test that exercises that path.
PiperOrigin-RevId: 260977758
| |
Add binary operations such as: OpUDiv, OpSDiv, OpUMod, OpSRem, OpSMod.
Closes tensorflow/mlir#56
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/56 from denis0x0D:sandbox/bin_ops_int 4959325a693b4658b978a8b97f79b8237eb39764
PiperOrigin-RevId: 260961681
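A sketch of two of the new ops in pretty form (syntax approximated):
```
// Unsigned division and signed remainder on 32-bit integers.
%2 = spv.UDiv %0, %1 : i32
%3 = spv.SRem %0, %1 : i32
```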
| |
Extend the recently introduced support for hexadecimal float literals to tensor
literals, which may also contain special floating point values such as
infinities and NaNs.
Modify TensorLiteralParser to store the list of tokens representing values
until the type is parsed instead of trying to guess the tensor element type
from the token kinds (hexadecimal values can be either integers or floats, and
can be mixed with both). Maintain the error reports as close as possible to
the existing implementation to avoid disturbing the tests. They can be
improved in a separate clean-up if deemed necessary.
PiperOrigin-RevId: 260794716
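For instance, a tensor literal mixing a hexadecimal bit pattern with a decimal value might look like this (op name hypothetical, attribute syntax approximated):
```
// 0x7FC00000 is the bit pattern of a quiet NaN in f32.
"foo.op"() {values = dense<[0x7FC00000, 1.0]> : tensor<2xf32>} : () -> ()
```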
| |
All non-argument attributes specified for an operation are treated as
decorations on the result value and (de)serialized using the OpDecorate
instruction. An error is generated if an attribute is not an argument
and its name doesn't correspond to a Decoration enum. The names of
attributes that represent decorations are expected to be the
snake-case version of the Decoration name.
Add utility methods to convert to snake-case and camel-case.
PiperOrigin-RevId: 260792638
| |
MLIR does not have support for parsing special floating point values such as
infinities and NaNs. If programmatically constructed, these values are printed
as NaN and (+-)Inf and cannot be parsed back. Add parser support for
hexadecimal literals in float attributes, following LLVM IR. The literal
corresponds to the in-memory representation of the floating point value.
IEEE 754 defines a range of possible values for NaNs; storing the bitwise
representation allows MLIR to properly roundtrip NaNs with different bit values
of significands.
The initial version of this commit was missing support for float literals that
used to be printed in decimal notation as a fallback, but ended up being
printed in hexadecimal format, which became the fallback for special values.
The decimal fallback behavior was not exercised by tests. It is currently
reinstated and tested by the newly added test @f32_potential_precision_loss in
parser.mlir.
PiperOrigin-RevId: 260790900
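A minimal sketch of such an attribute (op name hypothetical); the hexadecimal literal is the in-memory representation of the value:
```
// Bit pattern of +infinity in f32.
"foo.op"() {value = 0x7F800000 : f32} : () -> ()
```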
| |
This CL adds an initial implementation for translating a kernel
function in the GPU dialect (used with a gpu.launch_kernel op) to a
spv.Module. The original function is translated into an entry
function.
Most of the heavy lifting is done by adding TypeConversion and other
utility functions/classes that provide most of the functionality to
translate from Standard Dialect to SPIR-V Dialect. These are intended
to be reusable in implementation of different dialect conversion
pipelines.
Note: some of the files have been renamed to be consistent with
the convention used by the other Conversion frameworks.
PiperOrigin-RevId: 260759165
| |
Add binary operations such as: OpIAdd, OpFAdd, OpISub, OpFSub, OpIMul,
OpFDiv, OpFRem, OpFMod.
Closes tensorflow/mlir#54
PiperOrigin-RevId: 260734166
| |
RewriterGen was emitting invalid C++ code if the pattern required creating a
zero-result operation due to the absence of a special case that would avoid
generating a spurious comma. Handle this case. Also add rewriter tests for
zero-argument operations.
PiperOrigin-RevId: 260576998
| |
The code was written with the assumption that on failure an error would be
issued by another verifier. However, verification stops on the first
failure, which led to empty output. Instead, we make sure an error is
displayed.
Also add tests in the test dialect for this trait.
PiperOrigin-RevId: 260541290
| |
Automatic generation of spirv::AccessChainOp (de)serialization needs
the (de)serialization emitters to handle arguments specified as
Variadic<...>. To handle this correctly, such an argument can only be
the last entry in the arguments list.
Add a test to (de)serialize spirv::AccessChainOp.
PiperOrigin-RevId: 260532598
| |
operation (NFC)
PiperOrigin-RevId: 260532592
| |
definitions
This allows classes to refer to each other in the ODS file, for instance for traits.
PiperOrigin-RevId: 260532419
| |
dimension or symbol identifiers.
PiperOrigin-RevId: 260197567
| |
PiperOrigin-RevId: 260136255
| |
This CL adds a few specializations for sgemm.
A minor change to alpha is made in cblas_interface.cpp to be compatible with actual BLAS calls.
For now this is for internal testing purposes only.
PiperOrigin-RevId: 260129027
| |
It's quite common that we want to put further constraints on the matched
multi-result op's specific results. This CL enables referencing symbols
bound to the source op with the `__N` syntax.
PiperOrigin-RevId: 260122401
| |
In the backward slice computation, BlockArguments coming from function arguments represent a natural boundary for the traversal and should not trigger llvm_unreachable.
This CL also improves the error message and adds a relevant test.
PiperOrigin-RevId: 260118630