path: root/mlir/test/Transforms/Vectorize
Commit message (Author, Age, Files, Lines)
* [mlir] Change the syntax of AffineMapAttr and IntegerSetAttr to avoid conflicts with function types. (River Riddle, 2020-01-13, 8 files, -86/+86)
  Summary: The current syntax for AffineMapAttr and IntegerSetAttr conflicts with function types, making it currently impossible to round-trip function types (and e.g. FuncOp) in the IR. This revision changes the syntax for the attributes by wrapping them in a keyword: AffineMapAttr is wrapped with `affine_map<>` and IntegerSetAttr is wrapped with `affine_set<>`.
  Reviewed By: nicolasvasilache, ftynse
  Differential Revision: https://reviews.llvm.org/D72429
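  A minimal before/after sketch of the new wrapping (the #map0/#set0 alias names are illustrative):
  ```mlir
  // Old: the bare map syntax is ambiguous with function types.
  #map0 = (d0, d1) -> (d0 + d1)
  // New: the map is wrapped in an affine_map<> keyword.
  #map0 = affine_map<(d0, d1) -> (d0 + d1)>
  // Integer sets get the same treatment.
  #set0 = affine_set<(d0) : (d0 - 1 >= 0)>
  ```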
* More affine expr simplifications for floordiv and mod (Uday Bondhugula, 2019-12-10, 1 file, -2/+2)
  Add one more simplification for floordiv and mod affine expressions. Examples:
    (2*d0 + 1) floordiv 2 is simplified to d0
    (8*d0 + 4*d1 + d2) floordiv 4 is simplified to 4*d0 + d1 + d2 floordiv 4
  Similarly, (4*d1 + 1) mod 2 is simplified to 1, and (2*d0 + 8*d1) mod 8 is simplified to 2*d0 mod 8.
  Change getLargestKnownDivisor to return int64_t to be consistent and to avoid casting at call sites (since the return value is used in expressions of int64_t/index type).
  Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
  Closes tensorflow/mlir#202
  COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/202 from bondhugula:affine b13fcb2f1c00a39ca5434613a02408e085a80e77
  PiperOrigin-RevId: 284866710
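  In IR terms, such a simplification lets an affine.apply fold to the identity; a minimal sketch (written with the later affine_map<> spelling from the commit above, which postdates this change):
  ```mlir
  // Before: the apply computes (2 * d0 + 1) floordiv 2 ...
  %r0 = affine.apply affine_map<(d0) -> ((2 * d0 + 1) floordiv 2)>(%i)
  // ... which the simplifier now reduces to the identity map:
  %r1 = affine.apply affine_map<(d0) -> (d0)>(%i)
  ```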
* Drop MaterializeVectorTransfers in favor of simpler declarative unrolling (Nicolas Vasilache, 2019-12-04, 4 files, -387/+0)
  Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.
  PiperOrigin-RevId: 283806880
* Refactor the LowerVectorTransfers pass to use the RewritePattern infra - NFC (Nicolas Vasilache, 2019-11-14, 1 file, -213/+0)
  This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering.
  PiperOrigin-RevId: 280529784
* Move VectorOps to Tablegen - (almost) NFC (Nicolas Vasilache, 2019-11-14, 7 files, -34/+43)
  This CL moves VectorOps to Tablegen and cleans up the implementation. This is almost NFC, but two changes occur:
  1. An interface change occurs in the padding value specification in vector_transfer_read: the value becomes non-optional. As a shortcut we currently use %f0 for all paddings. This should become an OpInterface for vectorization in the future.
  2. The return type of vector.type_cast is trivial and simplified to `memref<vector<...>>`.
  Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect. The op documentation is moved to the .td file.
  PiperOrigin-RevId: 280430869
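  A minimal sketch of the now-mandatory padding operand (using the op's present-day vector.transfer_read spelling; names and the permutation map are illustrative):
  ```mlir
  %f0 = constant 0.0 : f32
  // The padding value (%f0 here) must now always be supplied:
  %v = vector.transfer_read %A[%i, %j], %f0
         {permutation_map = affine_map<(d0, d1) -> (d0, d1)>}
         : memref<?x?xf32>, vector<8x8xf32>
  ```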
* Lower vector transfer ops to loop.for operations. (Nicolas Vasilache, 2019-10-18, 1 file, -11/+19)
  This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.
  PiperOrigin-RevId: 275543361
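  For reference, a minimal sketch of the loop.for construct targeted here (later renamed scf.for); unlike affine.for, its bounds and step are ordinary index-typed SSA values:
  ```mlir
  %c0 = constant 0 : index
  %c1 = constant 1 : index
  %ub = dim %A, 0 : memref<?xf32>
  loop.for %i = %c0 to %ub step %c1 {
    // non-affine loop body
  }
  ```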
* Use "standard" load and stores in LowerVectorTransfersNicolas Vasilache2019-07-261-4/+4
| | | | | | | Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store. In the future we may also want a fill with the neutral element rather than clip, this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies. PiperOrigin-RevId: 260110503
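  The distinction in IR terms, as a minimal sketch (SSA names are illustrative):
  ```mlir
  // affine.load requires indices that are affine functions of dims/symbols:
  %a = affine.load %A[%i + 1] : memref<?xf32>
  // A clipped index comes from a select, which affine.load cannot accept,
  // so a standard load is used instead:
  %k = select %cond, %i, %max : index
  %b = load %A[%k] : memref<?xf32>
  ```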
* EDSC: use affine.load/store instead of std.load/store (Alex Zinenko, 2019-07-12, 1 file, -4/+4)
  Standard load and store operations are evolving to be separated from the Affine constructs. Special affine.load/store have been introduced to uphold the restrictions of the Affine control flow constructs on their operands. EDSC-produced loads and stores were originally intended to uphold those restrictions as well, so they should use affine.load/store instead of std.load/store.
  PiperOrigin-RevId: 257443307
* Standardize the value numbering in the AsmPrinter. (River Riddle, 2019-07-09, 12 files, -228/+228)
  Change the AsmPrinter to number values breadth-first so that values in adjacent regions can have the same name. This allows for ModuleOp to contain operations that produce results. This also standardizes the special name of region entry arguments to "arg[0-9]+" now that Functions are also operations.
  PiperOrigin-RevId: 257225069
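  A minimal sketch of the standardized naming (function and value names are illustrative):
  ```mlir
  func @example(%arg0: f32, %arg1: f32) {
    // Entry arguments use the standardized %arg<N> names; op results
    // are numbered (%0, %1, ...) breadth-first across regions.
    %0 = addf %arg0, %arg1 : f32
    return
  }
  ```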
* Globally change load/store/dma_start/dma_wait operations over to affine.load/store/dma_start/dma_wait. (Andy Davis, 2019-07-03, 9 files, -95/+82)
  In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands). Significant code changes occur here:
  *) Vectorization: LoopAnalysis.cpp, Vectorize.cpp
  *) Affine Transforms: Transforms/Utils/Utils.cpp
  PiperOrigin-RevId: 256395088
* Change the attribute dictionary syntax to separate name and value with '='. (River Riddle, 2019-06-25, 12 files, -106/+106)
  The current syntax separates the name and value with ':', but ':' is already overloaded by several other things (e.g. trailing types). This makes the syntax difficult to parse in some situations:
    Old: "foo: 10 : i32"
    New: "foo = 10 : i32"
  PiperOrigin-RevId: 255097928
* Modify the syntax of the ElementsAttrs to print the type as a colon type. (River Riddle, 2019-06-25, 8 files, -36/+36)
  This is the standard syntax for types on operations, and is also already used by IntegerAttr and FloatAttr. Example:
    dense<5> : tensor<i32>
    dense<[3]> : tensor<1xi32>
  PiperOrigin-RevId: 255069157
* Refactor SplatElementsAttr to inherit from DenseElementsAttr as opposed to being a separate Attribute type. (River Riddle, 2019-06-19, 8 files, -42/+42)
  DenseElementsAttr provides a better internal representation for splat values as well as a better API for accessing elements.
  PiperOrigin-RevId: 253138287
* Add support for saving and restoring the insertion point of a FuncBuilder. (River Riddle, 2019-05-20, 1 file, -1/+1)
  This also updates the edsc::ScopedContext to use a single builder that saves/restores insertion points. This is necessary for using edscs within RewritePatterns.
  PiperOrigin-RevId: 248812645
* Simplify the parser/printer of ConstantOp now that all attributes have types. (River Riddle, 2019-05-10, 8 files, -36/+36)
  This has the added benefit of removing type redundancy from the pretty form. As a consequence, IntegerAttr/FloatAttr will now always print the type, even if it is i64/f64.
  PiperOrigin-RevId: 247295828
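  A minimal sketch of the resulting printed form (values are illustrative):
  ```mlir
  // The attribute type is now always printed, even the former defaults i64/f64:
  %c = constant 42 : i64
  %f = constant 1.0 : f64
  ```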
* Verify that attribute type and constant op return type match. (Jacques Pienaar, 2019-05-10, 1 file, -2/+2)
  PiperOrigin-RevId: 247263129
* Prepend an "affine-" prefix to Affine pass option names - NFCNicolas Vasilache2019-05-0614-16/+16
| | | | | | | | | | | Trying to activate both LLVM and MLIR passes in mlir-cpu-runner showed name collisions when registering pass names. One possible way of disambiguating that should also work across dialects is to prepend the dialect name to the passes that specifically operate on that dialect. With this CL, mlir-cpu-runner tests still run when both LLVM and MLIR passes are registered -- PiperOrigin-RevId: 246539917
* Apply patterns repeatedly if the function is modified (Feng Liu, 2019-04-23, 1 file, -5/+3)
  During pattern rewriting, if the function is changed (i.e. ops are created, deleted, or swapped), the pattern rewriter needs to re-scan the function entirely and apply the patterns again, so that patterns whose root ops have been popped off the worklist, as well as patterns matching immediate users of the changed ops, can be reconsidered.
  A command line flag is added to set the max number of iterations for rescanning the function for pattern matching. If the rewrite doesn't converge after this number, compilation will continue and the result may be sub-optimal.
  One unit test is updated because this change fixed missing optimization opportunities.
  PiperOrigin-RevId: 244754190
* Fix test that fails on non-determinism in LowerVectorTransfers (Nicolas Vasilache, 2019-04-03, 1 file, -27/+17)
  This CL fixes the non-determinism across compilers in an edsc::select expression used in LowerVectorTransfers. This is achieved by factoring the expression out of the function call to ensure a deterministic order of evaluation. Since the expression is now factored out, less IR is generated and the test is updated accordingly.
  PiperOrigin-RevId: 241679962
* Remove MLPatternLoweringPass and rewrite LowerVectorTransfers to use RewritePattern instead. (River Riddle, 2019-04-02, 1 file, -1/+1)
  PiperOrigin-RevId: 241455472
* Cleanup SuperVectorization dialect printing and parsing. (Nicolas Vasilache, 2019-03-29, 11 files, -155/+155)
  On the read side,
  ```
  %3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: (d0, d1, d2)->(d2, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
  ```
  becomes:
  ```
  %3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32>
  ```
  On the write side,
  ```
  vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
  ```
  becomes
  ```
  vector_transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>
  ```
  Documentation will be cleaned up in a followup commit that also extracts a proper .md from the top of the file comments.
  PiperOrigin-RevId: 241021879
* Refactor vectorization patterns (Nicolas Vasilache, 2019-03-29, 1 file, -6/+57)
  This CL removes the reliance of the vectorize pass on the specification of a `fastestVaryingDim` parameter. This parameter is a restriction meant to more easily target a particular loop/memref combination for vectorization and is mainly used for testing. It also had the side-effect of restricting vectorization patterns to only the ones in which all memrefs were contiguous along the same loop dimension. This simple restriction prevented matmul from vectorizing in 2-D. This CL removes the restriction and adds a matmul test which vectorizes in 2-D along the parallel loops. Support for reduction loops is left for future work.
  PiperOrigin-RevId: 240993827
* Change the vectorizer test pass to output via diagnostics instead of llvm::outs. (River Riddle, 2019-03-29, 2 files, -3/+3)
  This allows for the output to be deterministic when multi-threading is enabled.
  PiperOrigin-RevId: 240905858
* Change the multi-return syntax for operations. (River Riddle, 2019-03-29, 1 file, -2/+2)
  The name of the operation result now contains the number of results that it refers to if the number of results is greater than 1. Example:
    %call:2 = call @multi_return() : () -> (f32, i32)
    use(%call#0, %call#1)
  This CL also adds parser support for uniquely named result values. This means that a test writer can now write something like:
    %foo, %bar = call @multi_return() : () -> (f32, i32)
    use(%foo, %bar)
  Note: The printer will still print the collapsed form.
  PiperOrigin-RevId: 240860058
* Cleanup vectorize_1d.mlir test - NFC (Nicolas Vasilache, 2019-03-29, 1 file, -82/+247)
  This CL splits a large monolithic test function into smaller ones that are each CHECK-LABEL'd.
  PiperOrigin-RevId: 240684979
* Make vectorization aware of loop semantics (Nicolas Vasilache, 2019-03-29, 1 file, -0/+16)
  Now that we have a dependence analysis, we can check that loops are indeed parallel and make vectorization correct.
  PiperOrigin-RevId: 240682727
* NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for' and set the namespace of the AffineOps dialect to 'affine'. (River Riddle, 2019-03-29, 12 files, -192/+192)
  PiperOrigin-RevId: 240165792
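  For reference, a minimal sketch of the renamed op:
  ```mlir
  // Formerly spelled 'for'; now namespaced under 'affine':
  affine.for %i = 0 to 10 step 2 {
    // loop body
  }
  ```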
* NFC: Rename the 'if' operation in the AffineOps dialect to 'affine.if'. (River Riddle, 2019-03-29, 1 file, -1/+1)
  PiperOrigin-RevId: 240071154
* Support composition of symbols in AffineApplyOp (Nicolas Vasilache, 2019-03-29, 1 file, -9/+9)
  This CL revisits the composition of AffineApplyOp for the special case where a symbol itself comes from an AffineApplyOp. This is achieved by rewriting such symbols into dims to allow composition to occur mathematically. The implementation is also refactored to improve readability.
  Rationale for locally rewriting symbols as dims:
  The mathematical composition of AffineMap must always concatenate symbols because it does not have enough information to do otherwise. For example, composing `(d0)[s0] -> (d0 + s0)` with itself must produce `(d0)[s0, s1] -> (d0 + s0 + s1)`. The result is only equivalent to `(d0)[s0] -> (d0 + 2 * s0)` when applied to the same mlir::Value* for both s0 and s1. As a consequence, mathematical composition of AffineMap always concatenates symbols.
  When AffineMaps are used in AffineApplyOp, however, they may specify composition via symbols, which is ambiguous mathematically. This corner case is handled by locally rewriting such symbols that come from AffineApplyOp into dims and composing through dims.
  PiperOrigin-RevId: 239791597
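  A minimal IR-level sketch of the case being handled (written in the later affine_map<> syntax; SSA names are illustrative):
  ```mlir
  // %s is itself the result of an AffineApplyOp...
  %s = affine.apply affine_map<(d0) -> (d0 + 1)>(%i)
  // ...and is then used as a symbol in another apply:
  %r = affine.apply affine_map<(d0)[s0] -> (d0 + s0)>(%j)[%s]
  // Rewriting the symbol as a dim lets the two maps compose into one:
  %r2 = affine.apply affine_map<(d0, d1) -> (d0 + d1 + 1)>(%j, %i)
  ```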
* Port LowerVectorTransfers from EDSC + AST to declarative builders (Nicolas Vasilache, 2019-03-29, 1 file, -43/+60)
  This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs, which will be retired. This exhibited a pretty fundamental staging difference in AST-based vs declarative-based emission. Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed. This now changes: the loops first need to be created declaratively in fixed order, and then the clipping expressions are created. Also, due to the lack of staging, coalescing cannot be done on the fly anymore and needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work). Tests are updated accordingly.
  PiperOrigin-RevId: 238971631
* Automated rollback of changelist 232728977. (Uday Bondhugula, 2019-03-29, 1 file, -1/+1)
  PiperOrigin-RevId: 232944889
* Automated rollback of changelist 232717775. (Uday Bondhugula, 2019-03-29, 12 files, -192/+192)
  PiperOrigin-RevId: 232807986
* Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace the AffineOps dialect with 'affine'. (River Riddle, 2019-03-29, 1 file, -1/+1)
  PiperOrigin-RevId: 232728977
* NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. (River Riddle, 2019-03-29, 12 files, -192/+192)
  This is the second step to adding a namespace to the AffineOps dialect.
  PiperOrigin-RevId: 232717775
* NFC: Rename affine_apply to affine.apply. (River Riddle, 2019-03-29, 6 files, -179/+179)
  This is the first step to adding a namespace to the affine dialect.
  PiperOrigin-RevId: 232707862
* Update tests using affine maps to not rely on specific map numbers in the output IR. (River Riddle, 2019-03-29, 1 file, -1/+1)
  This is necessary to remove the dependency on ForInst not numbering the AffineMap bounds for which it has custom formatting.
  PiperOrigin-RevId: 231634812
* More updates of tests to move towards single-result affine maps. (Chris Lattner, 2019-03-29, 1 file, -21/+25)
  PiperOrigin-RevId: 230991929
* AffineExpr pretty print - add missing handling to print expr * -1 as -expr (Uday Bondhugula, 2019-03-29, 1 file, -1/+1)
  - Print multiplication by -1 as unary negate; expressions like s0 * -1 and d0 * -1 + d1 will now appear as -s0 and -d0 + d1, respectively.
  - A minor cleanup while on printAffineExprInternal.
  PiperOrigin-RevId: 230222151
* Fix improperly indexed DimOp in LowerVectorTransfers.cpp (Nicolas Vasilache, 2019-03-29, 1 file, -0/+25)
  This CL fixes a misunderstanding in how to build DimOp which triggered execution issues in the CPU path.
  The problem is that, given a `memref<?x4x?x8x?xf32>`, the expressions to construct the dynamic dimensions should be:
    `dim %arg, 0 : memref<?x4x?x8x?xf32>`
    `dim %arg, 2 : memref<?x4x?x8x?xf32>`
    `dim %arg, 4 : memref<?x4x?x8x?xf32>`
  Before this CL, we would construct:
    `dim %arg, 0 : memref<?x4x?x8x?xf32>`
    `dim %arg, 1 : memref<?x4x?x8x?xf32>`
    `dim %arg, 2 : memref<?x4x?x8x?xf32>`
  and expect the other dimensions to be constants. This assumption seems consistent at first glance with the syntax of alloc:
  ```
  %tensor = alloc(%M, %N, %O) : memref<?x4x?x8x?xf32>
  ```
  But this was actually incorrect.
  This CL also makes the relevant functions available to EDSCs and removes duplication of the incorrect function.
  PiperOrigin-RevId: 229622766
* Fix typo in lower_vector_transfers.mlir (Nicolas Vasilache, 2019-03-29, 1 file, -2/+2)
  PiperOrigin-RevId: 229010160
* [MLIR] Clip all access dimensions during LowerVectorTransfers (Nicolas Vasilache, 2019-03-29, 2 files, -79/+118)
  This CL adds a short-term remedy to an issue that was found during execution tests.
  Lowering of vector transfer ops uses the permutation map to determine which ForInst have been super-vectorized. During materialization to HW vector sizes, however, some of those dimensions may be fully unrolled and do not appear in the permutation map. Such dimensions were then not clipped and may have accessed out of bounds.
  This CL conservatively clips all dimensions to ensure no out-of-bounds access.
  The longer-term solution is still up for debate but will probably require either passing more information between Materialization and lowering, or just merging the two passes.
  PiperOrigin-RevId: 228980787
* Uniformize composition of AffineApplyOp by construction (Nicolas Vasilache, 2019-03-29, 5 files, -74/+67)
  This CL is the 5th on the path to simplifying AffineMap composition. This removes the distinction between normalized single-result AffineMap and more general composed multi-result maps.
  One nice byproduct of making the implementation driven by single-result is that the multi-result extension is a trivial change: the implementation is still single-result and we just use:
  ```
  unsigned idx = getIndexOf(...);
  map.getResult(idx);
  ```
  This CL also fixes an AffineNormalizer implementation issue related to symbols. Namely, it stops performing substitutions on symbols in AffineNormalizer and instead concatenates them all, to be consistent with the call to `AffineMap::compose(AffineMap)`. This latter call to `compose` cannot perform simplifications of symbols coming from different maps based on positions only: i.e. dims are applied and renumbered but symbols must be concatenated. The only way to determine whether symbols from different AffineApply are the same is to look at the concrete values. canonicalizeMapAndOperands is thus extended with behavior to support replacing operands that appear multiple times.
  Lastly, this CL demonstrates that the implementation is correct by rewriting ComposeAffineMaps using only `makeComposedAffineApply`. The implementation uses a matcher because AffineApplyOp are introduced as composed operations on the fly instead of iteratively forward-substituting. For this purpose, a walker would revisit freshly introduced AffineApplyOp. Regardless, ComposeAffineMaps is scheduled to disappear; this CL replaces the implementation based on iterative `forwardSubstitute` by a composed-by-construction `makeComposedAffineApply`. Remaining calls to `forwardSubstitute` will be removed in the next CL.
  PiperOrigin-RevId: 228830443
* [MLIR] Make SuperVectorization use normalized AffineApplyOp (Nicolas Vasilache, 2019-03-29, 4 files, -201/+246)
  Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL uses the simpler single-result unbounded AffineApplyOp in the MaterializeVectors pass.
  PiperOrigin-RevId: 228469085
* Introduce AffineMap::compose(AffineMap) (Nicolas Vasilache, 2019-03-29, 1 file, -3/+10)
  This CL is the 2nd on the path to simplifying AffineMap composition. This CL uses the now-accepted `AffineExpr::compose(AffineMap)` to implement `AffineMap::compose(AffineMap)`. Implications of keeping the simplification function in Analysis are documented where relevant.
  PiperOrigin-RevId: 228276646
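  As a worked sketch of what composition computes (maps are illustrative, written in the later affine_map<> alias notation; f.compose(g) applies f to the results of g):
  ```mlir
  #g  = affine_map<(d0) -> (d0, 2 * d0)>
  #f  = affine_map<(d0, d1) -> (d0 + d1)>
  // f.compose(g) substitutes g's results for f's dims:
  //   (d0) -> (d0 + 2 * d0), which simplifies to:
  #fg = affine_map<(d0) -> (3 * d0)>
  ```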
* Introduce AffineExpr::compose(AffineMap) (Nicolas Vasilache, 2019-03-29, 1 file, -0/+7)
  This CL is the 1st on the path to simplifying AffineMap composition. This CL uses the now-accepted AffineExpr.replaceDimsAndSymbols to implement `AffineExpr::compose(AffineMap)`. Arguably, `simplifyAffineExpr` should be part of IR and not Analysis, but this CL does not yet pull the trigger on that.
  PiperOrigin-RevId: 228265845
* Iterate on vector rather than DenseMap during AffineMap normalization (Nicolas Vasilache, 2019-03-29, 1 file, -3/+3)
  This CL removes a flakiness associated with the unspecified iteration order of DenseMap iterators when normalizing AffineMap.
  PiperOrigin-RevId: 228160074
* [MLIR] Introduce normalized single-result unbounded AffineApplyOp (Nicolas Vasilache, 2019-03-29, 1 file, -0/+61)
  Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL introduces a simpler abstraction and composition of single-result unbounded AffineApplyOp by using the existing unbound AffineMap composition.
  This CL adds a simple API call and relevant tests:
  ```c++
  OpPointer<AffineApplyOp> makeNormalizedAffineApply(
      FuncBuilder *b, Location loc, AffineMap map, ArrayRef<Value*> operands);
  ```
  which creates a single-result unbounded AffineApplyOp. The operands of AffineApplyOp are not themselves results of AffineApplyOp by construction. This represents the simplest possible interface to complement the composition of (mathematical) AffineMap, for the cases when we are interested in applying it to Value*.
  In this CL the composed AffineMap is not compressed (i.e. there exist operands that are not part of the result). A followup commit will compress to normal form.
  The single-result unbounded AffineApplyOp abstraction will be used in a followup CL to support the MaterializeVectors pass.
  PiperOrigin-RevId: 227879021
* Eliminate extfunc/cfgfunc/mlfunc as a concept, and just use 'func' instead. (Chris Lattner, 2019-03-29, 13 files, -34/+34)
  The entire compiler now looks at structural properties of the function (e.g. does it have one block, does it contain an if/for stmt, etc.), so the only thing holding up this difference is round-tripping through the parser/printer syntax. Removing this shrinks the compiler by ~140 LOC.
  This is step 31/n towards merging instructions and statements. The last step is updating the docs, which I will do as a separate patch in order to split it from this mostly mechanical patch.
  PiperOrigin-RevId: 227540453
* [MLIR] Sketch a simple set of EDSCs to declaratively write MLIR (Nicolas Vasilache, 2019-03-29, 1 file, -39/+104)
  This CL introduces a simple set of Embedded Domain-Specific Components (EDSCs) in MLIR components:
  1. a `Type` system of shell classes that closely matches the MLIR type system. These types are subdivided into `Bindable` leaf expressions and non-bindable `Expr` expressions;
  2. an `MLIREmitter` class whose purpose is to:
     a. maintain a map of `Bindable` leaf expressions to concrete SSAValue*;
     b. provide helper functionality to specify bindings of `Bindable` classes to SSAValue* while verifying conformable types;
     c. traverse the `Expr` and emit the MLIR.
  This is used on a concrete example to implement MemRef load/store with clipping in the LowerVectorTransfer pass. More specifically, the following pseudo-C++ code:
  ```c++
  MLFuncBuilder *b = ...;
  Location location = ...;
  Bindable zero, one, expr, size;
  // EDSL expression
  auto access = select(expr < zero, zero, select(expr < size, expr, size - one));
  auto ssaValue = MLIREmitter(b)
      .bind(zero, ...)
      .bind(one, ...)
      .bind(expr, ...)
      .bind(size, ...)
      .emit(location, access);
  ```
  is used to emit all the MLIR for a clipped MemRef access.
  This simple EDSL can easily be extended to more powerful patterns and should serve as the counterpart to pattern matchers (and could potentially be unified once we get enough experience).
  In the future, most of this code should be TableGen'd, but for now it has concrete valuable uses: make MLIR programmable in a declarative fashion.
  This CL also adds Stmt, proper supporting free functions, and rewrites VectorTransferLowering fully using EDSCs.
  The code for creating the EDSCs emitting a VectorTransferReadOp as loops with clipped loads is:
  ```c++
  Stmt block = Block({
    tmpAlloc = alloc(tmpMemRefType),
    vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
    ForNest(ivs, lbs, ubs, steps, {
      scalarValue = load(scalarMemRef, accessInfo.clippedScalarAccessExprs),
      store(scalarValue, tmpAlloc, accessInfo.tmpAccessExprs),
    }),
    vectorValue = load(vectorView, zero),
    tmpDealloc = dealloc(tmpAlloc.getLHS())});
  emitter.emitStmt(block);
  ```
  where `accessInfo.clippedScalarAccessExprs` is created with:
  ```c++
  select(i + ii < zero, zero, select(i + ii < N, i + ii, N - one));
  ```
  The generated MLIR resembles:
  ```mlir
  %1 = dim %0, 0 : memref<?x?x?x?xf32>
  %2 = dim %0, 1 : memref<?x?x?x?xf32>
  %3 = dim %0, 2 : memref<?x?x?x?xf32>
  %4 = dim %0, 3 : memref<?x?x?x?xf32>
  %5 = alloc() : memref<5x4x3xf32>
  %6 = vector_type_cast %5 : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
  for %i4 = 0 to 3 {
    for %i5 = 0 to 4 {
      for %i6 = 0 to 5 {
        %7 = affine_apply #map0(%i0, %i4)
        %8 = cmpi "slt", %7, %c0 : index
        %9 = affine_apply #map0(%i0, %i4)
        %10 = cmpi "slt", %9, %1 : index
        %11 = affine_apply #map0(%i0, %i4)
        %12 = affine_apply #map1(%1, %c1)
        %13 = select %10, %11, %12 : index
        %14 = select %8, %c0, %13 : index
        %15 = affine_apply #map0(%i3, %i6)
        %16 = cmpi "slt", %15, %c0 : index
        %17 = affine_apply #map0(%i3, %i6)
        %18 = cmpi "slt", %17, %4 : index
        %19 = affine_apply #map0(%i3, %i6)
        %20 = affine_apply #map1(%4, %c1)
        %21 = select %18, %19, %20 : index
        %22 = select %16, %c0, %21 : index
        %23 = load %0[%14, %i1, %i2, %22] : memref<?x?x?x?xf32>
        store %23, %5[%i6, %i5, %i4] : memref<5x4x3xf32>
      }
    }
  }
  %24 = load %6[%c0] : memref<1xvector<5x4x3xf32>>
  dealloc %5 : memref<5x4x3xf32>
  ```
  In particular, notice that only 3 out of the 4-d accesses are clipped: this indeed corresponds to the number of dimensions in the super-vector.
  This CL also addresses the cleanups resulting from the review of the previous CL and performs some refactoring to simplify the abstraction.
  PiperOrigin-RevId: 227367414
* Have the asmprinter take advantage of the new capabilities of the asmparser, by printing the entry block in a CFG function's argument line. (Chris Lattner, 2019-03-29, 1 file, -1/+1)
  Since I'm touching all of the testcases anyway, change the argument list from printing as "%arg : type" to "%arg: type", which is more consistent with bb arguments. In addition to being more consistent, this is a much nicer look for CFG functions.
  PiperOrigin-RevId: 227240069