| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
The is the second step to adding a namespace to the AffineOps dialect.
PiperOrigin-RevId: 232717775
|
|
|
|
|
|
| |
namespace to the affine dialect.
PiperOrigin-RevId: 232707862
|
|
|
|
|
|
| |
Function/Block/Instruction.
PiperOrigin-RevId: 232388113
|
|
|
|
| |
PiperOrigin-RevId: 232323671
|
|
|
|
|
|
| |
the Instruction variants.
PiperOrigin-RevId: 232322030
|
|
|
|
|
|
| |
mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body.
PiperOrigin-RevId: 232060516
|
|
|
|
|
|
| |
works with it, and updating the g3docs.
PiperOrigin-RevId: 231120927
|
|
|
|
|
|
| |
instead of the ForInst itself. This is a necessary step in converting ForInst into an operation.
PiperOrigin-RevId: 231064139
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Addresses b/122486036
This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing
the Null method and cleaning up the constructors.
As the ::Null uses were tracked down, opportunities appeared to untangle some
of the Parsing logic and make it explicit where AffineMap/IntegerSet have
ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit
pointer values of AffineMap* and IntegerSet* that were passed as function
parameters. Depending the values of those pointers one of 3 behaviors could
occur.
This parsing logic convolution is one of the rare cases where I would advocate
for code duplication. The more proper fix would be to make the syntax
unambiguous or to allow some lookahead.
PiperOrigin-RevId: 231058512
|
|
|
|
|
|
|
|
|
|
| |
- Update createAffineComputationSlice to generate a sequence of single result
affine apply ops instead of one multi-result affine apply
- update pipeline-data-transfer test case; while on this, also update the test
case to use only single result affine maps, and make it more robust to
change.
PiperOrigin-RevId: 230965478
|
|
|
|
| |
PiperOrigin-RevId: 230605756
|
|
|
|
|
|
| |
- readability changes
PiperOrigin-RevId: 229443430
|
|
|
|
|
|
|
|
|
| |
- the double buffer should be indexed (iv floordiv step) % 2 and NOT (iv % 2);
step wasn't being accounted for.
- fix test cases, enable failing test cases
PiperOrigin-RevId: 228635726
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
simplifying them in minor ways. The only significant cleanup here
is the constant folding pass. All the other changes are simple and easy,
but this is still enough to shrink the compiler by 45LOC.
The one pass left to merge is the CSE pass, which will be move involved, so I'm
splitting it out to its own patch (which I'll tackle right after this).
This is step 28/n towards merging instructions and statements.
PiperOrigin-RevId: 227328115
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- introduce PostDominanceInfo in the right/complete way and use that for post
dominance check in store-load forwarding
- replace all uses of Analysis/Utils::dominates/properlyDominates with
DominanceInfo::dominates/properlyDominates
- drop all redundant copies of dominance methods in Analysis/Utils/
- in pipeline-data-transfer, replace dominates call with a much less expensive
check; similarly, substitute dominates() in checkMemRefAccessDependence with
a simpler check suitable for that context
- fix a bug in properlyDominates
- improve doc for 'for' instruction 'body'
PiperOrigin-RevId: 227320507
|
|
|
|
|
|
|
|
|
| |
consistent and moving the using declarations over. Hopefully this is the last
truly massive patch in this refactoring.
This is step 21/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227178245
|
|
|
|
|
|
|
|
|
|
|
| |
did not make an effort to rename all of the 'bb' names in the codebase, since they are still correct and any specific missed once can be fixed up on demand.
The last major renaming is Statement -> Instruction, which is why Statement and
Stmt still appears in various places.
This is step 19/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227163082
|
|
|
|
|
|
|
|
| |
Function.
This is step 18/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227139399
|
|
|
|
|
|
|
|
| |
StmtResult -> InstResult, StmtOperand -> InstOperand, and remove the old names.
This is step 17/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227121537
|
|
|
|
|
|
|
|
| |
OperationInst. This is a big mechanical patch.
This is step 16/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227093712
|
|
|
|
|
|
|
|
| |
FuncBuilder class. Also rename SSAValue.cpp to Value.cpp
This is step 12/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227067644
|
|
|
|
|
|
|
|
|
| |
is the new base of the SSA value hierarchy. This CL also standardizes all the
nomenclature and comments to use 'Value' where appropriate. This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing.
This is step 11/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227064624
|
|
|
|
| |
PiperOrigin-RevId: 226939383
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
StmtBlock. This is more consistent with IfStmt and also conceptually makes
more sense - a forstmt "isn't" its body, it contains its body.
This is step 1/N towards merging BasicBlock and StmtBlock. This is required
because in the new regime StmtBlock will have a use list (just like BasicBlock
does) of operands, and ForStmt already has a use list for its induction
variable.
This is a mechanical patch, NFC.
PiperOrigin-RevId: 226684158
|
|
|
|
|
|
|
|
|
|
|
| |
- loop step wasn't handled and there wasn't a TODO or an assertion; fix this.
- rename 'delay' to shift for consistency/readability.
- other readability changes.
- remove duplicate attribute print for DmaStartOp; fix misplaced attribute
print for DmaWaitOp
- add build method for AddFOp (unrelated to this CL, but add it anyway)
PiperOrigin-RevId: 224892958
|
|
|
|
|
|
|
|
|
|
| |
- adding a conservative check for now (TODO: use the dependence analysis pass
once the latter is extended to deal with DMA ops). resolve an existing bug on
a test case.
- update test cases
PiperOrigin-RevId: 224869526
|
|
|
|
|
|
|
|
|
|
|
| |
- fix replaceAllMemRefUsesWith call to replace only inside loop body.
- handle the case where DMA buffers are dynamic; extend doubleBuffer() method
to handle dynamically shaped DMA buffers (pass the right operands to AllocOp)
- place alloc's for DMA buffers at the depth at which pipelining is being done
(instead of at top-level)
- add more test cases
PiperOrigin-RevId: 224852231
|
|
|
|
|
|
|
|
| |
buffering in the auto DMA overlap pass.
This is done online in the pass.
PiperOrigin-RevId: 222313640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and getMemRefRegion() to work with specified loop depths; add support for
outgoing DMAs, store op's.
- add support for getMemRefRegion symbolic in outer loops - hence support for
DMAs symbolic in outer surrounding loops.
- add DMA generation support for outgoing DMAs (store op's to lower memory
space); extend getMemoryRegion to store op's. -memref-bound-check now works
with store op's as well.
- fix dma-generate (references to the old memref in the dma_start op were also
being replaced with the new buffer); we need replace all memref uses to work
only on a subset of the uses - add a new optional argument for
replaceAllMemRefUsesWith. update replaceAllMemRefUsesWith to take an optional
'operation' argument to serve as a filter - if provided, only those uses that
are dominated by the filter are replaced.
- Add missing print for attributes for dma_start, dma_wait op's.
- update the FlatAffineConstraints API
PiperOrigin-RevId: 221889223
|
|
|
|
|
|
| |
The passID is not currently stored in Pass but this avoids the unused variable warning. The passID is used to uniquely identify passes, currently this is only stored/used in PassInfo.
PiperOrigin-RevId: 220485662
|
|
|
|
|
|
|
|
| |
Add static pass registration and change mlir-opt to use it. Future work is needed to refactor the registration for PassManager usage.
Change build targets to alwayslink to enforce registration.
PiperOrigin-RevId: 220390178
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce analysis to check memref accesses (in MLFunctions) for out of bound
ones. It works as follows:
$ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir
/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#2
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#2
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
%y = load %B[%idy] : memref<128 x i32>
^
/tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
%y = load %B[%idy] : memref<128 x i32>
^
#map0 = (d0, d1) -> (d0, d1)
#map1 = (d0, d1) -> (d0 * 128 - d1)
mlfunc @test() {
%0 = alloc() : memref<9x9xi32>
%1 = alloc() : memref<128xi32>
for %i0 = -1 to 9 {
for %i1 = -1 to 9 {
%2 = affine_apply #map0(%i0, %i1)
%3 = load %0[%2tensorflow/mlir#0, %2tensorflow/mlir#1] : memref<9x9xi32>
%4 = affine_apply #map1(%i0, %i1)
%5 = load %1[%4] : memref<128xi32>
}
}
return
}
- Improves productivity while manually / semi-automatically developing MLIR for
testing / prototyping; also provides an indirect way to catch errors in
transformations.
- This pass is an easy way to test the underlying affine analysis
machinery including low level routines.
Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256.
While on this:
- create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/
- fix a bug in AffineAnalysis.cpp::toAffineExpr
TODO: extend to non-constant loop bounds (straightforward). Will transparently
work for all accesses once floordiv, mod, ceildiv are supported in the
AffineMap -> FlatAffineConstraints conversion.
PiperOrigin-RevId: 219397961
|
|
|
|
|
|
| |
This is done by changing Type to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast.
PiperOrigin-RevId: 219372163
|
|
|
|
|
|
|
| |
distinction. FunctionPasses can now choose to get called on all functions, or
have the driver split CFG/ML Functions up for them. NFC.
PiperOrigin-RevId: 218775885
|
|
|
|
|
|
| |
- return success as long as IR is in a valid state.
PiperOrigin-RevId: 218225317
|
|
|
|
|
|
|
| |
the pattern matcher / canonicalizer, and rename existing eraseFromBlock methods
to align with it.
PiperOrigin-RevId: 218104455
|
|
|
|
|
|
|
|
|
| |
Also rename Operation::is to Operation::isa
Introduce Operation::cast
All of these are for consistency with global dyn_cast/cast/isa operators.
PiperOrigin-RevId: 217878786
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
resolve
multiple TODOs.
- replace the fake test pass (that worked on just the first loop in the
MLFunction) to perform DMA pipelining on all suitable loops.
- nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops)
- fix bugs / assumptions: correctly copy memory space and elemental type of source
memref for double buffering.
- correctly identify matching start/finish statements, handle multiple DMAs per
loop.
- introduce dominates/properlyDominates utitilies for MLFunction statements.
- move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it
getShiftValidity
- refactor getContainingStmtPos -> findAncestorStmtInBlock - move into
Analysis/Utils.h; has two users.
- other improvements / cleanup for related API/utilities
- add size argument to dma_wait - for nested DMAs or in general, it makes it
easy to obtain the size to use when lowering the dma_wait since we wouldn't
want to identify the matching dma_start, and more importantly, in general/in the
future, there may not always be a dma_start dominating the dma_wait.
- add debug information in the pass
PiperOrigin-RevId: 217734892
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- add util to create a private / exclusive / single use affine
computation slice for an op stmt (see method doc comment); a single
multi-result affine_apply op is prepended to the op stmt to provide all
results needed for its operands as a function of loop iterators and symbols.
- use it for DMA pipelining (to create private slices for DMA start stmt's);
resolve TODOs/feature request (b/117159533)
- move createComposedAffineApplyOp to Transforms/Utils; free it from taking a
memref as input / generalize it.
PiperOrigin-RevId: 216926818
|
|
|
|
|
|
|
|
| |
* Move Return, Constant and AffineApply out into BuiltinOps;
* BuiltinOps are always registered, while StandardOps follow the same dynamic registration;
* Kept isValidX in MLValue as we don't have a verify on AffineMap so need to keep it callable from Parser (I wanted to move it to be called in verify instead);
PiperOrigin-RevId: 216592527
|
|
|
|
|
|
|
|
| |
This CL applies the same pattern as AffineExpr to AffineMap: a simple struct
that acts as the storage is allocated in the bump pointer. The AffineMap is
immutable and accessed everywhere by value.
PiperOrigin-RevId: 216445930
|
|
|
|
|
|
|
|
|
|
|
| |
Add target independent standard DMA ops: dma.start, dma.wait. Update pipeline
data transfer to use these to detect DMA ops.
While on this
- return failure from mlir-opt::performActions if a pass generates invalid output
- improve error message for verify 'n' operand traits
PiperOrigin-RevId: 216429885
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL introduces a series of cleanups for AffineExpr value types:
1. to make it clear that the value types should be used, the pointer
AffineExpr types are put in the detail namespace. Unfortunately, since the
value type operator-> only forwards to the underlying pointer type, we
still
need to expose this in the include file for now;
2. AffineExprKind is ok to use, it thus comes out of detail and thus of
AffineExpr
3. getAffineDimExpr, getAffineSymbolExpr, getAffineConstantExpr are
similarly
extracted as free functions and their naming is mande consistent across
Builder, MLContext and AffineExpr
4. AffineBinaryOpEx::simplify functions are made into static free
functions.
In particular it is moved away from AffineMap.cpp where it does not belong
5. operator AffineExprType is made explicit
6. uses the binary operators everywhere possible
7. drops the pointer usage everywhere outside of AffineExpr.cpp,
MLIRContext.cpp and AsmPrinter.cpp
PiperOrigin-RevId: 216207212
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with a new one (of a potentially different rank/shape) with an optional index
remapping.
- introduce Utils::replaceAllMemRefUsesWith
- use this for DMA double buffering
(This CL also adds a few temporary utilities / code that will be done away with
once:
1) abstract DMA op's are added
2) memref deferencing side-effect / trait is available on op's
3) b/117159533 is resolved (memref index computation slices).
PiperOrigin-RevId: 215831373
|
|
- loopBodySkew shifts statements of a loop body by stmt-wise delays, and is
typically meant to be used to:
- allow overlap of non-blocking start/wait until completion operations with
other computation
- allow shifting of statements (for better register
reuse/locality/parallelism)
- software pipelining (when applied to the innermost loop)
- an additional argument specifies whether to unroll the prologue and epilogue.
- add method to check SSA dominance preservation.
- add a fake loop pipeline pass to test this utility.
Sample input/output are below. While on this, fix/add following:
- fix minor bug in getAddMulPureAffineExpr
- add additional builder methods for common affine map cases
- fix const_operand_iterator's for ForStmt, etc. When there is no such thing
as 'const MLValue', the iterator shouldn't be returning const MLValue's.
Returning MLValue is const correct.
Sample input/output examples:
1) Simplest case: shift second statement by one.
Input:
for %i = 0 to 7 {
%y = "foo"(%i) : (affineint) -> affineint
%x = "bar"(%i) : (affineint) -> affineint
}
Output:
#map0 = (d0) -> (d0 - 1)
mlfunc @loop_nest_simple1() {
%c8 = constant 8 : affineint
%c0 = constant 0 : affineint
%0 = "foo"(%c0) : (affineint) -> affineint
for %i0 = 1 to 7 {
%1 = "foo"(%i0) : (affineint) -> affineint
%2 = affine_apply #map0(%i0)
%3 = "bar"(%2) : (affineint) -> affineint
}
%4 = affine_apply #map0(%c8)
%5 = "bar"(%4) : (affineint) -> affineint
return
}
2) DMA overlap: shift dma.wait and compute by one.
Input
for %i = 0 to 7 {
%pingpong = affine_apply (d0) -> (d0 mod 2) (%i)
"dma.enqueue"(%pingpong) : (affineint) -> affineint
%pongping = affine_apply (d0) -> (d0 mod 2) (%i)
"dma.wait"(%pongping) : (affineint) -> affineint
"compute1"(%pongping) : (affineint) -> affineint
}
Output
#map0 = (d0) -> (d0 mod 2)
#map1 = (d0) -> (d0 - 1)
#map2 = ()[s0] -> (s0 + 7)
mlfunc @loop_nest_dma() {
%c8 = constant 8 : affineint
%c0 = constant 0 : affineint
%0 = affine_apply #map0(%c0)
%1 = "dma.enqueue"(%0) : (affineint) -> affineint
for %i0 = 1 to 7 {
%2 = affine_apply #map0(%i0)
%3 = "dma.enqueue"(%2) : (affineint) -> affineint
%4 = affine_apply #map1(%i0)
%5 = affine_apply #map0(%4)
%6 = "dma.wait"(%5) : (affineint) -> affineint
%7 = "compute1"(%5) : (affineint) -> affineint
}
%8 = affine_apply #map1(%c8)
%9 = affine_apply #map0(%8)
%10 = "dma.wait"(%9) : (affineint) -> affineint
%11 = "compute1"(%9) : (affineint) -> affineint
return
}
3) With arbitrary affine bound maps:
Shift last two statements by two.
Input:
for %i = %N to ()[s0] -> (s0 + 7)()[%N] {
%y = "foo"(%i) : (affineint) -> affineint
%x = "bar"(%i) : (affineint) -> affineint
%z = "foo_bar"(%i) : (affineint) -> (affineint)
"bar_foo"(%i) : (affineint) -> (affineint)
}
Output
#map0 = ()[s0] -> (s0 + 1)
#map1 = ()[s0] -> (s0 + 2)
#map2 = ()[s0] -> (s0 + 7)
#map3 = (d0) -> (d0 - 2)
#map4 = ()[s0] -> (s0 + 8)
#map5 = ()[s0] -> (s0 + 9)
for %i0 = %arg0 to #map0()[%arg0] {
%0 = "foo"(%i0) : (affineint) -> affineint
%1 = "bar"(%i0) : (affineint) -> affineint
}
for %i1 = #map1()[%arg0] to #map2()[%arg0] {
%2 = "foo"(%i1) : (affineint) -> affineint
%3 = "bar"(%i1) : (affineint) -> affineint
%4 = affine_apply #map3(%i1)
%5 = "foo_bar"(%4) : (affineint) -> affineint
%6 = "bar_foo"(%4) : (affineint) -> affineint
}
for %i2 = #map4()[%arg0] to #map5()[%arg0] {
%7 = affine_apply #map3(%i2)
%8 = "foo_bar"(%7) : (affineint) -> affineint
%9 = "bar_foo"(%7) : (affineint) -> affineint
}
4) Shift one by zero, second by one, third by two
for %i = 0 to 7 {
%y = "foo"(%i) : (affineint) -> affineint
%x = "bar"(%i) : (affineint) -> affineint
%z = "foobar"(%i) : (affineint) -> affineint
}
#map0 = (d0) -> (d0 - 1)
#map1 = (d0) -> (d0 - 2)
#map2 = ()[s0] -> (s0 + 7)
%c9 = constant 9 : affineint
%c8 = constant 8 : affineint
%c1 = constant 1 : affineint
%c0 = constant 0 : affineint
%0 = "foo"(%c0) : (affineint) -> affineint
%1 = "foo"(%c1) : (affineint) -> affineint
%2 = affine_apply #map0(%c1)
%3 = "bar"(%2) : (affineint) -> affineint
for %i0 = 2 to 7 {
%4 = "foo"(%i0) : (affineint) -> affineint
%5 = affine_apply #map0(%i0)
%6 = "bar"(%5) : (affineint) -> affineint
%7 = affine_apply #map1(%i0)
%8 = "foobar"(%7) : (affineint) -> affineint
}
%9 = affine_apply #map0(%c8)
%10 = "bar"(%9) : (affineint) -> affineint
%11 = affine_apply #map1(%c8)
%12 = "foobar"(%11) : (affineint) -> affineint
%13 = affine_apply #map1(%c9)
%14 = "foobar"(%13) : (affineint) -> affineint
5) SSA dominance violated; no shifting if a shift is specified for the second
statement.
for %i = 0 to 7 {
%x = "foo"(%i) : (affineint) -> affineint
"bar"(%x) : (affineint) -> affineint
}
PiperOrigin-RevId: 214975731
|