| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
This CL removes a flakyness associated to a spurious iteration on DenseMap
iterators when normalizing AffineMap.
PiperOrigin-RevId: 228160074
|
|
|
|
| |
PiperOrigin-RevId: 227938032
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- this is CL 1/2 that does a clean up and gets rid of one limitation in an
underlying method - as a result, fusion works for more cases.
- fix bugs/incomplete impl. in toAffineMapFromEq
- fusing across rank changing reshapes for example now just works
For eg. given a rank 1 memref to rank 2 memref reshape (64 -> 8 x 8) like this,
-loop-fusion -memref-dataflow-opt now completely fuses and inlines/store-forward
to get rid of the temporary:
INPUT
// Rank 1 -> Rank 2 reshape
for %i0 = 0 to 64 {
%v = load %A[%i0]
store %v, %B[%i0 floordiv 8, i0 mod 8]
}
for %i1 = 0 to 8
for %i2 = 0 to 8
%w = load %B[%i1, i2]
"foo"(%w) : (f32) -> ()
OUTPUT
$ mlir-opt -loop-fusion -memref-dataflow-opt fuse_reshape.mlir
#map0 = (d0, d1) -> (d0 * 8 + d1)
mlfunc @fuse_reshape(%arg0: memref<64xf32>) {
for %i0 = 0 to 8 {
for %i1 = 0 to 8 {
%0 = affine_apply #map0(%i0, %i1)
%1 = load %arg0[%0] : memref<64xf32>
"foo"(%1) : (f32) -> ()
}
}
}
AFAIK, there is no polyhedral tool / compiler that can perform such fusion -
because it's not really standard loop fusion, but possible through a
generalized slicing-based approach such as ours.
PiperOrigin-RevId: 227918338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Supervectorization does not plan on handling multi-result AffineMaps and
non-canonical chains of > 1 AffineApplyOp.
This CL introduces a simpler abstraction and composition of single-result
unbounded AffineApplyOp by using the existing unbound AffineMap composition.
This CL adds a simple API call and relevant tests:
```c++
OpPointer<AffineApplyOp> makeNormalizedAffineApply(
FuncBuilder *b, Location loc, AffineMap map, ArrayRef<Value*> operands);
```
which creates a single-result unbounded AffineApplyOp.
The operands of AffineApplyOp are not themselves results of AffineApplyOp by
consrtuction.
This represent the simplest possible interface to complement the composition
of (mathematical) AffineMap, for the cases when we are interested in applying
it to Value*.
In this CL the composed AffineMap is not compressed (i.e. there exist operands
that are not part of the result). A followup commit will compress to normal
form.
The single-result unbounded AffineApplyOp abstraction will be used in a
followup CL to support the MaterializeVectors pass.
PiperOrigin-RevId: 227879021
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The strict requirement (i.e. at least 2 HW vectors in a super-vector) was a
premature optimization to avoid interfering with other vector code potentially
introduced via other means.
This CL avoids this premature optimization and the spurious errors it causes
when super-vector size == HW vector size (which is a possible corner case).
This may be revisited in the future.
PiperOrigin-RevId: 227763966
|
|
|
|
|
|
|
| |
The omission of an early exit created opportunities for unitialized memory
reads. This CL fixes the issue.
PiperOrigin-RevId: 227761814
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The entire compiler now looks at structural properties of the function (e.g.
does it have one block, does it contain an if/for stmt, etc) so the only thing
holding up this difference is round tripping through the parser/printer syntax.
Removing this shrinks the compile by ~140LOC.
This is step 31/n towards merging instructions and statements. The last step
is updating the docs, which I will do as a separate patch in order to split it
from this mostly mechanical patch.
PiperOrigin-RevId: 227540453
|
|
|
|
|
|
|
|
|
|
| |
runOnCFG/MLFunction override locations. Passes that care can handle this
filtering if they choose. Also, eliminate one needless difference between
CFG/ML functions in the parser.
This is step 30/n towards merging instructions and statements.
PiperOrigin-RevId: 227515912
|
|
|
|
|
|
|
|
| |
- drop these ununsed/incomplete sketches given the new design
@albertcohen is working on, and given that FlatAffineConstraints is now
stable and fast enough for all the analyses/transforms that depend on it.
PiperOrigin-RevId: 227322739
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- introduce PostDominanceInfo in the right/complete way and use that for post
dominance check in store-load forwarding
- replace all uses of Analysis/Utils::dominates/properlyDominates with
DominanceInfo::dominates/properlyDominates
- drop all redundant copies of dominance methods in Analysis/Utils/
- in pipeline-data-transfer, replace dominates call with a much less expensive
check; similarly, substitute dominates() in checkMemRefAccessDependence with
a simpler check suitable for that context
- fix a bug in properlyDominates
- improve doc for 'for' instruction 'body'
PiperOrigin-RevId: 227320507
|
|
|
|
|
|
|
|
|
| |
- dominates() for blocks was assuming that there was only a single block at the
top level whenever there was a hierarchy of blocks (as in the case of 'for'/'if'
instructions).
- fix the comments as well
PiperOrigin-RevId: 227319738
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function pass, and eliminating the need to copy over code and do
interprocedural updates. While here, also improve it to make fewer empty
blocks, and rename it to "LowerIfAndFor" since that is what it does. This is
a net reduction of ~170 lines of code.
As drive-bys, change the splitBlock method to *not* insert an unconditional
branch, since that behavior is annoying for all clients. Also improve the
AsmPrinter to not crash when a block is referenced that isn't linked into a
function.
PiperOrigin-RevId: 227308856
|
|
|
|
|
|
|
| |
PrintOpStatsPass is maintaining state (op stats ) across functions and doing
per-module work - it should be a module pass.
PiperOrigin-RevId: 227294151
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- the load/store forwarding relies on memref dependence routines as well as
SSA/dominance to identify the memref store instance uniquely supplying a value
to a memref load, and replaces the result of that load with the value being
stored. The memref is also deleted when possible if only stores remain.
- add methods for post dominance for MLFunction blocks.
- remove duplicated getLoopDepth/getNestingDepth - move getNestingDepth,
getMemRefAccess, getNumCommonSurroundingLoops into Analysis/Utils (were
earlier static)
- add a helper method in FlatAffineConstraints - isRangeOneToOne.
PiperOrigin-RevId: 227252907
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
better order.
- update isEmpty() to eliminate IDs in a better order. Speed improvement for
complex cases (for eg. high-d reshape's involving mod's/div's).
- minor efficiency update to projectOut (was earlier making an extra albeit
benign call to gaussianEliminateIds) (NFC).
- move getBestIdToEliminate further up in the file (NFC).
- add the failing test case.
- add debug info to checkMemRefAccessDependence.
PiperOrigin-RevId: 227244634
|
|
|
|
|
|
|
|
|
|
|
| |
Function::walk functionality into f->walkInsts/Ops which allows visiting all
instructions, not just ops. Eliminate Function::getBody() and
Function::getReturn() helpers which crash in CFG functions, and were only kept
around as a bridge.
This is step 25/n towards merging instructions and statements.
PiperOrigin-RevId: 227243966
|
|
|
|
|
|
|
|
|
|
| |
requires enhancing DominanceInfo to handle the structure of an ML function,
which is required anyway. Along the way, this also fixes a const correctness
problem with Instruction::getBlock().
This is step 24/n towards merging instructions and statements.
PiperOrigin-RevId: 227228900
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the function signature, giving them common functionality to ml functions. This
is a strictly additive patch that adds new capability without changing behavior
in a significant way (other than a few diagnostic cleanups). A subsequent
patch will change the printer to use this behavior, which will require updating
a ton of testcases. :)
This exposes the fact that we need to make a grammar change for block
arguments, as is tracked by b/122119779
This is step 23/n towards merging instructions and statements, and one of the
first steps towards eliminating the "cfg vs ml" distinction at a syntax and
semantic level.
PiperOrigin-RevId: 227228342
|
|
|
|
| |
PiperOrigin-RevId: 227196077
|
|
|
|
|
|
|
|
|
| |
consistent and moving the using declarations over. Hopefully this is the last
truly massive patch in this refactoring.
This is step 21/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227178245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- extend/complete dependence tester to utilize local var info while adding
access function equality constraints; one more step closer to get slicing
based fusion working in the general case of affine_apply's involving mod's/div's.
- update test case to reflect more accurate dependence information; remove
inaccurate comment on test case mod_deps.
- fix a minor "bug" in equality addition in addMemRefAccessConstraints (doesn't
affect correctness, but the fixed version is more intuitive).
- some more surrounding code clean up
- move simplifyAffineExpr out of anonymous AffineExprFlattener class - the
latter has state, and the former should reside outside.
PiperOrigin-RevId: 227175600
|
|
|
|
|
|
|
|
|
|
|
| |
did not make an effort to rename all of the 'bb' names in the codebase, since they are still correct and any specific missed once can be fixed up on demand.
The last major renaming is Statement -> Instruction, which is why Statement and
Stmt still appears in various places.
This is step 19/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227163082
|
|
|
|
|
|
|
|
| |
Function.
This is step 18/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227139399
|
|
|
|
|
|
|
|
| |
StmtResult -> InstResult, StmtOperand -> InstOperand, and remove the old names.
This is step 17/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227121537
|
|
|
|
|
|
|
|
| |
OperationInst. This is a big mechanical patch.
This is step 16/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227093712
|
|
|
|
|
|
|
| |
- inconsistent local var constraint size when repeatedly using the same
flattener for all expressions in a map.
PiperOrigin-RevId: 227067836
|
|
|
|
|
|
|
|
| |
FuncBuilder class. Also rename SSAValue.cpp to Value.cpp
This is step 12/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227067644
|
|
|
|
|
|
|
|
|
| |
is the new base of the SSA value hierarchy. This CL also standardizes all the
nomenclature and comments to use 'Value' where appropriate. This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing.
This is step 11/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227064624
|
|
|
|
|
|
|
|
|
|
|
|
| |
ExtFunction classes, using the Statement/StmtBlock hierarchy and Function instead.
This *only* changes the internal data structures, it does not affect the user visible syntax or structure of MLIR code. Function gets new "isCFG()" sorts of predicates as a transitional measure.
This patch is gross in a number of ways, largely in an effort to reduce the amount of mechanical churn in one go. It introduces a bunch of using decls to keep the old names alive for now, and a bunch of stuff needs to be renamed.
This is step 10/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 227044402
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Existing implementation of isContiguousAccess asserts that one of the
function arguments is within certain range, depending on another parameter.
However, the value of this argument may come from outside, in particular in the
loop vectorization pass it may come from command line arguments. This leads
to 'mlir-opt' crashing on an assertion depending on flags. Handle the error
gracefully by reporting error returning a negative result instead. This
negative result prevents any further transformation by the vectorizer so the IR
remains valid.
PiperOrigin-RevId: 227029496
|
|
|
|
|
|
|
|
| |
Move PrintOpStatsPass out of tools and to other passes (moved to Analysis as it
doesn't modify the program but it is different than the other analysis passes
as it is only consumer at present is the user).
PiperOrigin-RevId: 227018996
|
|
|
|
|
|
|
|
|
|
|
|
| |
making it more similar to the CFG side of things. It is true that in a deeply
nested case that this is not a guaranteed O(1) time operation, and that 'get'
could lead compiler hackers to think this is cheap, but we need to merge these
and we can look into solutions for this in the future if it becomes a problem
in practice.
This is step 9/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 226983931
|
|
|
|
|
|
|
|
|
|
| |
from it. This is necessary progress to squaring away the parent relationship
that a StmtBlock has with its enclosing if/for/fn, and makes room for functions
to have more than one block in the future. This also removes IfClause and ForStmtBody.
This is step 5/n towards merging instructions and statements, NFC.
PiperOrigin-RevId: 226936541
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
StmtBlock. This is more consistent with IfStmt and also conceptually makes
more sense - a forstmt "isn't" its body, it contains its body.
This is step 1/N towards merging BasicBlock and StmtBlock. This is required
because in the new regime StmtBlock will have a use list (just like BasicBlock
does) of operands, and ForStmt already has a use list for its induction
variable.
This is a mechanical patch, NFC.
PiperOrigin-RevId: 226684158
|
|
|
|
|
|
|
|
| |
which specify the source loop nest depth at which to perform iteration space slicing, and the destination loop nest depth at which to insert the compution slice.
Updates LoopFusion pass to take these parameters as command line flags for experimentation.
PiperOrigin-RevId: 226514297
|
|
|
|
|
|
| |
equality constraints (working on test cases).
PiperOrigin-RevId: 226399089
|
|
|
|
|
|
|
|
| |
constraints used to calculate computation slice for loop fusion.
This done so that the dominance check between ancestors of op statements from src/dst memref accesses will be run.
PiperOrigin-RevId: 226350443
|
|
|
|
|
|
| |
tests cases.
PiperOrigin-RevId: 226277453
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reuse existing ones.
- drop IterationDomainContext, redundant since FlatAffineConstraints has
MLValue information associated with its dimensions.
- refactor to use existing support
- leads to a reduction in LOC
- as a result of these changes, non-constant loop bounds get naturally
supported for dep analysis.
- update test cases to include a couple with non-constant loop bounds
- rename addBoundsFromForStmt -> addForStmtDomain
- complete TODO for getLoopIVs (handle 'if' statements)
PiperOrigin-RevId: 226082008
|
|
|
|
|
|
|
|
|
|
| |
- when adding constraints from a 'for' stmt into FlatAffineConstraints,
correctly add bound operands of the 'for' stmt as a dimensional identifier or
a symbolic identifier depending on whether the bound operand is a valid
MLFunction symbol
- update test case to exercise this.
PiperOrigin-RevId: 225988511
|
|
|
|
|
|
|
|
|
|
|
| |
addDomainConstraints; add support for mod/div for dependence testing.
- add support for mod/div expressions in dependence analysis
- refactor addMemRefAccessConstraints to use getFlattenedAffineExprs (instead
of getFlattenedAffineExpr); update addDomainConstraints.
- rename AffineExprFlattener::cst -> localVarCst
PiperOrigin-RevId: 225933306
|
|
|
|
|
|
| |
memref-dep-check / getIterationDomainContext
PiperOrigin-RevId: 225857762
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As MLIR moves towards dialect-specific types, a generic Type::getBitWidth does
not make sense for all of them. Even with the current type system, the bit
width is not defined (and causes the method in question to abort) for all
TensorFlow types.
This commit restricts the bit width definition to primitive standard types that
have a number of bits appearing verbatim in their type, i.e., integers and
floats. As a side effect, it delegates the decision on the bit width of the
`index` to the backends. Existing backends currently hardcode it to 64 bits.
The Type::getBitWidth method is replaced by Type::getIntOrFloatBitWidth that
only applies to integers and floats. The call sites are updated to use the new
method, where applicable, or rewritten so as not rely on it. Incidentally,
this fixes a utility method that did not account for memrefs being allowed to
have vectors as element types in the size computation.
As an observation, several places in the code use Type in places where a more
specific type could be used instead. Some of those are fixed by this commit.
PiperOrigin-RevId: 225844792
|
|
|
|
|
|
|
|
|
|
|
|
| |
fusion based on slicing; encompasses standard loop fusion.
*) Adds simple greedy fusion algorithm to drive experimentation. This algorithm greedily fuses loop nests with single-writer/single-reader memref dependences to improve locality.
*) Adds support for fusing slices of a loop nest computation: fusing one loop nest into another by adjusting the source loop nest's iteration bounds (after it is fused into the destination loop nest). This is accomplished by solving for the source loop nest's IVs in terms of the destination loop nests IVs and symbols using the dependece polyhedron, then creating AffineMaps of these functions for the loop bounds of the fused source loop.
*) Adds utility function 'insertMemRefComputationSlice' which computes and inserts computation slice from loop nest surrounding a source memref access into the loop nest surrounding the destingation memref access.
*) Adds FlatAffineConstraints::toAffineMap function which returns and AffineMap which represents an equality contraint where one dimension identifier is represented as a function of all others in the equality constraint.
*) Adds multiple fusion unit tests.
PiperOrigin-RevId: 225842944
|
|
|
|
|
|
|
|
|
| |
- use addBoundsForForStmt
- getLoopIVs can return a vector of ForStmt * instead of const ForStmt *; the
returned things aren't owned / part of the stmt on which it's being called.
- other minor API cleanup
PiperOrigin-RevId: 225774301
|
|
|
|
|
|
|
|
| |
- extend memref-bound-check to store op's
- make the bound check an analysis util and move to lib/Analysis/Utils.cpp (so that
one doesn't need to always create a pass to use it)
PiperOrigin-RevId: 225564830
|
|
|
|
|
|
|
|
|
|
|
| |
From the beginning, vector_transfer_read and vector_transfer_write opreations
were intended as a mid-level vectorization abstraction. In particular, they
are lowered to the StandardOps dialect before further processing. As such, it
does not make sense to keep them at the same level as StandardOps. Introduce
the new SuperVectorOps dialect and move vector_transfer_* operations there.
This will be used as a testbed for the generic lowering/legalization pass.
PiperOrigin-RevId: 225554492
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- if a local id was already for a specific mod/div expression, just reuse it if
the expression repeats (instead of adding a new one).
- drastically reduces the number of local variables added during flattening for
real use cases - since the same div's and mod expressions often repeat.
- add getFlattenedAffineExprs for AffineMap, IntegerSet based on the above
As a natural result of the above:
- FlatAffineConstraints(IntegerSet) ctor now deals with integer sets that have mod
and div constraints as well, and these get simplified as well from -simplify-affine-structures
PiperOrigin-RevId: 225452174
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
trivially redundant constraints. Update projectOut to eliminate identifiers in
a more efficient order. Fix b/120801118.
- add method to remove duplicate / trivially redundant constraints from
FlatAffineConstraints (use a hashing-based approach with DenseSet)
- update projectOut to eliminate identifiers in a more efficient order
(A sequence of affine_apply's like this (from a real use case) finally exposed
the lack of the above trivial/low hanging simplifications).
for %ii = 0 to 64 {
for %jj = 0 to 9 {
%a0 = affine_apply (d0, d1) -> (d0 * (9 * 1024) + d1 * 128) (%ii, %jj)
%a1 = affine_apply (d0) ->
(d0 floordiv (2 * 3 * 3 * 128 * 128),
(d0 mod 294912) floordiv (3 * 3 * 128 * 128),
(((d0 mod 294912) mod 147456) floordiv 1152) floordiv 8,
(((d0 mod 294912) mod 147456) mod 1152) floordiv 384,
((((d0 mod 294912) mod 147456) mod 1152) mod 384) floordiv 128,
(((((d0 mod 294912) mod 147456) mod 1152) mod 384) mod 128)
floordiv 128) (%a0)
%v0 = load %in[%a1tensorflow/mlir#0, %a1tensorflow/mlir#1, %a1tensorflow/mlir#3, %a1tensorflow/mlir#4, %a1tensorflow/mlir#2, %a1tensorflow/mlir#5]
: memref<2x2x3x3x16x1xi32>
}
}
- update FlatAffineConstraints::print to print number of constraints.
PiperOrigin-RevId: 225397480
|
|
|
|
|
|
|
|
|
|
|
| |
- getDimensionBounds() was added initially for quick experimentation - no
longer used (getConstantBoundOnDimSize is the more powerful/complete
replacement).
- FlatAffineConstraints::getConstantLower/UpperBound are incomplete,
functionality/naming-wise misleading, and not used currently. Removing these;
complete/fixed version will be added in an upcoming CL.
PiperOrigin-RevId: 225075061
|