path: root/mlir/lib/Transforms/LoopUnroll.cpp
Commit message | Author | Age | Files | Lines
...
* Introduce memref bound checking. | Uday Bondhugula | 2019-03-29 | 1 | -1/+1
    Introduce analysis to check memref accesses (in MLFunctions) for out of bound
    ones. It works as follows:

    $ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir

    /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension #1
          %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32>
               ^
    /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension #1
          %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32>
               ^
    /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension #2
          %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32>
               ^
    /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension #2
          %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32>
               ^
    /tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension #1
          %y = load %B[%idy] : memref<128 x i32>
               ^
    /tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension #1
          %y = load %B[%idy] : memref<128 x i32>
               ^

    #map0 = (d0, d1) -> (d0, d1)
    #map1 = (d0, d1) -> (d0 * 128 - d1)
    mlfunc @test() {
      %0 = alloc() : memref<9x9xi32>
      %1 = alloc() : memref<128xi32>
      for %i0 = -1 to 9 {
        for %i1 = -1 to 9 {
          %2 = affine_apply #map0(%i0, %i1)
          %3 = load %0[%2#0, %2#1] : memref<9x9xi32>
          %4 = affine_apply #map1(%i0, %i1)
          %5 = load %1[%4] : memref<128xi32>
        }
      }
      return
    }

    - Improves productivity while manually / semi-automatically developing MLIR for
      testing / prototyping; also provides an indirect way to catch errors in
      transformations.
    - This pass is an easy way to test the underlying affine analysis machinery
      including low level routines.

    Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256.

    While on this:
    - create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/
    - fix a bug in AffineAnalysis.cpp::toAffineExpr

    TODO: extend to non-constant loop bounds (straightforward). Will transparently
    work for all accesses once floordiv, mod, ceildiv are supported in the
    AffineMap -> FlatAffineConstraints conversion.

    PiperOrigin-RevId: 219397961
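The idea behind the bound check can be illustrated with a much simpler stand-in than the pass itself. The sketch below is not the pass's algorithm (the commit relies on the affine analysis machinery and FlatAffineConstraints); it only evaluates a linear index expression over constant loop ranges by interval arithmetic. The names Range, LinearExpr, evaluate and checkAccess are hypothetical, and loop upper bounds are assumed to be inclusive, as the sample output above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive range of values taken by one loop induction variable
// (assumption: upper bounds are inclusive, as in "for %i0 = -1 to 9").
struct Range {
  int64_t lo;
  int64_t hi;
};

// One result of an affine map restricted to the linear case:
// constant + sum(coeffs[i] * d_i).
struct LinearExpr {
  std::vector<int64_t> coeffs;
  int64_t constant;
};

struct BoundViolation {
  bool belowLower;  // some index value is < 0
  bool aboveUpper;  // some index value is >= dimSize
};

// Interval evaluation: a non-negative coefficient reaches its minimum at the
// low end of the range, a negative coefficient at the high end.
Range evaluate(const LinearExpr &expr, const std::vector<Range> &ivs) {
  assert(expr.coeffs.size() == ivs.size());
  Range r{expr.constant, expr.constant};
  for (std::size_t i = 0; i < ivs.size(); ++i) {
    int64_t c = expr.coeffs[i];
    r.lo += c * (c >= 0 ? ivs[i].lo : ivs[i].hi);
    r.hi += c * (c >= 0 ? ivs[i].hi : ivs[i].lo);
  }
  return r;
}

// Compare the reachable index interval with the valid range [0, dimSize).
BoundViolation checkAccess(const LinearExpr &index,
                           const std::vector<Range> &ivs, int64_t dimSize) {
  Range r = evaluate(index, ivs);
  return {r.lo < 0, r.hi >= dimSize};
}
```

Under these assumptions, the access through #map1 (d0 * 128 - d1) with both induction variables in [-1, 9] spans [-137, 1153], so both the lower and the upper bound of memref<128xi32> are reported as violated.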
* Simplify FunctionPass to eliminate the CFGFunctionPass/MLFunctionPass | Chris Lattner | 2019-03-29 | 1 | -2/+3
    distinction. FunctionPasses can now choose to get called on all functions, or
    have the driver split CFG/ML Functions up for them. NFC.

    PiperOrigin-RevId: 218775885
* Change typedef to using to be consistent across the codebase | Lei Zhang | 2019-03-29 | 1 | -1/+1
    Google C++ style guide also prefers using to typedef.

    PiperOrigin-RevId: 218541849
* Split BuiltinOps out of StandardOps. | Jacques Pienaar | 2019-03-29 | 1 | -1/+1
    * Move Return, Constant and AffineApply out into BuiltinOps;
    * BuiltinOps are always registered, while StandardOps follow the same dynamic
      registration;
    * Kept isValidX in MLValue as we don't have a verify on AffineMap so need to
      keep it callable from Parser (I wanted to move it to be called in verify
      instead);

    PiperOrigin-RevId: 216592527
* [MLIR] AffineMap value type | Nicolas Vasilache | 2019-03-29 | 1 | -6/+6
    This CL applies the same pattern as AffineExpr to AffineMap: a simple struct
    that acts as the storage is allocated in the bump pointer. The AffineMap is
    immutable and accessed everywhere by value.

    PiperOrigin-RevId: 216445930
* [MLIR] Cleanup AffineExpr | Nicolas Vasilache | 2019-03-29 | 1 | -1/+1
    This CL introduces a series of cleanups for AffineExpr value types:
    1. to make it clear that the value types should be used, the pointer
       AffineExpr types are put in the detail namespace. Unfortunately, since the
       value type operator-> only forwards to the underlying pointer type, we
       still need to expose this in the include file for now;
    2. AffineExprKind is ok to use, it thus comes out of detail and thus of
       AffineExpr
    3. getAffineDimExpr, getAffineSymbolExpr, getAffineConstantExpr are similarly
       extracted as free functions and their naming is made consistent across
       Builder, MLContext and AffineExpr
    4. AffineBinaryOpEx::simplify functions are made into static free functions.
       In particular it is moved away from AffineMap.cpp where it does not belong
    5. operator AffineExprType is made explicit
    6. uses the binary operators everywhere possible
    7. drops the pointer usage everywhere outside of AffineExpr.cpp,
       MLIRContext.cpp and AsmPrinter.cpp

    PiperOrigin-RevId: 216207212
* [MLIR] Remove uses of AffineExpr* outside of IR | Nicolas Vasilache | 2019-03-29 | 1 | -3/+2
    This CL uniformizes the uses of AffineExprWrap outside of IR. The public API
    of AffineExpr builder is modified to only use AffineExprWrap. A few places
    access AffineExprWrap.expr, this is only while the API is in transition to
    easily keep track (i.e. make expr private and let the compiler track the
    errors). Parser.cpp exhibits patterns that are dependent on nullptr values so
    converting it is left for another CL.

    PiperOrigin-RevId: 215642005
* Change behavior of loopUnrollFull with unroll factor 1 | Uday Bondhugula | 2019-03-29 | 1 | -3/+8
    Using loopUnrollFull with unroll factor 1 should promote the loop body as
    opposed to doing nothing.

    PiperOrigin-RevId: 214812126
* Extend loop unroll/unroll-and-jam to affine bounds + refactor related code. | Uday Bondhugula | 2019-03-29 | 1 | -81/+33
    - extend loop unroll-jam similar to loop unroll for affine bounds
    - extend both loop unroll/unroll-jam to deal with cleanup loop for non
      multiple of unroll factor.
    - extend promotion of single iteration loops to work with affine bounds
    - fix typo bugs in loop unroll
    - refactor common code b/w loop unroll and loop unroll-jam
    - move prototypes of non-pass transforms to LoopUtils.h
    - add additional builder methods.
    - introduce loopUnrollUpTo(factor) to unroll by either factor or trip count,
      whichever is less.
    - remove Statement::isInnermost (not used for now - will come back at the
      right place/in right form later)

    PiperOrigin-RevId: 213471227
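The "factor or trip count, whichever is less" rule of loopUnrollUpTo, together with the cleanup loop for trip counts that are not a multiple of the factor, can be sketched as a small planning function. This is an illustration only: UnrollPlan and planUnrollUpTo are hypothetical names, and the real loopUnrollUpTo operates on ForStmt, not on bare integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Result of planning an "unroll up to factor" on a loop with a known trip
// count. All names here are illustrative, not taken from LoopUtils.h.
struct UnrollPlan {
  uint64_t factor;         // factor actually applied
  uint64_t unrolledIters;  // iterations left in the unrolled loop
  uint64_t cleanupIters;   // iterations handled by the cleanup loop
};

UnrollPlan planUnrollUpTo(uint64_t tripCount, uint64_t requestedFactor) {
  assert(requestedFactor > 0 && "unroll factor must be positive");
  if (tripCount == 0)
    return {1, 0, 0};  // empty loop: nothing to unroll
  // Unroll by the requested factor or the trip count, whichever is less.
  uint64_t factor = std::min(requestedFactor, tripCount);
  return {factor, tripCount / factor, tripCount % factor};
}
```

When the trip count is smaller than the requested factor, the plan ends up with a single unrolled iteration and no cleanup, which is exactly the case where a single-iteration loop can be promoted into its parent.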
* Store 'then' clause statements directly in the 'if' statement. | Tatiana Shpeisman | 2019-03-29 | 1 | -2/+3
    Also a few minor changes.

    PiperOrigin-RevId: 213359024
* Misc changes to builder's and Transforms/ API to allow code generation. | Uday Bondhugula | 2019-03-29 | 1 | -6/+3
    - add builder method for ReturnOp
    - expose API from Transforms/ to work on specific ML statements (do this for
      LoopUnroll, LoopUnrollAndJam)
    - add MLFuncBuilder::getForStmtBodyBuilder, ::getBlock

    PiperOrigin-RevId: 213074178
* Add PassResult and have passes return PassResult to indicate failure/success. | Jacques Pienaar | 2019-03-29 | 1 | -3/+4
    For FunctionPass's for passes that want to stop upon error encountered.

    PiperOrigin-RevId: 213058651
* Extend getConstantTripCount to deal with a larger subset of loop bounds; | Uday Bondhugula | 2019-03-29 | 1 | -30/+108
    make loop unroll/unroll-and-jam more powerful; add additional affine expr
    builder methods

    - use previously added analysis/simplification to infer multiple of unroll
      factor trip counts, making loop unroll/unroll-and-jam more general.
    - for loop unroll, support bounds that are single result affine map's with
      the same set of operands. For unknown loop bounds, loop unroll will now
      work as long as trip count can be determined to be a multiple of unroll
      factor.
    - extend getConstantTripCount to deal with single result affine map's with
      the same operands. move it to mlir/Analysis/LoopAnalysis.cpp
    - add additional builder utility methods for affine expr arithmetic
      (difference, mod/floordiv/ceildiv w.r.t positive constant). simplify code
      to use the utility methods.
    - move affine analysis routines to AffineAnalysis.cpp/.h from
      AffineStructures.cpp/.h.
    - Rename LoopUnrollJam to LoopUnrollAndJam to match class name.
    - add an additional simplification for simplifyFloorDiv, simplifyCeilDiv
    - Rename AffineMap::getNumOperands() getNumInputs: an affine map by itself
      does not have operands. Operands are passed to it through affine_apply,
      from loop bounds/if condition's, etc., operands are stored in the latter.

    This should be sufficiently powerful for now as far as unroll/unroll-and-jam
    go for TPU code generation, and can move to other analyses/transformations.

    Loop nests like these are now unrolled without any cleanup loop being
    generated.

      for %i = 1 to 100 {
        // unroll factor 4: no cleanup loop will be generated.
        for %j = (d0) -> (d0) (%i) to (d0) -> (5*d0 + 3) (%i) {
          %x = "foo"(%j) : (affineint) -> i32
        }
      }

      for %i = 1 to 100 {
        // unroll factor 4: no cleanup loop will be generated.
        for %j = (d0) -> (d0) (%i) to (d0) -> (d0 - d mod 4 - 1) (%i) {
          %y = "foo"(%j) : (affineint) -> i32
        }
      }

      for %i = 1 to 100 {
        for %j = (d0) -> (d0) (%i) to (d0) -> (d0 + 128) (%i) {
          %x = "foo"() : () -> i32
        }
      }

    TODO(bondhugula): extend this to LoopUnrollAndJam as well in the next CL
    (with minor changes).

    PiperOrigin-RevId: 212661212
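Why a loop with bounds (d0) -> (d0) and (d0) -> (5*d0 + 3) needs no cleanup loop at unroll factor 4 can be worked out by hand. The sketch below is a deliberately reduced model, not the code in LoopAnalysis.cpp: each bound is a single-result map over one shared operand, written as coeff * d0 + constant, the step is 1, and the upper bound is assumed to be inclusive (which is what the examples in this log suggest). Bound, tripCountExpr, constantTripCount and tripCountIsMultipleOf are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// A single-result affine bound over one shared operand: coeff * d0 + constant.
struct Bound {
  int64_t coeff;
  int64_t constant;
};

// Trip count as a linear expression in d0, assuming step 1 and an inclusive
// upper bound: (ub - lb) + 1.
Bound tripCountExpr(Bound lb, Bound ub) {
  return {ub.coeff - lb.coeff, ub.constant - lb.constant + 1};
}

// Returns true and sets `out` when the trip count does not depend on d0.
bool constantTripCount(Bound lb, Bound ub, int64_t &out) {
  Bound t = tripCountExpr(lb, ub);
  if (t.coeff != 0)
    return false;
  out = t.constant > 0 ? t.constant : 0;
  return true;
}

// Sufficient (not necessary) condition for the trip count to be a multiple of
// `factor` for every value of d0: both the coefficient and the constant term
// of the trip count expression are divisible by it.
bool tripCountIsMultipleOf(Bound lb, Bound ub, int64_t factor) {
  assert(factor > 0);
  Bound t = tripCountExpr(lb, ub);
  return t.coeff % factor == 0 && t.constant % factor == 0;
}
```

For the first nest above, the trip count expression is 4*d0 + 4, which is divisible by 4 for any d0 even though it is not constant. For the third nest the d0 terms cancel, so the trip count is a constant.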
* Add utility to promote single iteration loops. Add methods for getting constant | Uday Bondhugula | 2019-03-29 | 1 | -55/+24
    loop counts. Improve / refactor loop unroll / loop unroll and jam.

    - add utility to remove single iteration loops.
    - use this utility to promote single iteration loops after
      unroll/unroll-and-jam
    - use loopUnrollByFactor for loopUnrollFull and remove most of the latter.
    - add methods for getting constant loop trip count

    PiperOrigin-RevId: 212039569
* Introduce loop unroll jam transformation. | Uday Bondhugula | 2019-03-29 | 1 | -6/+6
    - for test purposes, the unroll-jam pass unroll jams the first outermost loop.

    While on this:
    - fix StmtVisitor to allow overriding of function to iterate walk over
      children of a stmt.

    PiperOrigin-RevId: 210644813
* Implement operands for the lower and upper bounds of the for statement. | Tatiana Shpeisman | 2019-03-29 | 1 | -13/+20
    This revamps implementation of the loop bounds in the ForStmt, using general
    representation that supports operands. The frequent case of constant bounds
    is supported via special access methods.

    This also includes:
    - Operand iterators for the Statement class.
    - OpPointer::is() method to query the class of the Operation.
    - Support for the bound shorthand notation parsing and printing.
    - Validity checks for the bound operands used as dim ids and symbols

    I didn't mean this CL to be so large. It just happened this way, as one thing
    led to another.

    PiperOrigin-RevId: 210204858
* Push location information more tightly into the IR, providing space for every | Chris Lattner | 2019-03-29 | 1 | -2/+4
    operation and statement to have a location, and make it so a location is
    required to be specified whenever you make one (though a null location is
    still allowed). This is to encourage compiler authors to propagate loc info
    properly, allowing our failability story to work well.

    This is still a WIP - it isn't clear if we want to continue abusing Attribute
    for location information, or whether we should introduce a new class
    hierarchy to do so. This is good step along the way, and unblocks some of the
    tf/xla work that builds upon it.

    PiperOrigin-RevId: 210001406
* Extend loop unrolling to unroll by a given factor; add builder for affine | Uday Bondhugula | 2019-03-29 | 1 | -32/+136
    apply op.

    - add builder for AffineApplyOp (first one for an operation that has non-zero
      operands)
    - add support for loop unrolling by a given factor; uses the affine apply op
      builder.

    While on this, change 'step' of ForStmt to be 'unsigned' instead of
    AffineConstantExpr *. Add setters for ForStmt lb, ub, step.

    Sample Input:

    // CHECK-LABEL: mlfunc @loop_nest_unroll_cleanup() {
    mlfunc @loop_nest_unroll_cleanup() {
      for %i = 1 to 100 {
        for %j = 0 to 17 {
          %x = "addi32"(%j, %j) : (affineint, affineint) -> i32
          %y = "addi32"(%x, %x) : (i32, i32) -> i32
        }
      }
      return
    }

    Output:

    $ mlir-opt -loop-unroll -unroll-factor=4 /tmp/single2.mlir
    #map0 = (d0) -> (d0 + 1)
    #map1 = (d0) -> (d0 + 2)
    #map2 = (d0) -> (d0 + 3)
    mlfunc @loop_nest_unroll_cleanup() {
      for %i0 = 1 to 100 {
        for %i1 = 0 to 17 step 4 {
          %0 = "addi32"(%i1, %i1) : (affineint, affineint) -> i32
          %1 = "addi32"(%0, %0) : (i32, i32) -> i32
          %2 = affine_apply #map0(%i1)
          %3 = "addi32"(%2, %2) : (affineint, affineint) -> i32
          %4 = affine_apply #map1(%i1)
          %5 = "addi32"(%4, %4) : (affineint, affineint) -> i32
          %6 = affine_apply #map2(%i1)
          %7 = "addi32"(%6, %6) : (affineint, affineint) -> i32
        }
        for %i2 = 16 to 17 {
          %8 = "addi32"(%i2, %i2) : (affineint, affineint) -> i32
          %9 = "addi32"(%8, %8) : (i32, i32) -> i32
        }
      }
      return
    }

    PiperOrigin-RevId: 209676220
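The arithmetic behind the sample output (step 4 on the unrolled loop, offsets +1, +2, +3 for the replicated bodies, and a cleanup loop starting at 16) can be reproduced with a short sketch. It assumes constant bounds with an inclusive upper bound, as the "0 to 17" example with its two-iteration cleanup loop suggests. UnrollSplit, splitForUnroll and copyOffsets are hypothetical names and do not correspond to functions in LoopUnroll.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// How a constant-bound loop is divided between the unrolled loop and the
// cleanup loop. Illustrative only.
struct UnrollSplit {
  int64_t tripCount;         // iterations of the original loop
  int64_t unrolledStep;      // step of the unrolled loop
  int64_t cleanupLb;         // first IV value handled by the cleanup loop
  int64_t cleanupTripCount;  // iterations left for the cleanup loop
};

// Assumes an inclusive upper bound and a positive step.
UnrollSplit splitForUnroll(int64_t lb, int64_t ub, int64_t step,
                           int64_t factor) {
  assert(step > 0 && factor > 0);
  int64_t trip = ub < lb ? 0 : (ub - lb) / step + 1;
  int64_t cleanupTrip = trip % factor;
  // The cleanup loop starts right after the last fully unrolled group.
  int64_t cleanupLb = lb + (trip - cleanupTrip) * step;
  return {trip, step * factor, cleanupLb, cleanupTrip};
}

// IV offsets of the body copies inside one unrolled iteration. Copy 0 uses
// the IV directly; copy k would use an affine_apply of (d0) -> (d0 + k*step).
std::vector<int64_t> copyOffsets(int64_t step, int64_t factor) {
  std::vector<int64_t> offsets;
  for (int64_t k = 0; k < factor; ++k)
    offsets.push_back(k * step);
  return offsets;
}
```

For lb = 0, ub = 17, step 1 and factor 4 this gives a trip count of 18, an unrolled step of 4, and a cleanup loop of 2 iterations starting at 16, which matches the "for %i2 = 16 to 17" loop in the output above.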
* ShortLoopUnroll - bug fix. | Uday Bondhugula | 2019-03-29 | 1 | -3/+7
    Collect loops through a post order walk instead of a pre-order so that inner
    loops are collected before the outer loops surrounding them. Add a complex
    test case.

    PiperOrigin-RevId: 209041057
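The effect of switching to a post order walk can be seen on a toy loop tree. The sketch below is a stand-in for the statement walker, not the ShortLoopUnroll code: Loop and collectShortLoops are hypothetical, and trip counts are stored directly on the nodes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy loop-nest node: a name, a constant trip count, and nested loops.
struct Loop {
  std::string name;
  uint64_t tripCount;
  std::vector<Loop> children;
};

// Post order walk: visit all nested loops first, then the loop itself, so
// every inner loop is recorded before the loops that surround it.
void collectShortLoops(const Loop &loop, uint64_t maxTripCount,
                       std::vector<std::string> &out) {
  for (const Loop &child : loop.children)
    collectShortLoops(child, maxTripCount, out);
  if (loop.tripCount <= maxTripCount)
    out.push_back(loop.name);
}
```

Processing the collected loops in this order means an outer loop is handled only after the loops nested inside it have already been handled.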
* Move Pass.{h,cpp} from lib/IR/ to lib/Transforms/. | Uday Bondhugula | 2019-03-29 | 1 | -1/+1
    PiperOrigin-RevId: 208571437
* Rework the cloning infrastructure for statements to be able to take and update | Chris Lattner | 2019-03-29 | 1 | -91/+22
    an operand mapping, which simplifies it a bit. Implement cloning for IfStmt,
    rename getThenClause() to getThen() which is unambiguous and less repetitive
    in use cases.

    PiperOrigin-RevId: 207915990
* Loop unrolling pass update | Uday Bondhugula | 2019-03-29 | 1 | -69/+116
    - fix/complete forStmt cloning for unrolling to work for outer loops
    - create IV const's only when needed
    - test outer loop unrolling by creating a short trip count unroll pass for
      loops with trip counts <= <parameter>
    - add unrolling test cases for multiple op results, outer loop unrolling
    - fix/clean up StmtWalker class while on this
    - switch unroll loop iterator values from i32 to affineint

    PiperOrigin-RevId: 207645967
* Loop unrolling update. | Uday Bondhugula | 2019-03-29 | 1 | -32/+66
    - deal with non-operation stmt's (if/for stmt's) in loops being unrolled
      (unrolling of non-innermost loops works).
    - update uses in unrolled bodies to use results of new operations that may be
      introduced in the unrolled bodies.

    Unrolling now works for all kinds of loop nests - perfect nests, imperfect
    nests, loops at any depth, and with any kind of operation in the body.
    (IfStmt support not done, hence untested there).

    Added missing dump/print method for StmtBlock.

    TODO: add test case for outer loop unrolling.

    PiperOrigin-RevId: 207314286
* MLStmt cloning and IV replacement for loop unrolling, add constant pool to | Uday Bondhugula | 2019-03-29 | 1 | -19/+47
    MLFunctions.

    - MLStmt cloning and IV replacement
    - While at this, fix the innermostLoopGatherer to actually gather all the
      innermost loops (it was stopping its walk at the first innermost loop it
      found)
    - Improve comments for MLFunction statement classes, fix inheritance order.
    - Fixed StmtBlock destructor.

    PiperOrigin-RevId: 207049173
* Clean up and extend MLFuncBuilder to allow creating statements in the middle | Tatiana Shpeisman | 2019-03-29 | 1 | -3/+1
    of a statement block. Rename Statement::getFunction() and
    StmtBlock()::getFunction() to findFunction() to make it clear that this is
    not a constant time getter. Fix b/112039912 - we were recording 'i' instead
    of '%i' for loop induction variables causing "use of undefined SSA value"
    error.

    PiperOrigin-RevId: 206884644
* LoopUnroll post order walk: fix misleading naming | Uday Bondhugula | 2019-03-29 | 1 | -12/+13
    PiperOrigin-RevId: 206609084
* Prepare for implementation of TensorFlow passes: | Chris Lattner | 2019-03-29 | 1 | -23/+9
    - Sketch out a TensorFlow/IR directory that will hold op definitions and
      common TF support logic. We will eventually have TensorFlow/TF2HLO,
      TensorFlow/Grappler, TensorFlow/TFLite, etc.
    - Add sketches of a Switch/Merge op definition, including some missing stuff
      like the TwoResults trait. Add a skeleton of a pass to raise this form.
    - Beef up the Pass/FunctionPass definitions slightly, moving the common code
      out of LoopUnroll.cpp into a new IR/Pass.cpp file.
    - Switch ConvertToCFG.cpp to be a ModulePass.
    - Allow _ to start bare identifiers, since this is important for TF
      attributes.

    PiperOrigin-RevId: 206502517
* Stmt visitors and walkers. | Uday Bondhugula | 2019-03-29 | 1 | -8/+40
    - Update InnermostLoopGatherer to use a post order traversal (linear
      time/single traversal).
    - Drop getNumNestedLoops().
    - Update isInnermost() to use the StmtWalker.

    When using return values in conjunction with walkers, the StmtWalker CRTP
    pattern doesn't appear to be of any use. It just requires overriding nearly
    all of the methods, which is what InnermostLoopGatherer currently does.
    Please see FIXME/ENLIGHTENME comments. TODO: figure this out from this CL
    discussion.

    Note
    - Comments on visitor/walker base class are out of date; will update when
      this CL is finalized.

    PiperOrigin-RevId: 206340901
* Implement MLValue, statement operands, operation statement operands and | Tatiana Shpeisman | 2019-03-29 | 1 | -1/+3
    values. ML functions now have full support for expressing operations.
    Induction variables, function arguments and return values are still todo.

    PiperOrigin-RevId: 206253643
* Implement a proper function list in module, which auto-maintain the parent | Chris Lattner | 2019-03-29 | 1 | -2/+2
    pointer, and ensure that functions are deleted when the module is destroyed.

    This exposed the fact that MLFunction had no dtor, and that the dtor in
    CFGFunction was broken with cyclic references. Fix both of these problems.

    PiperOrigin-RevId: 206051666
* Sketch out loop unrolling transformation. | Uday Bondhugula | 2019-03-29 | 1 | -0/+109
    - Implement a full loop unroll for innermost loops.
    - Use it to implement a pass that unroll all the innermost loops of all
      mlfunction's in a module. ForStmt's parsed currently have constant trip
      counts (and constant loop bounds).
    - Implement StmtVisitor based (Visitor pattern)

    Loop IVs aren't currently parsed and represented as SSA values. Replacing
    uses of loop IVs in unrolled bodies is thus a TODO. Class comments are sparse
    at some places - will add them after one round of comments.

    A cmd-line flag triggers this for now.

    Original:

    mlfunc @loops() {
      for x = 1 to 100 step 2 {
        for x = 1 to 4 {
          "Const"(){value: 1} : () -> ()
        }
      }
      return
    }

    After unrolling:

    mlfunc @loops() {
      for x = 1 to 100 step 2 {
        "Const"(){value: 1} : () -> ()
        "Const"(){value: 1} : () -> ()
        "Const"(){value: 1} : () -> ()
        "Const"(){value: 1} : () -> ()
      }
      return
    }

    PiperOrigin-RevId: 205933235
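A full unroll of innermost loops, in the spirit of the before/after example above, can be sketched on a toy statement tree. This is not the original LoopUnroll.cpp: Stmt, makeOp, makeLoop, isInnermost and unrollInnermost are hypothetical, bounds are constant with an inclusive upper bound, and, like the first version of the pass, the sketch does not rewrite uses of the induction variable.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy statement: either an operation (identified by name) or a for loop with
// constant, inclusive bounds and a nested body.
struct Stmt {
  bool isLoop = false;
  std::string op;  // meaningful only when !isLoop
  int64_t lb = 0, ub = 0, step = 1;
  std::vector<Stmt> body;
};

Stmt makeOp(std::string name) {
  Stmt s;
  s.op = std::move(name);
  return s;
}

Stmt makeLoop(int64_t lb, int64_t ub, int64_t step, std::vector<Stmt> body) {
  assert(step > 0);
  Stmt s;
  s.isLoop = true;
  s.lb = lb;
  s.ub = ub;
  s.step = step;
  s.body = std::move(body);
  return s;
}

// A loop is innermost when its body contains no other loop.
bool isInnermost(const Stmt &loop) {
  for (const Stmt &s : loop.body)
    if (s.isLoop)
      return false;
  return true;
}

// Replace every innermost loop in `block` (at any depth) by trip-count copies
// of its body. Loops that only become innermost as a result are left alone.
void unrollInnermost(std::vector<Stmt> &block) {
  std::vector<Stmt> result;
  for (Stmt &s : block) {
    if (s.isLoop && isInnermost(s)) {
      int64_t trip = s.ub < s.lb ? 0 : (s.ub - s.lb) / s.step + 1;
      for (int64_t i = 0; i < trip; ++i)
        result.insert(result.end(), s.body.begin(), s.body.end());
      continue;
    }
    if (s.isLoop)
      unrollInnermost(s.body);
    result.push_back(std::move(s));
  }
  block = std::move(result);
}
```

Applied to the nest in the example (an outer loop "1 to 100 step 2" around an inner loop "1 to 4" with a single "Const" operation), the outer loop is kept and its body becomes four copies of the operation.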