bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Add op stats pass to mlir-opt.	Jacques Pienaar	2019-03-29	1	-1/+1
\| \| \| \| \| \|	op-stats pass currently returns the number of occurrences of different operations in a Module. Useful for verifying transformation properties (e.g., 3 ops of specific dialect, 0 of another), but probably not useful outside of that so keeping it local to mlir-opt. This does not consider op attributes when counting. PiperOrigin-RevId: 222259727
*	[MLIR][VectorAnalysis] Add a VectorAnalysis and standalone tests	Nicolas Vasilache	2019-03-29	2	-0/+117
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL adds some vector support in prevision of the upcoming vector materialization pass. In particular this CL adds 2 functions to: 1. compute the multiplicity of a subvector shape in a supervector shape; 2. help match operations on strict super-vectors. This is defined for a given subvector shape as an operation that manipulates a vector type that is an integral multiple of the subtype, with multiplicity at least 2. This CL also adds a TestUtil pass where we can dump arbitrary testing of functions and analysis that operate at a much smaller granularity than a pass (e.g. an analysis for which it is convenient to write a bit of artificial MLIR and write some custom test). This is in order to keep using Filecheck for things that essentially look and feel like C++ unit tests. PiperOrigin-RevId: 222250910
*	Updates to transformation/analysis passes/utilities. Update DMA generation pass	Uday Bondhugula	2019-03-29	4	-114/+242
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and getMemRefRegion() to work with specified loop depths; add support for outgoing DMAs, store op's. - add support for getMemRefRegion symbolic in outer loops - hence support for DMAs symbolic in outer surrounding loops. - add DMA generation support for outgoing DMAs (store op's to lower memory space); extend getMemoryRegion to store op's. -memref-bound-check now works with store op's as well. - fix dma-generate (references to the old memref in the dma_start op were also being replaced with the new buffer); we need replace all memref uses to work only on a subset of the uses - add a new optional argument for replaceAllMemRefUsesWith. update replaceAllMemRefUsesWith to take an optional 'operation' argument to serve as a filter - if provided, only those uses that are dominated by the filter are replaced. - Add missing print for attributes for dma_start, dma_wait op's. - update the FlatAffineConstraints API PiperOrigin-RevId: 221889223
*	[MLIR] Rename OperationInst to Instruction.	River Riddle	2019-03-29	1	-2/+2
\| \| \| \|	PiperOrigin-RevId: 221795407
*	Replace TerminatorInst with builtin terminator operations.	River Riddle	2019-03-29	1	-29/+13
\| \| \| \| \|	Note: Terminators will be merged into the operations list in a follow up patch. PiperOrigin-RevId: 221670037
*	ConvertToCFG: properly remap nested function attributes.	Alex Zinenko	2019-03-29	2	-34/+78
\| \| \| \| \| \| \| \| \| \| \| \|	Array attributes can nested and function attributes can appear anywhere at that level. They should be remapped to point to the generated CFGFunction after ML-to-CFG conversion, similarly to plain function attributes. Extract the nested attribute remapping functionality from the Parser to Utils. Extract out the remapping function for individual Functions from the module remapping function. Use these new functions in the ML-to-CFG conversion pass and in the parser. PiperOrigin-RevId: 221510997
*	Move definitions of lopoUnroll* functions to LoopUtils.cpp.	Alex Zinenko	2019-03-29	2	-117/+117
\| \| \| \| \| \| \| \| \| \| \|	These functions are declared in Transforms/LoopUtils.h (included to the Transforms/Utils library) but were defined in the loop unrolling pass in Transforms/LoopUnroll.cpp. As a result, targets depending only on TransformUtils library but not on Transforms could get link errors. Move the definitions to Transforms/Utils/LoopUtils.cpp where they should actually live. This does not modify any code. PiperOrigin-RevId: 221508882
*	[MLIR] Support for vectorizing operations.	Nicolas Vasilache	2019-03-29	1	-255/+383
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL adds support for and a vectorization test to perform scalar 2-D addf. The support extension notably comprises: 1. extend vectorizable test to exclude vector_transfer operations and expose them to LoopAnalysis where they are needed. This is a temporary solution a concrete MLIR Op exists; 2. add some more functional sugar mapKeys, apply and ScopeGuard (which became relevant again); 3. fix improper shifting during coarsening; 4. rename unaligned load/store to vector_transfer_read/write and simplify the design removing the unnecessary AllocOp that were introduced prematurely: vector_transfer_read currently has the form: (memref<?x?x?xf32>, index, index, index) -> vector<32x64x256xf32> vector_transfer_write currently has the form: (vector<32x64x256xf32>, memref<?x?x?xf32>, index, index, index) -> () 5. adds vectorizeOperations which traverses the operations in a ForStmt and rewrites them to their vector form; 6. add support for vector splat from a constant. The relevant tests are also updated. PiperOrigin-RevId: 221421426
*	Basic conversion of MLFunctions to CFGFunctions.	Alex Zinenko	2019-03-29	1	-36/+322
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a pass converting a subset of MLFunctions to CFGFunctions. Currently supports arbitrarily complex imperfect loop nests with statically constant (i.e., not affine map) bounds filled with operations. Does NOT support branches and non-constant loop bounds. Conversion is performed per-function and the function names are preserved to avoid breaking any external references to the current module. In-memory IR is updated to point to the right functions in direct calls and constant loads. This behavior is tested via a really hidden flag that enables function renaming. Inside each function, the control flow conversion is based on single-entry single-exit regions, i.e. subgraphs of the CFG that have exactly one incoming and exactly one outgoing edge. Since an MLFunction must have a single "return" statement as per MLIR spec, it constitutes an SESE region. Individual operations are appended to this region. Control flow statements are recursively converted into such regions that are concatenated with the current region. Bodies of the compound statement also form SESE regions, which allows to nest control flow statements easily. Note that SESE regions are not materialized in the code. It is sufficent to keep track of the end of the region as the current instruction insertion point as long as all recursive calls update the insertion point in the end. The converter maintains a mapping between SSA values in ML functions and their CFG counterparts. The mapping is used to find the operands for each operation and is updated to contain the results of each operation as the conversion continues. PiperOrigin-RevId: 221162602
*	Switch IntegerAttr to use APInt.	Jacques Pienaar	2019-03-29	1	-4/+4
\| \| \| \| \| \| \| \|	Change the storage type to APInt from int64_t for IntegerAttr (following the change to APFloat storage in FloatAttr). Effectively a direct change from int64_t to 64-bit APInt throughout (the bitwidth hardcoded). This change also adds a getInt convenience method to IntegerAttr and replaces previous getValue calls with getInt calls. While this changes updates the storage type, it does not update all constant folding calls. PiperOrigin-RevId: 221082788
*	Adds support for returning the direction of the dependence between memref ↵	MLIR Team	2019-03-29	1	-2/+6
\| \| \| \| \| \| \| \| \| \|	accesses (distance/direction vectors). Updates MemRefDependenceCheck to check and report on all memref access pairs at all loop nest depths. Updates old and adds new memref dependence check tests. Resolves multiple TODOs. PiperOrigin-RevId: 220816515
*	Automatic DMA generation for simple cases.	Uday Bondhugula	2019-03-29	2	-1/+224
\| \| \| \| \| \| \| \| \| \| \| \| \|	- constant bounded memory regions, static shapes, no handling of overlapping/duplicate regions (through union) for now; also only, load memory op's. - add build methods for DmaStartOp, DmaWaitOp. - move getMemoryRegion() into Analysis/Utils and expose it. - fix addIndexSet, getMemoryRegion() post switch to exclusive upper bounds; update test cases for memref-bound-check and memref-dependence-check for exclusive bounds (missed in a previous CL) PiperOrigin-RevId: 220729810
*	Implement value type abstraction for locations.	River Riddle	2019-03-29	3	-6/+6
\| \| \| \| \| \|	Value type abstraction for locations differ from others in that a Location can NOT be null. NOTE: dyn_cast returns an Optional<T>. PiperOrigin-RevId: 220682078
*	Complete migration to exclusive upper bound	Uday Bondhugula	2019-03-29	1	-7/+12
\| \| \| \| \| \| \| \|	cl/220448963 had missed a part of the updates. - while on this, clean up some of the test cases to use ops' custom forms. PiperOrigin-RevId: 220675303
*	Initialize Pass with PassID.	Jacques Pienaar	2019-03-29	12	-7/+18
\| \| \| \| \| \|	The passID is not currently stored in Pass but this avoids the unused variable warning. The passID is used to uniquely identify passes, currently this is only stored/used in PassInfo. PiperOrigin-RevId: 220485662
*	[MLIR] Make upper bound implementation exclusive	Nicolas Vasilache	2019-03-29	1	-10/+10
\| \| \| \| \| \| \|	This CL implement exclusive upper bound behavior as per b/116854378. A followup CL will update the semantics of the for loop. PiperOrigin-RevId: 220448963
*	Add static pass registration	Jacques Pienaar	2019-03-29	12	-9/+94
\| \| \| \| \| \| \| \|	Add static pass registration and change mlir-opt to use it. Future work is needed to refactor the registration for PassManager usage. Change build targets to alwayslink to enforce registration. PiperOrigin-RevId: 220390178
*	Introduce loop tiling code generation (hyper-rectangular case)	Uday Bondhugula	2019-03-29	2	-1/+241
\| \| \| \| \| \| \| \| \| \| \|	- simple perfectly nested band tiling with fixed tile sizes. - only the hyper-rectangular case is handled, with other limitations of getIndexSet applying (constant loop bounds, etc.); once the latter utility is extended, tiled code generation should become more general. - Add FlatAffineConstraints::isHyperRectangular() PiperOrigin-RevId: 220324933
*	Adds a dependence check to test whether two accesses to the same memref ↵	MLIR Team	2019-03-29	1	-0/+244
\| \| \| \| \| \| \| \| \| \| \| \|	access the same element. - Builds access functions and iterations domains for each access. - Builds dependence polyhedron constraint system which has equality constraints for equated access functions and inequality constraints for iteration domain loop bounds. - Runs elimination on the dependence polyhedron to test if no dependence exists between the accesses. - Adds a trivial LoopFusion transformation pass with a simple test policy to test dependence between accesses to the same memref in adjacent loops. - The LoopFusion pass will be extended in subsequent CLs. PiperOrigin-RevId: 219630898
*	[MLIR] Extend vectorization to 2+-D patterns	Nicolas Vasilache	2019-03-29	1	-61/+168
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL adds support for vectorization using more interesting 2-D and 3-D patterns. Note in particular the fact that we match some pretty complex imperfectly nested 2-D patterns with a quite minimal change to the implementation: we just add a bit of recursion to traverse the matched patterns and actually vectorize the loops. For instance, vectorizing the following loop by 128: ``` for %i3 = 0 to %0 { %7 = affine_apply (d0) -> (d0)(%i3) %8 = load %arg0[%c0_0, %7] : memref<?x?xf32> } ``` Currently generates: ``` #map0 = ()[s0] -> (s0 + 127) #map1 = (d0) -> (d0) for %i3 = 0 to #map0()[%0] step 128 { %9 = affine_apply #map1(%i3) %10 = alloc() : memref<1xvector<128xf32>> %11 = "n_d_unaligned_load"(%arg0, %c0_0, %9, %10, %c0) : (memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index) -> (memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index) %12 = load %10[%c0] : memref<1xvector<128xf32>> } ``` The above is subject to evolution. PiperOrigin-RevId: 219629745
*	Enable constructing a FuncBuilder using a Operation*.	Jacques Pienaar	2019-03-29	1	-22/+24
\| \| \| \| \| \|	FuncBuilder is useful to build a operation to replace an existing operation, so change the constructor to allow constructing it with an existing operation. Change FuncBuilder to contain (effectively) a tagged union of CFGFuncBuilder and MLFuncBuilder (as these should be cheap to copy and avoid allocating/deletion when created via a operation). PiperOrigin-RevId: 219532952
*	Introduce memref bound checking.	Uday Bondhugula	2019-03-29	10	-65/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce analysis to check memref accesses (in MLFunctions) for out of bound ones. It works as follows: $ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1 %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1 %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#2 %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#2 %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1 %y = load %B[%idy] : memref<128 x i32> ^ /tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1 %y = load %B[%idy] : memref<128 x i32> ^ #map0 = (d0, d1) -> (d0, d1) #map1 = (d0, d1) -> (d0 * 128 - d1) mlfunc @test() { %0 = alloc() : memref<9x9xi32> %1 = alloc() : memref<128xi32> for %i0 = -1 to 9 { for %i1 = -1 to 9 { %2 = affine_apply #map0(%i0, %i1) %3 = load %0[%2tensorflow/mlir#0, %2tensorflow/mlir#1] : memref<9x9xi32> %4 = affine_apply #map1(%i0, %i1) %5 = load %1[%4] : memref<128xi32> } } return } - Improves productivity while manually / semi-automatically developing MLIR for testing / prototyping; also provides an indirect way to catch errors in transformations. - This pass is an easy way to test the underlying affine analysis machinery including low level routines. Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256. While on this: - create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/ - fix a bug in AffineAnalysis.cpp::toAffineExpr TODO: extend to non-constant loop bounds (straightforward). Will transparently work for all accesses once floordiv, mod, ceildiv are supported in the AffineMap -> FlatAffineConstraints conversion. PiperOrigin-RevId: 219397961
*	Implement value type abstraction for types.	River Riddle	2019-03-29	5	-26/+27
\| \| \| \| \| \|	This is done by changing Type to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast. PiperOrigin-RevId: 219372163
*	[MLIR] Implement 1-D vectorization for fastest varying load/stores	Nicolas Vasilache	2019-03-29	1	-20/+507
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL is a first in a series that implements early vectorization of increasingly complex patterns. In particular, early vectorization will support arbitrary loop nesting patterns (both perfectly and imperfectly nested), at arbitrary depths in the loop tree. This first CL builds the minimal support for applying 1-D patterns. It relies on an unaligned load/store op abstraction that can be inplemented differently on different HW. Future CLs will support higher dimensional patterns, but 1-D patterns already exhibit interesting properties. In particular, we want to separate pattern matching (i.e. legality both structural and dependency analysis based), from profitability analysis, from application of the transformation. As a consequence patterns may intersect and we need to verify that a pattern can still apply by the time we get to applying it. A non-greedy analysis on profitability that takes into account pattern intersection is left for future work. Additionally the CL makes the following cleanups: 1. the matches method now returns a value, not a reference; 2. added comments about the MLFunctionMatcher and MLFunctionMatches usage by value; 3. added size and empty methods to matches; 4. added a negative vectorization test with a conditional, this exhibited a but in the iterators. Iterators now return nullptr if the underlying storage is nullpt. PiperOrigin-RevId: 219299489
*	Add support for walking the use list of an SSAValue and converting owners to	Chris Lattner	2019-03-29	1	-50/+20
\| \| \| \| \| \| \| \|	Operation*'s, simplifying some code in GreedyPatternRewriteDriver.cpp. Also add print/dump methods on Operation. PiperOrigin-RevId: 219045764
*	Fix two issues:	Chris Lattner	2019-03-29	1	-3/+25
\| \| \| \| \| \| \| \| \| \| \| \|	1) We incorrectly reassociated non-reassociative operations like subi, causing miscompilations. 2) When constant folding, we didn't add users of the new constant back to the worklist for reprocessing, causing us to miss some cases (pointed out by Uday). The code for tensorflow/mlir#2 is gross, but I'll add the new APIs in a followup patch. PiperOrigin-RevId: 218803984
*	Simplify FunctionPass to eliminate the CFGFunctionPass/MLFunctionPass	Chris Lattner	2019-03-29	9	-37/+38
\| \| \| \| \| \| \|	distinction. FunctionPasses can now choose to get called on all functions, or have the driver split CFG/ML Functions up for them. NFC. PiperOrigin-RevId: 218775885
*	Refactor all of the canonicalization patterns out of the Canonicalize pass, and	Chris Lattner	2019-03-29	3	-378/+8
\| \| \| \| \| \| \| \| \| \|	make operations provide a list of canonicalizations that can be applied to them. This allows canonicalization to be general to any IR definition. As part of this, sink PatternMatch.h/cpp down to the IR library to fix a layering problem. PiperOrigin-RevId: 218773981
*	Implement value type abstraction for attributes.	River Riddle	2019-03-29	4	-16/+16
\| \| \| \| \| \|	This is done by changing Attribute to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast. PiperOrigin-RevId: 218764173
*	Move transform utilities out to their own TransformUtils library, instead of	Chris Lattner	2019-03-29	5	-0/+0
\| \| \| \| \| \| \| \|	just having the pattern matcher in its own library. At this point, lib/Transforms/*.cpp are all actually passes themselves (and will probably eventually be themselves move to a new subdirectory as we accrete more). PiperOrigin-RevId: 218745193
*	Refactor the bulk of the worklist driver out of the canonicalizer into its own	Chris Lattner	2019-03-29	3	-301/+375
\| \| \| \| \| \| \| \| \| \|	helper function, in preparation for it being used by other passes. There is still a lot of room for improvement in its design, this patch is intended as an NFC refactoring, and the improvements will continue after this lands. PiperOrigin-RevId: 218737116
*	Introduce Fourier-Motzkin variable elimination + other cleanup/support	Uday Bondhugula	2019-03-29	2	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Introduce Fourier-Motzkin variable elimination to eliminate a dimension from a system of linear equalities/inequalities. Update isEmpty to use this. Since FM is only exact on rational/real spaces, an emptiness check based on this is guaranteed to be exact whenever it says the underlying set is empty; if it says, it's not empty, there may still be no integer points in it. Also, supports a version that computes "dark shadows". - Test this by checking for "always false" conditionals in if statements. - Unique IntegerSet's that are small (few constraints, few variables). This basically means the canonical empty set and other small sets that are likely commonly used get uniqued; allows checking for the canonical empty set by pointer. IntegerSet::kUniquingThreshold gives the threshold constraint size for uniqui'ing. - rename simplify-affine-expr -> simplify-affine-structures Other cleanup - IntegerSet::numConstraints, AffineMap::numResults are no longer needed; remove them. - add copy assignment operators for AffineMap, IntegerSet. - rename Invalid() -> Null() on AffineExpr, AffineMap, IntegerSet - Misc cleanup for FlatAffineConstraints API PiperOrigin-RevId: 218690456
*	Adds Gaussian Elimination to FlatAffineConstraints.	MLIR Team	2019-03-29	1	-13/+33
\| \| \| \| \| \| \| \|	- Adds FlatAffineConstraints::isEmpty method to test if there are no solutions to the system. - Adds GCD test check if equality constraints have no solution. - Adds unit test cases. PiperOrigin-RevId: 218546319
*	Change typedef to using to be consistent across the codebase	Lei Zhang	2019-03-29	4	-4/+4
\| \| \| \| \| \|	Google C++ style guide also prefers using to typedef. PiperOrigin-RevId: 218541849
*	Teach canonicalize pass to unique and hoist constants to the entry block. This	Chris Lattner	2019-03-29	1	-23/+77
\| \| \| \| \| \| \| \| \|	is a straight-forward change, but required adding missing moveBefore() methods on operations (requiring moving some traits around to make C++ happy). This also fixes a constness issue with the getBlock/getFunction() methods on Instruction, and adds a missing getFunction() method on MLFuncBuilder. PiperOrigin-RevId: 218523905
*	Implement shape folding in the canonicalization pass:	Chris Lattner	2019-03-29	2	-6/+129
\| \| \| \| \| \| \| \| \| \| \| \| \|	- Add a few canonicalization patterns to fold memref_cast into load/store/dealloc. - Canonicalize alloc(constant) into an alloc with a constant shape followed by a cast. - Add a new PatternRewriter::updatedRootInPlace API to make this more convenient. SimplifyAllocConst and the testcase is heavily based on Uday's implementation work, just in a different framework. PiperOrigin-RevId: 218361237
*	PassResult return cleanup.	Uday Bondhugula	2019-03-29	2	-17/+22
\| \| \| \| \| \|	- return success as long as IR is in a valid state. PiperOrigin-RevId: 218225317
*	Add a pattern (x+0) -> x, generalize Canonicalize to CFGFunc's, address a ↵	Chris Lattner	2019-03-29	2	-50/+111
\| \| \| \| \| \| \| \|	few TODOs, and add some casting support to Operation. PiperOrigin-RevId: 218219340
*	Introduce a new Operation::erase helper to generalize some code in	Chris Lattner	2019-03-29	7	-20/+13
\| \| \| \| \| \| \|	the pattern matcher / canonicalizer, and rename existing eraseFromBlock methods to align with it. PiperOrigin-RevId: 218104455
*	Introduce a new PatternRewriter class to help keep the worklist in	Chris Lattner	2019-03-29	2	-24/+108
\| \| \| \| \| \| \| \| \| \| \|	PatternMatcher clients up to date and provide a funnel point for newly added operations. This is also progress towards the canonicalizer supporting CFGFunctions. This paves the way for more complex patterns, but by itself doesn't do much useful, so no testcase. PiperOrigin-RevId: 218101737
*	Loop bound constant folding: follow-up / address comments from cl/215997346	Uday Bondhugula	2019-03-29	2	-6/+58
\| \| \| \| \| \| \|	- create a single function to fold both bounds - move bound constant folding into transforms PiperOrigin-RevId: 217954701
*	Rename Operation::getAs to Operation::dyn_cast	Feng Liu	2019-03-29	5	-23/+21
\| \| \| \| \| \| \| \| \|	Also rename Operation::is to Operation::isa Introduce Operation::cast All of these are for consistency with global dyn_cast/cast/isa operators. PiperOrigin-RevId: 217878786
*	Generalize / improve DMA transfer overlap; nested and multiple DMA support; ↵	Uday Bondhugula	2019-03-29	3	-153/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	resolve multiple TODOs. - replace the fake test pass (that worked on just the first loop in the MLFunction) to perform DMA pipelining on all suitable loops. - nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops) - fix bugs / assumptions: correctly copy memory space and elemental type of source memref for double buffering. - correctly identify matching start/finish statements, handle multiple DMAs per loop. - introduce dominates/properlyDominates utitilies for MLFunction statements. - move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it getShiftValidity - refactor getContainingStmtPos -> findAncestorStmtInBlock - move into Analysis/Utils.h; has two users. - other improvements / cleanup for related API/utilities - add size argument to dma_wait - for nested DMAs or in general, it makes it easy to obtain the size to use when lowering the dma_wait since we wouldn't want to identify the matching dma_start, and more importantly, in general/in the future, there may not always be a dma_start dominating the dma_wait. - add debug information in the pass PiperOrigin-RevId: 217734892
*	[MLIR] Basic infrastructure for vectorization test	Nicolas Vasilache	2019-03-29	1	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL implements a very simple loop vectorization test and the basic infrastructure to support it. The test simply consists in: 1. matching the loops in the MLFunction and all the Load/Store operations nested under the loop; 2. testing whether all the Load/Store are contiguous along the innermost memory dimension along that particular loop. If any reference is non-contiguous (i.e. the ForStmt SSAValue appears in the expression), then the loop is not-vectorizable. The simple test above can gradually be extended with more interesting behaviors to account for the fact that a layout permutation may exist that enables contiguity etc. All these will come in due time but it is worthwhile noting that the test already supports detection of outer-vetorizable loops. In implementing this test, I also added a recursive MLFunctionMatcher and some sugar that can capture patterns such as `auto gemmLike = Doall(Doall(Red(LoadStore())))` and allows iterating on the matched IR structures. For now it just uses in order traversal but post-order DFS will be useful in the future once IR rewrites start occuring. One may note that the memory management design decision follows a different pattern from MLIR. After evaluating different designs and how they quickly increase cognitive overhead, I decided to opt for the simplest solution in my view: a class-wide (threadsafe) RAII context. This way, a pass that needs MLFunctionMatcher can just have its own locally scoped BumpPtrAllocator and everything is cleaned up when the pass is destroyed. If passes are expected to have a longer lifetime, then the contexts can easily be scoped inside the runOnMLFunction call and storage lifetime reduced. Lastly, whatever the scope of threading (module, function, pass), this is expected to also be future-proof wrt concurrency (but this is a detail atm). PiperOrigin-RevId: 217622889
*	Use FuncBuilder instead of MLFuncBuilder in pattern matcher.	Jacques Pienaar	2019-03-29	2	-9/+7
\| \| \| \| \| \|	Use the general function buil wrapper instead of the CFG/ML specific one. PiperOrigin-RevId: 217335607
*	Add constant folding and binary operator reassociation to the canonicalize	Chris Lattner	2019-03-29	1	-8/+97
\| \| \| \| \| \| \|	pass, build up the worklist infra in anticipation of improving the pattern matcher to match more than one node. PiperOrigin-RevId: 217330579
*	Move Pattern and related classes to a different file	Feng Liu	2019-03-29	2	-267/+168
\| \| \| \| \| \|	So we can use it as a library. PiperOrigin-RevId: 217267049
*	Adds method to AffineApplyOp which forward substitutes its results into any ↵	MLIR Team	2019-03-29	2	-150/+87
\| \| \| \| \| \| \| \| \|	of its users which are also AffineApplyOps. Updates ComposeAffineMaps test pass to use this method. Updates affine map composition test cases to handle the new pass, which can be reused when this method is used in a future instruction combine pass. PiperOrigin-RevId: 217163351
*	Various improvements to pattern matching and other infra:	Chris Lattner	2019-03-29	4	-38/+129
\| \| \| \| \| \| \| \| \| \| \|	- Make it so OpPointer implicitly converts to SSAValue* when the underlying op has a single value. This eliminates a lot more ->getResult() calls and makes the behavior more LLVM-like - Fill out PatternBenefit to be typed instead of just a typedef for int with magic numbers. - Simplify various code due to these changes. PiperOrigin-RevId: 217020717
*	Create private exclusive / single use affine computation slice for an op stmt.	Uday Bondhugula	2019-03-29	3	-122/+174
\| \| \| \| \| \| \| \| \| \| \| \| \|	- add util to create a private / exclusive / single use affine computation slice for an op stmt (see method doc comment); a single multi-result affine_apply op is prepended to the op stmt to provide all results needed for its operands as a function of loop iterators and symbols. - use it for DMA pipelining (to create private slices for DMA start stmt's); resolve TODOs/feature request (b/117159533) - move createComposedAffineApplyOp to Transforms/Utils; free it from taking a memref as input / generalize it. PiperOrigin-RevId: 216926818