bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union ↵	MLIR Team	2019-03-29	1	-5/+21
\| \| \| \| \| \| \| \| \|	for loop fusion pass (WIP). Adds utility to convert slice bounds to a FlatAffineConstraints representation. Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers. PiperOrigin-RevId: 236973261
*	Fix and improve detectAsMod	Uday Bondhugula	2019-03-29	1	-12/+44
\| \| \| \| \| \| \| \|	- fix for the mod detection - simplify/avoid the mod at construction (if the dividend is already known to be less than the divisor), since the information is available at hand there PiperOrigin-RevId: 236882988
*	Bug fix for getConstantBoundOnDimSize	Uday Bondhugula	2019-03-29	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	- this was detected when memref-bound-check was run on the output of the loop-fusion pass - the addition (to represent ceildiv as a floordiv) had to be performed only for the constant term of the constraint - update test cases - memref-bound-check no longer returns an error on the output of this test case PiperOrigin-RevId: 236731137
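The "ceildiv as a floordiv" rewrite mentioned in the message above rests on a standard integer identity. The sketch below only illustrates that identity in Python; the function name is hypothetical and this is not the code of getConstantBoundOnDimSize.

```python
def ceildiv_via_floordiv(numerator: int, divisor: int) -> int:
    """Compute ceil(numerator / divisor) using only floor division.

    Assumes divisor > 0. The offset (divisor - 1) is added once, to the
    value being divided, which mirrors adding it to the constant term only.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    # Python's // is floor division, including for negative numerators.
    return (numerator + divisor - 1) // divisor
```

For example, `ceildiv_via_floordiv(7, 4)` evaluates `(7 + 3) // 4`.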
*	Update addSliceBounds to deal with loops with floor's/mod's.	Uday Bondhugula	2019-03-29	1	-25/+81
\| \| \| \| \| \| \| \| \|	- This change only impacts the cost model for fusion, given the way addSliceBounds was being used. It so happens that the output in spite of this CL's fix is the same; however, the assertions added no longer fail. (an invalid/inconsistent memref region was being used earlier). PiperOrigin-RevId: 236405030
*	NFC. Move all of the remaining operations left in BuiltinOps to StandardOps. ↵	River Riddle	2019-03-29	1	-1/+1
\| \| \| \| \| \|	The only thing left in BuiltinOps are the core MLIR types. The standard types can't be moved because they are referenced within the IR directory, e.g. in things like Builder. PiperOrigin-RevId: 236403665
*	Loop fusion for input reuse.	MLIR Team	2019-03-29	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	*) Breaks fusion pass into multiple sub passes over nodes in data dependence graph: - first pass fuses single-use producers into their unique consumer. - second pass enables fusing for input-reuse by fusing sibling nodes which read from the same memref, but which do not share dependence edges. - third pass fuses remaining producers into their consumers (Note that the sibling fusion pass may have transformed a producer with multiple uses into a single-use producer). *) Fusion for input reuse is enabled by computing a sibling node slice using the load/load accesses to the same memref, and fusion safety is guaranteed by checking that the sibling node memref write region (to a different memref) is preserved. *) Enables output vector and output matrix computations from KFAC patches-second-moment operation to fuse into a single loop nest and reuse input from the image patches operation. *) Adds a generic loop utility for finding all sequential loops in a loop nest. *) Adds and updates unit tests. PiperOrigin-RevId: 236350987
*	Analysis support for floordiv/mod's in loop bounds/	Uday Bondhugula	2019-03-29	1	-28/+56
\| \| \| \| \| \| \| \| \|	- handle floordiv/mod's in loop bounds for all analysis purposes - allows fusion slicing to be more powerful - add simple test cases based on -memref-bound-check - fusion based test cases in follow up CLs PiperOrigin-RevId: 236328551
*	Method to align/merge dimensional/symbolic identifiers between two ↵	Uday Bondhugula	2019-03-29	1	-96/+162
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FlatAffineConstraints - add a method to merge and align the spaces (identifiers) of two FlatAffineConstraints (both get dimension-wise and symbol-wise unique columns) - this completes several TODOs, gets rid of previous assumptions/restrictions in composeMap, unionBoundingBox, and reuses common code - remove previous workarounds / duplicated functionality in FlatAffineConstraints::composeMap and unionBoundingBox, use mergeAlignIds from both PiperOrigin-RevId: 236320581
*	Change some of the debug messages to use emitError / emitWarning / emitNote ↵	Uday Bondhugula	2019-03-29	1	-6/+6
\| \| \| \| \| \|	- NFC PiperOrigin-RevId: 236169676
*	Detect more trivially redundant constraints better	Uday Bondhugula	2019-03-29	1	-11/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- detect more trivially redundant constraints in FlatAffineConstraints::removeTrivialRedundantConstraints. Redundancy due to constraints that only differ in the constant part (eg., 32i + 64j - 3 >= 0, 32i + 64j - 8 >= 0) is now detected. The method is still linear-time and does a single scan over the FlatAffineConstraints buffer. This detection is useful and needed to eliminate redundant constraints generated after FM elimination. - update GCDTightenInequalities so that we also normalize by the GCD while at it. This way more constraints will show up as redundant (232i - 203 >= 0 becomes i - 1 >= 0 instead of 232i - 232 >= 0) without having to call normalizeConstraintsByGCD. - In FourierMotzkinEliminate, call GCDTightenInequalities and normalizeConstraintsByGCD before calling removeTrivialRedundantConstraints() - so that more redundant constraints are detected. As a result, redundancy due to constraints like i - 5 >= 0, i - 7 >= 0, 2i - 5 >= 0, 232i - 203 >= 0 is now detected (here only i >= 7 is non-redundant). As a result of these, a -memref-bound-check on the added test case runs in 16ms instead of 1.35s (opt build) and no longer returns a conservative result. PiperOrigin-RevId: 235983550
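A minimal Python sketch of the idea in this message, assuming each inequality is stored as a coefficient tuple plus a constant, meaning `sum(c_i * x_i) + const >= 0`. The names `gcd_tighten` and `remove_trivially_redundant` are hypothetical and only mimic the behaviour described; they are not the FlatAffineConstraints implementation.

```python
from functools import reduce
from math import gcd


def gcd_tighten(coeffs, const):
    """Divide an inequality by the GCD of its coefficients, flooring the constant.

    232*i - 203 >= 0 becomes i - 1 >= 0, because floor(-203 / 232) == -1.
    """
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    if g == 0:  # no variables involved; nothing to normalize
        return tuple(coeffs), const
    # g divides every coefficient exactly; // floors the constant.
    return tuple(c // g for c in coeffs), const // g


def remove_trivially_redundant(inequalities):
    """Single scan: among inequalities with identical normalized coefficients,
    keep only the one with the smallest constant (the tightest bound)."""
    tightest = {}
    for coeffs, const in inequalities:
        key, c = gcd_tighten(coeffs, const)
        if key not in tightest or c < tightest[key]:
            tightest[key] = c
    return list(tightest.items())
```

Applied to the four constraints quoted in the message (`i - 5`, `i - 7`, `2i - 5`, `232i - 203`, each `>= 0`), the scan keeps a single entry, `i - 7 >= 0`.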
*	Fix bug in memref region computation with slice loop bounds. Adds loop IV ↵	MLIR Team	2019-03-29	1	-10/+24
\| \| \| \| \| \|	values to ComputationSliceState which are used in FlatAffineConstraints::addSliceBounds, to ensure that constraints are only added for loop IV values which are present in the constraint system. PiperOrigin-RevId: 235952912
*	Temp change in FlatAffineConstraints::getSliceBounds() to deal with TODO in	Uday Bondhugula	2019-03-29	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \|	LoopFusion - getConstDifference in LoopFusion is pending a refactoring to handle bounds with min's and max's; it currently asserts on some useful test cases that we want to experiment with. This CL changes getSliceBounds to be more conservative so as to not trigger the assertion. Filed b/126426796 to track this. PiperOrigin-RevId: 235826538
*	Cleanup post cl/235283610 - NFC	Uday Bondhugula	2019-03-29	1	-4/+5
\| \| \| \| \| \| \|	- remove stale comments + cleanup - drop MLIRContext * field from expr flattener PiperOrigin-RevId: 235621178
*	Refactor AffineExprFlattener and move FlatAffineConstraints out of IR into	Uday Bondhugula	2019-03-29	1	-0/+2530
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Analysis - NFC - refactor AffineExprFlattener (-> SimpleAffineExprFlattener) so that it doesn't depend on FlatAffineConstraints, and so that FlatAffineConstraints could be moved out of IR/; the simplification that the IR needs for AffineExpr's doesn't depend on FlatAffineConstraints - have AffineExprFlattener derive from SimpleAffineExprFlattener to use for all Analysis/Transforms purposes; override addLocalFloorDivId in the derived class - turn addAffineForOpDomain into a method on FlatAffineConstraints - turn AffineForOp::getAsValueMap into an AffineValueMap ctor PiperOrigin-RevId: 235283610
*	Refactor the affine analysis by moving some functionality to IR and some to ↵	River Riddle	2019-03-29	1	-2167/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	AffineOps. This is important for allowing the affine dialect to define canonicalizations directly on the operations instead of relying on transformation passes, e.g. ComposeAffineMaps. A summary of the refactoring: * AffineStructures has moved to IR. * simplifyAffineExpr/simplifyAffineMap/getFlattenedAffineExpr have moved to IR. * makeComposedAffineApply/fullyComposeAffineMapAndOperands have moved to AffineOps. * ComposeAffineMaps is replaced by AffineApplyOp::canonicalize and deleted. PiperOrigin-RevId: 232586468
*	NFC: Move AffineApplyOp to the AffineOps dialect. This also moves the ↵	River Riddle	2019-03-29	1	-1/+1
\| \| \| \| \| \|	isValidDim/isValidSymbol methods from Value to the AffineOps dialect. PiperOrigin-RevId: 232386632
*	Update dma-generate pass to (1) work on blocks of instructions (instead of just	Uday Bondhugula	2019-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	loops), (2) take into account fast memory space capacity and lower 'dmaDepth' to fit, (3) add location information for debug info / errors - change dma-generate pass to work on blocks of instructions (start/end iterators) instead of 'for' loops; complete TODOs - allows DMA generation for straightline blocks of operation instructions interspersed b/w loops - take into account fast memory capacity: check whether memory footprint fits in fastMemoryCapacity parameter, and recurse/lower the depth at which DMA generation is performed until it does fit in the provided memory - add location information to MemRefRegion; any insufficient fast memory capacity errors or debug info w.r.t dma generation shows location information - allow DMA generation pass to be instantiated with a fast memory capacity option (besides command line flag) - change getMemRefRegion to return unique_ptr's - change getMemRefFootprintBytes to work on a 'Block' instead of 'ForInst' - other helper methods; add postDomInstFilter option for replaceAllMemRefUsesWith; drop forInst->walkOps, add Block::walkOps methods Eg. output $ mlir-opt -dma-generate -dma-fast-mem-capacity=1 /tmp/single.mlir /tmp/single.mlir:9:13: error: Total size of all DMA buffers' for this block exceeds fast memory capacity for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { ^ $ mlir-opt -debug-only=dma-generate -dma-generate -dma-fast-mem-capacity=400 /tmp/single.mlir /tmp/single.mlir:9:13: note: 8 KiB of DMA buffers in fast memory space for this block for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { PiperOrigin-RevId: 232297044
*	Fold the functionality of OperationInst into Instruction. OperationInst ↵	River Riddle	2019-03-29	1	-1/+1
\| \| \| \| \| \|	still exists as a forward declaration and will be removed incrementally in a set of followup cleanup patches. PiperOrigin-RevId: 232198540
*	Define the AffineForOp and replace ForInst with it. This patch is largely ↵	River Riddle	2019-03-29	1	-10/+12
\| \| \| \| \| \|	mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body. PiperOrigin-RevId: 232060516
*	Fix getFullMemRefAsRegion() and FlatAffineConstraints::reset	Uday Bondhugula	2019-03-29	1	-3/+7
\| \| \| \|	PiperOrigin-RevId: 231426734
*	Change AffineApplyOp to produce a single result, simplifying the code that	Chris Lattner	2019-03-29	1	-2/+1
\| \| \| \| \| \|	works with it, and updating the g3docs. PiperOrigin-RevId: 231120927
*	Change the ForInst induction variable to be a block argument of the body ↵	River Riddle	2019-03-29	1	-1/+1
\| \| \| \| \| \|	instead of the ForInst itself. This is a necessary step in converting ForInst into an operation. PiperOrigin-RevId: 231064139
*	Drop AffineMap::Null and IntegerSet::Null	Nicolas Vasilache	2019-03-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Addresses b/122486036 This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing the Null method and cleaning up the constructors. As the ::Null uses were tracked down, opportunities appeared to untangle some of the Parsing logic and make it explicit where AffineMap/IntegerSet have ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit pointer values of AffineMap* and IntegerSet* that were passed as function parameters. Depending on the values of those pointers one of 3 behaviors could occur. This parsing logic convolution is one of the rare cases where I would advocate for code duplication. The more proper fix would be to make the syntax unambiguous or to allow some lookahead. PiperOrigin-RevId: 231058512
*	Update dma-generate: update for multiple load/store op's per memref	Uday Bondhugula	2019-03-29	1	-2/+115
\| \| \| \| \| \| \| \| \| \|	- introduce a way to compute union using symbolic rectangular bounding boxes - handle multiple load/store op's to the same memref by taking a union of the regions - command-line argument to provide capacity of the fast memory space - minor change to replaceAllMemRefUsesWith to not generate affine_apply if the supplied index remap was identity PiperOrigin-RevId: 230848185
*	Update fusion cost model + some additional infrastructure and debug ↵	Uday Bondhugula	2019-03-29	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	- update fusion cost model to fuse while tolerating a certain amount of redundant computation; add cl option -fusion-compute-tolerance; evaluate memory footprint and intermediate memory reduction - emit debug info from -loop-fusion showing what was fused and why - introduce function to compute memory footprint for a loop nest - getMemRefRegion readability update - NFC PiperOrigin-RevId: 230541857
*	Allocate private/local buffers for slices accurately during fusion	Uday Bondhugula	2019-03-29	1	-5/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- the size of the private memref created for the slice should be based on the memref region accessed at the depth at which the slice is being materialized, i.e., symbolic in the outer IVs up until that depth, as opposed to the region accessed based on the entire domain. - leads to a significant contraction of the temporary / intermediate memref whenever the memref isn't reduced to a single scalar (through store fwd'ing). Other changes - update to promoteIfSingleIteration - avoid introducing unnecessary identity map affine_apply from IV; makes it much easier to write and read test cases and pass output for all passes that use promoteIfSingleIteration; loop-fusion test cases become much simpler - fix replaceAllMemrefUsesWith bug that was exposed by the above update - 'domInstFilter' could be one of the ops erased due to a memref replacement in it. - fix getConstantBoundOnDimSize bug: a division by the coefficient of the identifier was missing (the latter need not always be 1); add lbFloorDivisors output argument - rename getBoundingConstantSizeAndShape -> getConstantBoundingSizeAndShape PiperOrigin-RevId: 230405218
*	Fix FlatAffineConstraints::removeIdRange	Uday Bondhugula	2019-03-29	1	-3/+21
\| \| \| \| \| \| \|	- the number of symbols/local ids was being incorrectly updated; the code in cl/230112574 exposes this. PiperOrigin-RevId: 230358327
*	LoopFusion improvements:	MLIR Team	2019-03-29	1	-2/+15
\| \| \| \| \| \| \| \|	*) Adds support for fusing into consumer loop nests with multiple loads from the same memref. *) Adds support for reducing slice loop trip count by projecting out destination loop IVs greater than destination loop depth. *) Removes dependence on src loop depth and simplifies cost model computation. PiperOrigin-RevId: 229575126
*	Simplify compositions of AffineApply	Nicolas Vasilache	2019-03-29	1	-302/+0
\| \| \| \| \| \| \| \|	This CL is the 6th and last on the path to simplifying AffineMap composition. This removes `AffineValueMap::forwardSubstitutions` and replaces it by simple calls to `fullyComposeAffineMapAndOperands`. PiperOrigin-RevId: 228962580
*	Add safeguard against FM explosion	Uday Bondhugula	2019-03-29	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- FM has a worst case exponential complexity. For our purposes, this worst case is rarely expected, but could still appear due to improperly constructed constraints (a logical/memory error in other methods for eg.) or artificially created arbitrarily complex integer sets (adversarial / fuzz tests). Add a check to detect such an explosion in the number of constraints and conservatively return false from isEmpty() (instead of running out of memory or running for too long). - Add an artificial virus test case. PiperOrigin-RevId: 228753496
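The safeguard described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the MLIR code: inequalities are `(coeffs, const)` pairs meaning `sum(c_i * x_i) + const >= 0`, emptiness is checked over the rationals only, and the names `eliminate`, `is_empty`, and the `max_constraints` threshold are hypothetical.

```python
def eliminate(ineqs, pos):
    """Fourier-Motzkin: project out the variable at column `pos`."""
    lower = [q for q in ineqs if q[0][pos] > 0]   # lower bounds on x_pos
    upper = [q for q in ineqs if q[0][pos] < 0]   # upper bounds on x_pos
    rest = [q for q in ineqs if q[0][pos] == 0]   # do not involve x_pos

    def drop(coeffs):
        return coeffs[:pos] + coeffs[pos + 1:]

    out = [(drop(c), k) for c, k in rest]
    # Every (lower, upper) pair yields one new constraint, which is
    # where the worst-case blow-up in constraint count comes from.
    for lc, lk in lower:
        for uc, uk in upper:
            a, b = lc[pos], -uc[pos]              # both strictly positive
            combined = tuple(b * x + a * y for x, y in zip(lc, uc))
            out.append((drop(combined), b * lk + a * uk))
    return out


def is_empty(ineqs, num_vars, max_constraints=1000):
    """Return True only if the system is provably empty.

    If the constraint count exceeds `max_constraints`, give up and
    conservatively answer False instead of continuing to eliminate.
    """
    for _ in range(num_vars):
        if len(ineqs) > max_constraints:
            return False
        ineqs = eliminate(ineqs, 0)
    # Only constants remain; any negative one is a contradiction.
    return any(k < 0 for _, k in ineqs)
```

With `x - 5 >= 0` and `-x + 3 >= 0`, elimination leaves the constant `-2`, so the set is reported empty; with a threshold smaller than the input size, the same call returns the conservative answer instead.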
*	Fix affine expr flattener bug + improve simplification in a particular scenario	Uday Bondhugula	2019-03-29	1	-24/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	- fix visitDivExpr: constraints constructed for localVarCst used the original divisor instead of the simplified divisor; fix this. Add a simple test case in memref-bound-check that reproduces this bug - although this was encountered in the context of slicing for fusion. - improve mod expr flattening: when flattening mod expressions, cancel out the GCD of the numerator and denominator so that we can get a simpler flattened form along with a simpler floordiv local var for it PiperOrigin-RevId: 228539928
*	Extend loop-fusion's slicing utility + other fixes / updates	Uday Bondhugula	2019-03-29	1	-54/+343
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- refactor toAffineFromEq and the code surrounding it; refactor code into FlatAffineConstraints::getSliceBounds - add FlatAffineConstraints methods to detect identifiers as mod's and div's of other identifiers - add FlatAffineConstraints::getConstantLower/UpperBound - Address b/122118218 (don't assert on invalid fusion depths cmdline flags - instead, don't do anything; change cmdline flags src-loop-depth -> fusion-src-loop-depth) - AffineExpr/Map print method update: don't fail on null instances (since we have a wrapper around a pointer, it's avoidable); rationale: dump/print methods should never fail if possible. - Update memref-dataflow-opt to add an optimization to avoid an unnecessary call to IsRangeOneToOne when it's trivially going to be true. - Add additional test cases to exercise the new support - update a few existing test cases since the maps are now generated uniformly with all destination loop operands appearing for the backward slice - Fix projectOut - fix wrong range for getBestElimCandidate. - Fix for getConstantBoundOnDimSize() - didn't show up in any test cases since we didn't have any non-hyperrectangular ones. PiperOrigin-RevId: 228265152
*	Misc readability and doc / code comment related improvements - NFC	Uday Bondhugula	2019-03-29	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- when SSAValue/MLValue existed, code at several places was forced to create additional aggregate temporaries of SmallVector<SSAValue/MLValue> to handle the conversion; get rid of such redundant code - use filling ctors instead of explicit loops - for smallvectors, change insert(list.end(), ...) -> append(... - improve comments at various places - turn getMemRefAccess into MemRefAccess ctor and drop duplicated getMemRefAccess. In the next CL, provide getAccess() accessors for load, store, DMA op's to return a MemRefAccess. PiperOrigin-RevId: 228243638
*	Complete TODOs / cleanup for loop-fusion utility	Uday Bondhugula	2019-03-29	1	-29/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- this is CL 1/2 that does a clean up and gets rid of one limitation in an underlying method - as a result, fusion works for more cases. - fix bugs/incomplete impl. in toAffineMapFromEq - fusing across rank changing reshapes for example now just works For eg. given a rank 1 memref to rank 2 memref reshape (64 -> 8 x 8) like this, -loop-fusion -memref-dataflow-opt now completely fuses and inlines/store-forward to get rid of the temporary: INPUT // Rank 1 -> Rank 2 reshape for %i0 = 0 to 64 { %v = load %A[%i0] store %v, %B[%i0 floordiv 8, %i0 mod 8] } for %i1 = 0 to 8 for %i2 = 0 to 8 %w = load %B[%i1, %i2] "foo"(%w) : (f32) -> () OUTPUT $ mlir-opt -loop-fusion -memref-dataflow-opt fuse_reshape.mlir #map0 = (d0, d1) -> (d0 * 8 + d1) mlfunc @fuse_reshape(%arg0: memref<64xf32>) { for %i0 = 0 to 8 { for %i1 = 0 to 8 { %0 = affine_apply #map0(%i0, %i1) %1 = load %arg0[%0] : memref<64xf32> "foo"(%1) : (f32) -> () } } } AFAIK, there is no polyhedral tool / compiler that can perform such fusion - because it's not really standard loop fusion, but possible through a generalized slicing-based approach such as ours. PiperOrigin-RevId: 227918338
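The reshape example in this message can be checked numerically. The Python sketch below simulates both loop nests on plain lists (a stand-in for the memrefs, which is an assumption of this illustration) and confirms that reading `A[d0 * 8 + d1]` directly yields the same values as going through the temporary `B`. The function names are hypothetical.

```python
def unfused(a):
    """INPUT form: reshape 64 -> 8 x 8 into a temporary, then read it back."""
    b = [[None] * 8 for _ in range(8)]
    for i0 in range(64):
        b[i0 // 8][i0 % 8] = a[i0]        # store %v, %B[%i0 floordiv 8, %i0 mod 8]
    return [b[i1][i2] for i1 in range(8) for i2 in range(8)]


def fused(a):
    """OUTPUT form: no temporary; index A through #map0 = (d0, d1) -> (d0 * 8 + d1)."""
    return [a[i0 * 8 + i1] for i0 in range(8) for i1 in range(8)]
```

The two agree because `(i floordiv 8) * 8 + (i mod 8) == i` for every `i` in `[0, 64)`, so `#map0` inverts the store's index computation.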
*	[MLIR] Fix uninitialized value found with msan	Nicolas Vasilache	2019-03-29	1	-1/+3
\| \| \| \| \| \| \|	The omission of an early exit created opportunities for uninitialized memory reads. This CL fixes the issue. PiperOrigin-RevId: 227761814
*	Introduce memref store to load forwarding - a simple memref dataflow analysis	Uday Bondhugula	2019-03-29	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- the load/store forwarding relies on memref dependence routines as well as SSA/dominance to identify the memref store instance uniquely supplying a value to a memref load, and replaces the result of that load with the value being stored. The memref is also deleted when possible if only stores remain. - add methods for post dominance for MLFunction blocks. - remove duplicated getLoopDepth/getNestingDepth - move getNestingDepth, getMemRefAccess, getNumCommonSurroundingLoops into Analysis/Utils (were earlier static) - add a helper method in FlatAffineConstraints - isRangeOneToOne. PiperOrigin-RevId: 227252907
*	Fix b/122139732; update FlatAffineConstraints::isEmpty() to eliminate IDs in a	Uday Bondhugula	2019-03-29	1	-57/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	better order. - update isEmpty() to eliminate IDs in a better order. Speed improvement for complex cases (for eg. high-d reshape's involving mod's/div's). - minor efficiency update to projectOut (was earlier making an extra albeit benign call to gaussianEliminateIds) (NFC). - move getBestIdToEliminate further up in the file (NFC). - add the failing test case. - add debug info to checkMemRefAccessDependence. PiperOrigin-RevId: 227244634
*	Standardize naming of statements -> instructions, revisiting the code base to be	Chris Lattner	2019-03-29	1	-13/+13
\| \| \| \| \| \| \| \| \|	consistent and moving the using declarations over. Hopefully this is the last truly massive patch in this refactoring. This is step 21/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227178245
*	Extend/complete dependence tester to utilize local var info.	Uday Bondhugula	2019-03-29	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- extend/complete dependence tester to utilize local var info while adding access function equality constraints; one more step closer to get slicing based fusion working in the general case of affine_apply's involving mod's/div's. - update test case to reflect more accurate dependence information; remove inaccurate comment on test case mod_deps. - fix a minor "bug" in equality addition in addMemRefAccessConstraints (doesn't affect correctness, but the fixed version is more intuitive). - some more surrounding code clean up - move simplifyAffineExpr out of anonymous AffineExprFlattener class - the latter has state, and the former should reside outside. PiperOrigin-RevId: 227175600
*	Merge Operation into OperationInst and standardize nomenclature around	Chris Lattner	2019-03-29	1	-1/+1
\| \| \| \| \| \| \| \|	OperationInst. This is a big mechanical patch. This is step 16/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227093712
*	Merge SSAValue, CFGValue, and MLValue together into a single Value class, which	Chris Lattner	2019-03-29	1	-35/+31
\| \| \| \| \| \| \| \| \|	is the new base of the SSA value hierarchy. This CL also standardizes all the nomenclature and comments to use 'Value' where appropriate. This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing. This is step 11/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227064624
*	Simplify memref-dependence-check's meta data structures / drop duplication and	Uday Bondhugula	2019-03-29	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	reuse existing ones. - drop IterationDomainContext, redundant since FlatAffineConstraints has MLValue information associated with its dimensions. - refactor to use existing support - leads to a reduction in LOC - as a result of these changes, non-constant loop bounds get naturally supported for dep analysis. - update test cases to include a couple with non-constant loop bounds - rename addBoundsFromForStmt -> addForStmtDomain - complete TODO for getLoopIVs (handle 'if' statements) PiperOrigin-RevId: 226082008
*	Update / complete a TODO for addBoundsForForStmt	Uday Bondhugula	2019-03-29	1	-9/+23
\| \| \| \| \| \| \| \| \| \|	- when adding constraints from a 'for' stmt into FlatAffineConstraints, correctly add bound operands of the 'for' stmt as a dimensional identifier or a symbolic identifier depending on whether the bound operand is a valid MLFunction symbol - update test case to exercise this. PiperOrigin-RevId: 225988511
*	Refactor/update memref-dep-check's addMemRefAccessConstraints and	Uday Bondhugula	2019-03-29	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \|	addDomainConstraints; add support for mod/div for dependence testing. - add support for mod/div expressions in dependence analysis - refactor addMemRefAccessConstraints to use getFlattenedAffineExprs (instead of getFlattenedAffineExpr); update addDomainConstraints. - rename AffineExprFlattener::cst -> localVarCst PiperOrigin-RevId: 225933306
*	Loop Fusion pass update: introduce utilities to perform generalized loop ↵	MLIR Team	2019-03-29	1	-1/+48
\| \| \| \| \| \| \| \| \| \| \| \|	fusion based on slicing; encompasses standard loop fusion. *) Adds simple greedy fusion algorithm to drive experimentation. This algorithm greedily fuses loop nests with single-writer/single-reader memref dependences to improve locality. *) Adds support for fusing slices of a loop nest computation: fusing one loop nest into another by adjusting the source loop nest's iteration bounds (after it is fused into the destination loop nest). This is accomplished by solving for the source loop nest's IVs in terms of the destination loop nest's IVs and symbols using the dependence polyhedron, then creating AffineMaps of these functions for the loop bounds of the fused source loop. *) Adds utility function 'insertMemRefComputationSlice' which computes and inserts computation slice from loop nest surrounding a source memref access into the loop nest surrounding the destination memref access. *) Adds FlatAffineConstraints::toAffineMap function which returns an AffineMap which represents an equality constraint where one dimension identifier is represented as a function of all others in the equality constraint. *) Adds multiple fusion unit tests. PiperOrigin-RevId: 225842944
*	Expression flattening improvement - reuse local expressions.	Uday Bondhugula	2019-03-29	1	-81/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- if a local id was already added for a specific mod/div expression, just reuse it if the expression repeats (instead of adding a new one). - drastically reduces the number of local variables added during flattening for real use cases - since the same div's and mod expressions often repeat. - add getFlattenedAffineExprs for AffineMap, IntegerSet based on the above As a natural result of the above: - FlatAffineConstraints(IntegerSet) ctor now deals with integer sets that have mod and div constraints as well, and these get simplified as well from -simplify-affine-structures PiperOrigin-RevId: 225452174
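One way to picture the reuse described above is a lookup table keyed on the flattened numerator and the divisor. This Python sketch is an assumption-laden illustration (the class `LocalIdTable` and its method are hypothetical, and it assumes `a mod d` is flattened as `a - d * (a floordiv d)` so that both forms share one local id); it is not the AffineExprFlattener code.

```python
class LocalIdTable:
    """Hands out one local id per distinct (numerator, divisor) floordiv."""

    def __init__(self):
        self._ids = {}

    def get_or_add(self, numerator_coeffs, divisor):
        # The key is the flattened numerator row plus the divisor, so a
        # repeated `expr floordiv d` maps back to the id created earlier.
        key = (tuple(numerator_coeffs), divisor)
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]

    def __len__(self):
        return len(self._ids)
```

Under that assumption, flattening both `d0 floordiv 8` and `d0 mod 8` in one map would request the same key twice and end up with a single local variable.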
*	FlatAffineConstraints - complete TODOs: add method to remove duplicate /	Uday Bondhugula	2019-03-29	1	-8/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trivially redundant constraints. Update projectOut to eliminate identifiers in a more efficient order. Fix b/120801118. - add method to remove duplicate / trivially redundant constraints from FlatAffineConstraints (use a hashing-based approach with DenseSet) - update projectOut to eliminate identifiers in a more efficient order (A sequence of affine_apply's like this (from a real use case) finally exposed the lack of the above trivial/low hanging simplifications). for %ii = 0 to 64 { for %jj = 0 to 9 { %a0 = affine_apply (d0, d1) -> (d0 * (9 * 1024) + d1 * 128) (%ii, %jj) %a1 = affine_apply (d0) -> (d0 floordiv (2 * 3 * 3 * 128 * 128), (d0 mod 294912) floordiv (3 * 3 * 128 * 128), (((d0 mod 294912) mod 147456) floordiv 1152) floordiv 8, (((d0 mod 294912) mod 147456) mod 1152) floordiv 384, ((((d0 mod 294912) mod 147456) mod 1152) mod 384) floordiv 128, (((((d0 mod 294912) mod 147456) mod 1152) mod 384) mod 128) floordiv 128) (%a0) %v0 = load %in[%a1#0, %a1#1, %a1#3, %a1#4, %a1#2, %a1#5] : memref<2x2x3x3x16x1xi32> } } - update FlatAffineConstraints::print to print number of constraints. PiperOrigin-RevId: 225397480
*	Remove dead code from FlatAffineConstraints	Uday Bondhugula	2019-03-29	1	-76/+0
\| \| \| \| \| \| \| \| \| \| \|	- getDimensionBounds() was added initially for quick experimentation - no longer used (getConstantBoundOnDimSize is the more powerful/complete replacement). - FlatAffineConstraints::getConstantLower/UpperBound are incomplete, functionality/naming-wise misleading, and not used currently. Removing these; complete/fixed version will be added in an upcoming CL. PiperOrigin-RevId: 225075061
*	FlatAffineConstraints API cleanup; add normalizeConstraintsByGCD().	Uday Bondhugula	2019-03-29	1	-30/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- add method normalizeConstraintsByGCD - call normalizeConstraintsByGCD() and GCDTightenInequalities() at the end of projectOut. - remove call to GCDTightenInequalities() from getMemRefRegion - change isEmpty() to check isEmptyByGCDTest() / hasInvalidConstraint() each time an identifier is eliminated (to detect emptiness early). - make FourierMotzkinEliminate, gaussianEliminateId(s), GCDTightenInequalities() private - improve / update stale comments PiperOrigin-RevId: 224866741
*	Generate strided DMAs from -dma-generate	Uday Bondhugula	2019-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- generate DMAs correctly now using strided DMAs where needed - add support for multi-level/nested strides; op still supports one level of stride for now. Other things - add test case for symbolic lower/upper bound; cases where the DMA buffer size can't be bounded by a known constant - add test case for dynamic shapes where the DMA buffers are however bounded by constants - refactor some of the '-dma-generate' code PiperOrigin-RevId: 224584529