bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Address documentation/readability related comments from cl/227252907 on memref	Uday Bondhugula	2019-03-29	2	-13/+14
\| \| \| \| \| \|	store forwarding - NFC. PiperOrigin-RevId: 229561933
*	Minor code cleanup - NFC.	Uday Bondhugula	2019-03-29	4	-23/+14
\| \| \| \| \| \|	- readability changes PiperOrigin-RevId: 229443430
*	Add EDSC sugar	Nicolas Vasilache	2019-03-29	1	-1/+1
This allows load, store and ForNest to be used with both Expr and Bindable. This simplifies writing generic MLIR snippets. For instance, a generic pointwise add can now be written:
```cpp
// Different Bindable ivs, one per loop in the loop nest.
auto ivs = makeBindables(shapeA.size());
Bindable zero, one;
// Same bindable, all equal to `zero`.
SmallVector<Bindable, 8> zeros(ivs.size(), zero);
// Same bindable, all equal to `one`.
SmallVector<Bindable, 8> ones(ivs.size(), one);
// clang-format off
Bindable A, B, C;
Stmt scalarA, scalarB, tmp;
Stmt block = edsc::Block({
  ForNest(ivs, zeros, shapeA, ones, {
    scalarA = load(A, ivs),
    scalarB = load(B, ivs),
    tmp = scalarA + scalarB,
    store(tmp, C, ivs)
  }),
});
// clang-format on
```
This CL also adds some extra support for pretty printing that will be used in a future CL when we introduce standalone testing of EDSCs. At the moment we are lacking the basic infrastructure to write such tests. PiperOrigin-RevId: 229375850
*	Fix outdated comments	Uday Bondhugula	2019-03-29	2	-10/+9
\| \| \| \|	PiperOrigin-RevId: 229300301
*	Swap the type and attribute parameter in ConstantOp::build()	Lei Zhang	2019-03-29	3	-5/+5
\| \| \| \| \| \| \|	This is to keep consistent with other TableGen generated builders so that we can also use this builder in TableGen rules. PiperOrigin-RevId: 229244630
*	LoopFusion: automate selection of source loop nest slice depth and ↵	MLIR Team	2019-03-29	1	-22/+353
destination loop nest insertion depth based on a simple cost model (cost model can be extended/replaced at a later time).
*) LoopFusion: Adds fusion cost function which compares the cost of the fused loop nest with the cost of the two unfused loop nests to determine if it is profitable to fuse the candidate loop nests. The fusion cost function is run for various combinations of src/dst loop depths, attempting to find the minimum cost setting for src/dst loop depths which does not increase the computational cost when the loop nests are fused. Combinations of src/dst loop depth are evaluated attempting to maximize loop depth (i.e. take a bigger computation slice from the source loop nest, and insert it deeper in the destination loop nest for better locality).
*) LoopFusion: Adds utility to compute op instance count for loop nests, sliced loop nests, and to compute the cost of a loop nest fused with another sliced loop nest.
*) LoopFusion: canonicalizes slice bound AffineMaps (and updates related tests).
*) Analysis::Utils: Splits getBackwardComputationSlice into two functions: one which calculates and returns the slice loop bounds for analysis by LoopFusion, and the other for insertion of the computation slice (once fusion has calculated the min-cost src/dst loop depths).
*) Test: Adds multiple unit tests to test the new functionality.
PiperOrigin-RevId: 229219757
*	[MLIR] Clip all access dimensions during LowerVectorTransfers	Nicolas Vasilache	2019-03-29	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL adds a short term remedy to an issue that was found during execution tests. Lowering of vector transfer ops uses the permutation map to determine which ForInst have been super-vectorized. During materialization to HW vector sizes however, some of those dimensions may be fully unrolled and do not appear in the permutation map. Such dimensions were then not clipped and may have accessed out of bounds. This CL conservatively clips all dimensions to ensure no out of bounds access. The longer term solution is still up for debate but will probably require either passing more information between Materialization and lowering, or just merging the 2 passes. PiperOrigin-RevId: 228980787
*	Simplify compositions of AffineApply	Nicolas Vasilache	2019-03-29	2	-84/+10
\| \| \| \| \| \| \| \|	This CL is the 6th and last on the path to simplifying AffineMap composition. This removes `AffineValueMap::forwardSubstitutions` and replaces it by simple calls to `fullyComposeAffineMapAndOperands`. PiperOrigin-RevId: 228962580
*	Uniformize composition of AffineApplyOp by construction	Nicolas Vasilache	2019-03-29	3	-38/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL is the 5th on the path to simplifying AffineMap composition. This removes the distinction between normalized single-result AffineMap and more general composed multi-result map. One nice byproduct of making the implementation driven by single-result is that the multi-result extension is a trivial change: the implementation is still single-result and we just use: ``` unsigned idx = getIndexOf(...); map.getResult(idx); ``` This CL also fixes an AffineNormalizer implementation issue related to symbols. Namely it stops performing substitutions on symbols in AffineNormalizer and instead concatenates them all to be consistent with the call to `AffineMap::compose(AffineMap)`. This latter call to `compose` cannot perform simplifications of symbols coming from different maps based on positions only: i.e. dims are applied and renumbered but symbols must be concatenated. The only way to determine whether symbols from different AffineApply are the same is to look at the concrete values. The canonicalizeMapAndOperands is thus extended with behavior to support replacing operands that appear multiple times. Lastly, this CL demonstrates that the implementation is correct by rewriting ComposeAffineMaps using only `makeComposedAffineApply`. The implementation uses a matcher because AffineApplyOp are introduced as composed operations on the fly instead of iteratively forwardSubstituting. For this purpose, a walker would revisit freshly introduced AffineApplyOp. Regardless, ComposeAffineMaps is scheduled to disappear, this CL replaces the implementation based on iterative `forwardSubstitute` by a composed-by-construction `makeComposedAffineApply`. Remaining calls to `forwardSubstitute` will be removed in the next CL. PiperOrigin-RevId: 228830443
*	Implement branch-free single-division lowering of affine division/remainder	Alex Zinenko	2019-03-29	1	-12/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements the lowering of `floordiv`, `ceildiv` and `mod` operators from affine expressions to the arithmetic primitive operations. Integer division rules in affine expressions explicitly require rounding towards either negative or positive infinity unlike machine implementations that round towards zero. In the general case, implementing `floordiv` and `ceildiv` using machine signed division requires computing both the quotient and the remainder. When the divisor is positive, this can be simplified by adjusting the dividend and the quotient by one and switching signs. In the current use cases, we are unlikely to encounter affine expressions with negative divisors (affine divisions appear in loop transformations such as tiling that guarantee that divisors are positive by construction). Therefore, it is reasonable to use branch-free single-division implementation. In case of affine maps, divisors can only be literals so we can check the sign and implement the case for negative divisors when the need arises. The affine lowering pass can still fail when applied to semi-affine maps (division or modulo by a symbol). PiperOrigin-RevId: 228668181
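The single-division scheme described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the code of the lowering pass: the helper names are hypothetical, the divisor is assumed to be strictly positive, dividends are assumed to be far from the `int64_t` limits, and the ternaries stand in for select operations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; each one issues exactly one machine division (or
// remainder) and then fixes up the result with select-style ternaries.

// floordiv: round the quotient towards negative infinity.
int64_t floorDivPositive(int64_t dividend, int64_t divisor) {
  assert(divisor > 0 && "sketch only covers positive divisors");
  bool negative = dividend < 0;
  // For negative dividends, divide -(dividend + 1), which is non-negative.
  int64_t adjusted = negative ? -(dividend + 1) : dividend;
  int64_t quotient = adjusted / divisor;  // rounds towards zero
  // Undo the adjustment: switch the sign and subtract one.
  return negative ? -quotient - 1 : quotient;
}

// ceildiv: round the quotient towards positive infinity.
int64_t ceilDivPositive(int64_t dividend, int64_t divisor) {
  assert(divisor > 0 && "sketch only covers positive divisors");
  bool nonPositive = dividend <= 0;
  // For positive dividends, divide (dividend - 1) and add one afterwards.
  int64_t adjusted = nonPositive ? -dividend : dividend - 1;
  int64_t quotient = adjusted / divisor;
  return nonPositive ? -quotient : quotient + 1;
}

// mod: the result is always in [0, divisor).
int64_t modPositive(int64_t dividend, int64_t divisor) {
  assert(divisor > 0 && "sketch only covers positive divisors");
  int64_t remainder = dividend % divisor;  // may be negative in C++
  return remainder < 0 ? remainder + divisor : remainder;
}
```

For example, `floorDivPositive(-7, 2)` divides 6 by 2 and returns -4, whereas the machine division `-7 / 2` alone would give -3.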
*	Fix DMA overlap pass buffer mapping	Uday Bondhugula	2019-03-29	1	-2/+3
\| \| \| \| \| \| \| \| \|	- the double buffer should be indexed (iv floordiv step) % 2 and NOT (iv % 2); step wasn't being accounted for. - fix test cases, enable failing test cases PiperOrigin-RevId: 228635726
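A small C++ sketch of why the step matters for the double-buffer index. The function names are hypothetical and the loop is assumed to start at 0 with a positive step; this is not the pass's code, only the arithmetic it describes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: which of the two buffers iteration `iv` should use.
// Assumes `iv` >= 0 and `step` > 0, so `/` behaves like floordiv here.
int64_t doubleBufferIndex(int64_t iv, int64_t step) {
  assert(step > 0 && iv >= 0);
  return (iv / step) % 2;  // iteration number, alternating between 0 and 1
}

// Collects the buffer index chosen on each iteration of
// `for iv = 0 to upperBound step step`.
std::vector<int64_t> bufferSchedule(int64_t upperBound, int64_t step) {
  std::vector<int64_t> schedule;
  for (int64_t iv = 0; iv < upperBound; iv += step)
    schedule.push_back(doubleBufferIndex(iv, step));
  return schedule;
}
```

With a step of 2 the induction variable takes the values 0, 2, 4, 6, so `iv % 2` would pick buffer 0 every time, while `(iv floordiv step) % 2` alternates 0, 1, 0, 1.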
*	[MLIR] Make SuperVectorization use normalized AffineApplyOp	Nicolas Vasilache	2019-03-29	1	-8/+20
\| \| \| \| \| \| \| \| \|	Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL uses the simpler single-result unbounded AffineApplyOp in the MaterializeVectors pass. PiperOrigin-RevId: 228469085
*	Introduce AffineMap::compose(AffineMap)	Nicolas Vasilache	2019-03-29	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	This CL is the 2nd on the path to simplifying AffineMap composition. This CL uses the now accepted `AffineExpr::compose(AffineMap)` to implement `AffineMap::compose(AffineMap)`. Implications of keeping the simplification function in Analysis are documented where relevant. PiperOrigin-RevId: 228276646
*	Extend loop-fusion's slicing utility + other fixes / updates	Uday Bondhugula	2019-03-29	2	-7/+14
- refactor toAffineFromEq and the code surrounding it; refactor code into FlatAffineConstraints::getSliceBounds
- add FlatAffineConstraints methods to detect identifiers as mod's and div's of other identifiers
- add FlatAffineConstraints::getConstantLower/UpperBound
- Address b/122118218 (don't assert on invalid fusion depths cmdline flags - instead, don't do anything; change cmdline flags src-loop-depth -> fusion-src-loop-depth)
- AffineExpr/Map print method update: don't fail on null instances (since we have a wrapper around a pointer, it's avoidable); rationale: dump/print methods should never fail if possible.
- Update memref-dataflow-opt to add an optimization to avoid an unnecessary call to IsRangeOneToOne when it's trivially going to be true.
- Add additional test cases to exercise the new support
- update a few existing test cases since the maps are now generated uniformly with all destination loop operands appearing for the backward slice
- Fix projectOut - fix wrong range for getBestElimCandidate.
- Fix for getConstantBoundOnDimSize() - didn't show up in any test cases since we didn't have any non-hyperrectangular ones.
PiperOrigin-RevId: 228265152
*	Convert expr - c * (expr floordiv c) to expr mod c in AffineExpr	Uday Bondhugula	2019-03-29	1	-2/+2
- Detect 'mod' to replace the combination of floordiv, mul, and subtract when possible at construction time; when 'c' is a power of two, this reduces the number of operations; also more compact and readable. Update simplifyAdd for this.
On a side note:
- with the affine expr flattening we have, a mod expression like d0 mod c would be flattened into d0 - c * q, c * q <= d0 <= c * q + c - 1, with 'q' being added as the local variable (q = d0 floordiv c); as a result, a mod was turned into a floordiv whenever the expression was reconstructed back, i.e., as d0 - c * (d0 floordiv c); as a result of this change, we recover the mod back.
- rename SimplifyAffineExpr -> SimplifyAffineStructures (pass had been renamed but the file hadn't been).
PiperOrigin-RevId: 228258120
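The rewrite relies on the identity expr - c * (expr floordiv c) = expr mod c. A minimal C++ check of that identity, assuming a positive constant c and affine (floor) semantics for both operators; the helper names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Floor division for a positive divisor (affine `floordiv` semantics).
int64_t floorDiv(int64_t a, int64_t c) {
  assert(c > 0);
  int64_t q = a / c;             // C++ rounds towards zero
  if (a % c != 0 && a < 0) --q;  // step down for inexact negative quotients
  return q;
}

// Affine `mod` for a positive divisor: result is always in [0, c).
int64_t floorMod(int64_t a, int64_t c) {
  assert(c > 0);
  int64_t r = a % c;
  return r < 0 ? r + c : r;
}

// The pattern being detected: expr - c * (expr floordiv c).
int64_t subMulFloorDiv(int64_t expr, int64_t c) {
  return expr - c * floorDiv(expr, c);
}

// Returns true when both forms agree on every value in [lo, hi].
bool formsAgree(int64_t lo, int64_t hi, int64_t c) {
  for (int64_t v = lo; v <= hi; ++v)
    if (subMulFloorDiv(v, c) != floorMod(v, c)) return false;
  return true;
}
```

For instance, with expr = -7 and c = 4, the floordiv is -2, so the subtract form gives -7 + 8 = 1, which matches -7 mod 4.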
*	Misc readability and doc / code comment related improvements - NFC	Uday Bondhugula	2019-03-29	4	-34/+18
- when SSAValue/MLValue existed, code at several places was forced to create additional aggregate temporaries of SmallVector<SSAValue/MLValue> to handle the conversion; get rid of such redundant code
- use filling ctors instead of explicit loops
- for smallvectors, change insert(list.end(), ...) -> append(...)
- improve comments at various places
- turn getMemRefAccess into MemRefAccess ctor and drop duplicated getMemRefAccess. In the next CL, provide getAccess() accessors for load, store, DMA op's to return a MemRefAccess.
PiperOrigin-RevId: 228243638
*	[MLIR] Introduce normalized single-result unbounded AffineApplyOp	Nicolas Vasilache	2019-03-29	1	-0/+52
Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL introduces a simpler abstraction and composition of single-result unbounded AffineApplyOp by using the existing unbound AffineMap composition. This CL adds a simple API call and relevant tests:
```c++
OpPointer<AffineApplyOp> makeNormalizedAffineApply(
    FuncBuilder b, Location loc, AffineMap map, ArrayRef<Value> operands);
```
which creates a single-result unbounded AffineApplyOp. The operands of AffineApplyOp are not themselves results of AffineApplyOp by construction. This represents the simplest possible interface to complement the composition of (mathematical) AffineMap, for the cases when we are interested in applying it to Value*. In this CL the composed AffineMap is not compressed (i.e. there exist operands that are not part of the result). A followup commit will compress to normal form. The single-result unbounded AffineApplyOp abstraction will be used in a followup CL to support the MaterializeVectors pass. PiperOrigin-RevId: 227879021
*	[MLIR] More graceful failure in MaterializeVectors	Nicolas Vasilache	2019-03-29	1	-3/+6
Even though it is unexpected except in pathological cases, a nullptr clone may be returned. This CL handles the nullptr return gracefully. PiperOrigin-RevId: 227764615
*	[MLIR] Drop strict super-vector requirement in MaterializeVector	Nicolas Vasilache	2019-03-29	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The strict requirement (i.e. at least 2 HW vectors in a super-vector) was a premature optimization to avoid interfering with other vector code potentially introduced via other means. This CL avoids this premature optimization and the spurious errors it causes when super-vector size == HW vector size (which is a possible corner case). This may be revisited in the future. PiperOrigin-RevId: 227763966
*	[MLIR] Handle corner case in MaterializeVectors	Nicolas Vasilache	2019-03-29	1	-4/+11
This corner case was found when stress testing with a functional end-to-end CPU path. In the case where the hardware vector size is 1x...x1 the `keep` vector is empty and would result in a crash. While there is no reason to expect a 1x...x1 HW vector in practice, this case can just gracefully degrade to scalar, which is what this CL allows. PiperOrigin-RevId: 227761097
*	Split the standard types from builtin types and move them into separate ↵	River Riddle	2019-03-29	1	-0/+1
\| \| \| \| \| \|	source files(StandardTypes.cpp/h). After this cl only FunctionType and IndexType are builtin types, but IndexType will likely become a standard type when the ml/cfgfunc merger is done. Mechanical NFC. PiperOrigin-RevId: 227750918
*	Merge LowerAffineApplyPass into LowerIfAndForPass, rename to LowerAffinePass	Alex Zinenko	2019-03-29	3	-252/+153
\| \| \| \| \| \| \| \| \| \| \| \|	This change is mechanical and merges the LowerAffineApplyPass and LowerIfAndForPass into a single LowerAffinePass. It makes a step towards defining an "affine dialect" that would contain all polyhedral-related constructs. The motivation for merging these two passes is based on retiring MLFunctions and, eventually, transforming If and For statements into regular operations. After that happens, LowerAffinePass becomes yet another legalization. PiperOrigin-RevId: 227566113
*	LowerForAndIf: expand affine_apply's inplace	Alex Zinenko	2019-03-29	2	-38/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Existing implementation was created before ML/CFG unification refactoring and did not concern itself with further lowering to separate concerns. As a result, it emitted `affine_apply` instructions to implement `for` loop bounds and `if` conditions and required a follow-up function pass to lower those `affine_apply` to arithmetic primitives. In the unified function world, LowerForAndIf is mostly a lowering pass with low complexity. As we move towards a dialect for affine operations (including `for` and `if`), it makes sense to lower `for` and `if` conditions directly to arithmetic primitives instead of relying on `affine_apply`. Expose `expandAffineExpr` function in LoweringUtils. Use this function together with `expandAffineMaps` to emit primitives that implement loop and branch conditions directly. Also remove tests that become unnecessary after transforming LowerForAndIf into a function pass. PiperOrigin-RevId: 227563608
*	Refactor LowerAffineApply	Alex Zinenko	2019-03-29	2	-35/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In LoweringUtils, extract out `expandAffineMap`. This function takes an affine map and a list of values the map should be applied to and emits a sequence of arithmetic instructions that implement the affine map. It is independent of the AffineApplyOp and can be used in places where we need to insert an evaluation of an affine map without relying on a (temporary) `affine_apply` instruction. This prepares for a merge between LowerAffineApply and LowerForAndIf passes. Move the `expandAffineApply` function to the LowerAffineApply pass since it is the only place that must be aware of the `affine_apply` instructions. PiperOrigin-RevId: 227563439
*	Eliminate extfunc/cfgfunc/mlfunc as a concept, and just use 'func' instead.	Chris Lattner	2019-03-29	1	-2/+0
The entire compiler now looks at structural properties of the function (e.g. does it have one block, does it contain an if/for stmt, etc) so the only thing holding up this difference is round tripping through the parser/printer syntax. Removing this shrinks the compiler by ~140LOC. This is step 31/n towards merging instructions and statements. The last step is updating the docs, which I will do as a separate patch in order to split it from this mostly mechanical patch. PiperOrigin-RevId: 227540453
*	[MLIR] Sketch a simple set of EDSCs to declaratively write MLIR	Nicolas Vasilache	2019-03-29	2	-137/+352
This CL introduces a simple set of Embedded Domain-Specific Components (EDSCs) in MLIR components:
1. a `Type` system of shell classes that closely matches the MLIR type system. These types are subdivided into `Bindable` leaf expressions and non-bindable `Expr` expressions;
2. an `MLIREmitter` class whose purpose is to:
   a. maintain a map of `Bindable` leaf expressions to concrete SSAValue;
   b. provide helper functionality to specify bindings of `Bindable` classes to SSAValue while verifying conformable types;
   c. traverse the `Expr` and emit the MLIR.

This is used on a concrete example to implement MemRef load/store with clipping in the LowerVectorTransfer pass. More specifically, the following pseudo-C++ code:
```c++
MLFuncBuilder *b = ...;
Location location = ...;
Bindable zero, one, expr, size;
// EDSL expression
auto access = select(expr < zero, zero, select(expr < size, expr, size - one));
auto ssaValue = MLIREmitter(b)
    .bind(zero, ...)
    .bind(one, ...)
    .bind(expr, ...)
    .bind(size, ...)
    .emit(location, access);
```
is used to emit all the MLIR for a clipped MemRef access. This simple EDSL can easily be extended to more powerful patterns and should serve as the counterpart to pattern matchers (and could potentially be unified once we get enough experience). In the future, most of this code should be TableGen'd but for now it has concrete valuable uses: make MLIR programmable in a declarative fashion. This CL also adds Stmt, proper supporting free functions and rewrites VectorTransferLowering fully using EDSCs.

The code for creating the EDSCs emitting a VectorTransferReadOp as loops with clipped loads is:
```c++
Stmt block = Block({
  tmpAlloc = alloc(tmpMemRefType),
  vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
  ForNest(ivs, lbs, ubs, steps, {
    scalarValue = load(scalarMemRef, accessInfo.clippedScalarAccessExprs),
    store(scalarValue, tmpAlloc, accessInfo.tmpAccessExprs),
  }),
  vectorValue = load(vectorView, zero),
  tmpDealloc = dealloc(tmpAlloc.getLHS())});
emitter.emitStmt(block);
```
where `accessInfo.clippedScalarAccessExprs` is created with:
```c++
select(i + ii < zero, zero, select(i + ii < N, i + ii, N - one));
```
The generated MLIR resembles:
```mlir
%1 = dim %0, 0 : memref<?x?x?x?xf32>
%2 = dim %0, 1 : memref<?x?x?x?xf32>
%3 = dim %0, 2 : memref<?x?x?x?xf32>
%4 = dim %0, 3 : memref<?x?x?x?xf32>
%5 = alloc() : memref<5x4x3xf32>
%6 = vector_type_cast %5 : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
for %i4 = 0 to 3 {
  for %i5 = 0 to 4 {
    for %i6 = 0 to 5 {
      %7 = affine_apply #map0(%i0, %i4)
      %8 = cmpi "slt", %7, %c0 : index
      %9 = affine_apply #map0(%i0, %i4)
      %10 = cmpi "slt", %9, %1 : index
      %11 = affine_apply #map0(%i0, %i4)
      %12 = affine_apply #map1(%1, %c1)
      %13 = select %10, %11, %12 : index
      %14 = select %8, %c0, %13 : index
      %15 = affine_apply #map0(%i3, %i6)
      %16 = cmpi "slt", %15, %c0 : index
      %17 = affine_apply #map0(%i3, %i6)
      %18 = cmpi "slt", %17, %4 : index
      %19 = affine_apply #map0(%i3, %i6)
      %20 = affine_apply #map1(%4, %c1)
      %21 = select %18, %19, %20 : index
      %22 = select %16, %c0, %21 : index
      %23 = load %0[%14, %i1, %i2, %22] : memref<?x?x?x?xf32>
      store %23, %5[%i6, %i5, %i4] : memref<5x4x3xf32>
    }
  }
}
%24 = load %6[%c0] : memref<1xvector<5x4x3xf32>>
dealloc %5 : memref<5x4x3xf32>
```
In particular notice that only 3 out of the 4-d accesses are clipped: this corresponds indeed to the number of dimensions in the super-vector.

This CL also addresses the cleanups resulting from the review of the previous CL and performs some refactoring to simplify the abstraction. PiperOrigin-RevId: 227367414
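For readers unfamiliar with the nested `select` form, the clipping expression computes an index clamped to the valid range [0, size - 1]. A plain C++ sketch of the same arithmetic, outside of any MLIR or EDSC API (all names here are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the `select` operation: picks `a` when `cond` holds, else `b`.
int64_t selectValue(bool cond, int64_t a, int64_t b) { return cond ? a : b; }

// Mirrors select(expr < zero, zero, select(expr < size, expr, size - one)).
int64_t clipIndex(int64_t expr, int64_t size) {
  assert(size > 0 && "clipping needs a non-empty dimension");
  const int64_t zero = 0, one = 1;
  return selectValue(expr < zero, zero,
                     selectValue(expr < size, expr, size - one));
}

// A 1-d "clipped load": out-of-range indices read the nearest valid element.
float clippedLoad(const std::vector<float> &memref, int64_t index) {
  int64_t clipped = clipIndex(index, static_cast<int64_t>(memref.size()));
  return memref[static_cast<std::size_t>(clipped)];
}
```

An index of -3 is clipped to 0 and an index of 7 into a dimension of size 5 is clipped to 4, so the load never leaves the buffer.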
*	Merge together the CFG/ML function paths in the CSE pass. I did a first pass	Chris Lattner	2019-03-29	1	-126/+113
\| \| \| \| \| \| \| \| \|	on this to merge together the classes, but there may be other simplification possible. I'll leave that to riverriddle@ as future work. This is step 29/n towards merging instructions and statements. PiperOrigin-RevId: 227328680
*	Update and generalize various passes to work on both CFG and ML functions,	Chris Lattner	2019-03-29	13	-122/+75
simplifying them in minor ways. The only significant cleanup here is the constant folding pass. All the other changes are simple and easy, but this is still enough to shrink the compiler by 45LOC. The one pass left to merge is the CSE pass, which will be more involved, so I'm splitting it out to its own patch (which I'll tackle right after this). This is step 28/n towards merging instructions and statements. PiperOrigin-RevId: 227328115
*	Simplify the remapFunctionAttrs logic, merging CFG/ML function handling.	Chris Lattner	2019-03-29	1	-30/+3
Remove an unnecessary restriction in forward substitution. Slightly simplify LLVM IR lowering, which previously would crash if given an ML function; it should now produce a clean error if given a function with an if/for instruction in it, just like it does for any other unsupported op. This is step 27/n towards merging instructions and statements. PiperOrigin-RevId: 227324542
*	Simplify GreedyPatternRewriteDriver now that functions are merged into one	Chris Lattner	2019-03-29	1	-116/+50
representation, shrinking by 70LOC. The PatternRewriter class can probably be simplified as well, but one step at a time. This is step 26/n towards merging instructions and statements. NFC. PiperOrigin-RevId: 227324218
*	Introduce PostDominanceInfo, fix properlyDominates() for Instructions	Uday Bondhugula	2019-03-29	4	-7/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- introduce PostDominanceInfo in the right/complete way and use that for post dominance check in store-load forwarding - replace all uses of Analysis/Utils::dominates/properlyDominates with DominanceInfo::dominates/properlyDominates - drop all redundant copies of dominance methods in Analysis/Utils/ - in pipeline-data-transfer, replace dominates call with a much less expensive check; similarly, substitute dominates() in checkMemRefAccessDependence with a simpler check suitable for that context - fix a bug in properlyDominates - improve doc for 'for' instruction 'body' PiperOrigin-RevId: 227320507
*	Greatly simplify the ConvertToCFG pass, converting it from a module pass to a	Chris Lattner	2019-03-29	2	-636/+384
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	function pass, and eliminating the need to copy over code and do interprocedural updates. While here, also improve it to make fewer empty blocks, and rename it to "LowerIfAndFor" since that is what it does. This is a net reduction of ~170 lines of code. As drive-bys, change the splitBlock method to not insert an unconditional branch, since that behavior is annoying for all clients. Also improve the AsmPrinter to not crash when a block is referenced that isn't linked into a function. PiperOrigin-RevId: 227308856
*	Fix ASAN failure in memref-dataflow-opt	Uday Bondhugula	2019-03-29	1	-3/+3
\| \| \| \| \| \|	- memrefsToErase had duplicates inserted into it; switch to SmallPtrSet. PiperOrigin-RevId: 227299306
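The failure mode is the usual one: a pointer recorded twice in a list gets erased twice. A minimal sketch of the set-based fix, using `std::unordered_set` as a stand-in for `llvm::SmallPtrSet` so the example has no LLVM dependency; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Stand-in for an allocation that must be erased exactly once.
struct MemRefAlloc {
  int id;
};

// Erases every distinct allocation in `candidates` and returns how many were
// erased. Collecting into a set first means a pointer that was recorded more
// than once is still deleted only once.
std::size_t eraseEachOnce(const std::vector<MemRefAlloc *> &candidates) {
  std::unordered_set<MemRefAlloc *> toErase;
  for (MemRefAlloc *alloc : candidates) {
    assert(alloc != nullptr && "candidates must be live allocations");
    toErase.insert(alloc);  // duplicates are ignored here
  }
  for (MemRefAlloc *alloc : toErase)
    delete alloc;
  return toErase.size();
}
```

Passing the same pointer twice, as in `{a, b, a}`, results in two deletions rather than three, which is the double free that ASAN would otherwise report.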
*	Introduce memref store to load forwarding - a simple memref dataflow analysis	Uday Bondhugula	2019-03-29	3	-52/+247
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- the load/store forwarding relies on memref dependence routines as well as SSA/dominance to identify the memref store instance uniquely supplying a value to a memref load, and replaces the result of that load with the value being stored. The memref is also deleted when possible if only stores remain. - add methods for post dominance for MLFunction blocks. - remove duplicated getLoopDepth/getNestingDepth - move getNestingDepth, getMemRefAccess, getNumCommonSurroundingLoops into Analysis/Utils (were earlier static) - add a helper method in FlatAffineConstraints - isRangeOneToOne. PiperOrigin-RevId: 227252907
*	Extend InstVisitor and Walker to handle arbitrary CFG functions, expand the	Chris Lattner	2019-03-29	9	-27/+35
\| \| \| \| \| \| \| \| \| \| \|	Function::walk functionality into f->walkInsts/Ops which allows visiting all instructions, not just ops. Eliminate Function::getBody() and Function::getReturn() helpers which crash in CFG functions, and were only kept around as a bridge. This is step 25/n towards merging instructions and statements. PiperOrigin-RevId: 227243966
*	Tidy up references to "basic blocks" that should refer to blocks now. NFC.	Chris Lattner	2019-03-29	1	-2/+2
\| \| \| \|	PiperOrigin-RevId: 227196077
*	Standardize naming of statements -> instructions, revisiting the code base to be	Chris Lattner	2019-03-29	18	-854/+855
\| \| \| \| \| \| \| \| \|	consistent and moving the using declarations over. Hopefully this is the last truly massive patch in this refactoring. This is step 21/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227178245
*	Rename BasicBlock and StmtBlock to Block, and make a pass cleaning it up. I ↵	Chris Lattner	2019-03-29	13	-58/+59
did not make an effort to rename all of the 'bb' names in the codebase, since they are still correct and any specific missed ones can be fixed up on demand. The last major renaming is Statement -> Instruction, which is why Statement and Stmt still appear in various places. This is step 19/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227163082
*	Eliminate the using decls for MLFunction and CFGFunction standardizing on	Chris Lattner	2019-03-29	20	-106/+104
\| \| \| \| \| \| \| \|	Function. This is step 18/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227139399
*	Rename BBArgument -> BlockArgument, Op::getOperation -> Op::getInst(),	Chris Lattner	2019-03-29	7	-22/+21
\| \| \| \| \| \| \| \|	StmtResult -> InstResult, StmtOperand -> InstOperand, and remove the old names. This is step 17/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227121537
*	Merge Operation into OperationInst and standardize nomenclature around	Chris Lattner	2019-03-29	17	-165/+166
\| \| \| \| \| \| \| \|	OperationInst. This is a big mechanical patch. This is step 16/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227093712
*	Rework inheritance hierarchy: Operation now derives from Statement, and	Chris Lattner	2019-03-29	1	-13/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	OperationInst derives from it. This allows eliminating some forwarding functions, other complex code handling multiple paths, and the 'isStatement' bit tracked by Operation. This is the last patch I think I can make before the big mechanical change merging Operation into OperationInst, coming next. This is step 15/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227077411
*	Minor renamings: Trim the "Stmt" prefix off	Chris Lattner	2019-03-29	2	-6/+6
\| \| \| \| \| \| \| \| \|	StmtSuccessorIterator/StmtSuccessorIterator, and rename and move the CFGFunctionViewGraph pass to ViewFunctionGraph. This is step 13/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227069438
*	Merge CFGFuncBuilder/MLFuncBuilder/FuncBuilder together into a single new	Chris Lattner	2019-03-29	12	-67/+66
FuncBuilder class. Also rename SSAValue.cpp to Value.cpp. This is step 12/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227067644
*	Merge SSAValue, CFGValue, and MLValue together into a single Value class, which	Chris Lattner	2019-03-29	14	-229/+202
\| \| \| \| \| \| \| \| \|	is the new base of the SSA value hierarchy. This CL also standardizes all the nomenclature and comments to use 'Value' where appropriate. This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing. This is step 11/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227064624
*	Eliminate the Instruction, BasicBlock, CFGFunction, MLFunction, and ↵	Chris Lattner	2019-03-29	7	-44/+43
\| \| \| \| \| \| \| \| \| \| \| \|	ExtFunction classes, using the Statement/StmtBlock hierarchy and Function instead. This only changes the internal data structures, it does not affect the user visible syntax or structure of MLIR code. Function gets new "isCFG()" sorts of predicates as a transitional measure. This patch is gross in a number of ways, largely in an effort to reduce the amount of mechanical churn in one go. It introduces a bunch of using decls to keep the old names alive for now, and a bunch of stuff needs to be renamed. This is step 10/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227044402
*	Rename findFunction from the ML side of the house to be named getFunction(),	Chris Lattner	2019-03-29	2	-2/+2
making it more similar to the CFG side of things. It is true that in a deeply nested case this is not a guaranteed O(1) time operation, and that 'get' could lead compiler hackers to think this is cheap, but we need to merge these and we can look into solutions for this in the future if it becomes a problem in practice. This is step 9/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 226983931
*	Rename CFGFunctionGraphTraits.h -> FunctionGraphTraits.h and add	Chris Lattner	2019-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \|	graph specializations for doing CFG traversals of ML Functions, making the two sorts of functions have the same capabilities. This is step 8/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 226968502
*	SuperVectorization: fix 'isa' assertion	Alex Zinenko	2019-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Supervectorization uses null pointers to SSA values as a means of communicating the failure to vectorize. In operation vectorization, all operations producing the values of operation arguments must be vectorized for the given operation to be vectorized. The existing check verified if any of the value "def" statements was vectorized instead, sometimes leading to assertions inside `isa` called on a null pointer. Fix this to check that all "def" statements were vectorized. PiperOrigin-RevId: 226941552
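The difference between the old and the fixed check is the difference between "any" and "all". A small C++ sketch, with a hypothetical `Value` struct and helper names, where a null pointer plays the role of "this operand could not be vectorized":

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for an SSA value; a null pointer means "failed to vectorize".
struct Value {
  bool isVector;
};

// Mimics an `isa`-style query, which asserts when handed a null pointer.
bool isVectorValue(const Value *v) {
  assert(v != nullptr && "isa-style check called on a null pointer");
  return v->isVector;
}

// The old, too-weak condition: true as soon as one operand was vectorized.
bool anyOperandVectorized(const std::vector<const Value *> &operands) {
  return std::any_of(operands.begin(), operands.end(),
                     [](const Value *v) { return v != nullptr; });
}

// The fixed condition: every operand must have been vectorized.
bool allOperandsVectorized(const std::vector<const Value *> &operands) {
  return std::all_of(operands.begin(), operands.end(),
                     [](const Value *v) { return v != nullptr; });
}

// Only queries the operands once it is known that none of them is null.
bool canVectorizeOp(const std::vector<const Value *> &operands) {
  if (!allOperandsVectorized(operands))
    return false;
  return std::all_of(operands.begin(), operands.end(), isVectorValue);
}
```

With operands `{&v, nullptr}` the "any" check passes and a later `isVectorValue` call would hit the assertion, while the "all" check rejects the operation before any null pointer is inspected.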
*	Rename convenience methods to make type explicit.	Jacques Pienaar	2019-03-29	1	-1/+1
\| \| \| \|	PiperOrigin-RevId: 226939383