These methods allow replacing all uses of an operation's results either with another operation that has the same number of results, or with a range of values. This removes a number of hand-rolled result-replacement loops and simplifies replacement for operations with multiple results.
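As a rough sketch of the intended usage (the surrounding helper is hypothetical; only the replace-all-results call reflects this description):
```
#include "mlir/IR/Operation.h"
using namespace mlir;

// Hypothetical helper: replace every result of `op` with the results of
// `newOp` in one call, assuming both have the same number of results.
static void replaceAllResults(Operation *op, Operation *newOp) {
  // Previously a hand-rolled loop:
  //   for (unsigned i = 0, e = op->getNumResults(); i != e; ++i)
  //     op->getResult(i)->replaceAllUsesWith(newOp->getResult(i));
  op->replaceAllUsesWith(newOp);
}
```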
PiperOrigin-RevId: 262206600

supporting opaque types, and providing ODS support for matching them.
PiperOrigin-RevId: 262183028

affine.load/affine.store/std.load/std.store
Verification complained when using zero-dimensional memrefs in
affine.load, affine.store, std.load and std.store. This PR extends
verification so that those memrefs can be used.
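For illustration, loads and stores on a zero-dimensional memref now verify (a minimal sketch in the syntax of the time):
```
// A zero-dimensional memref holds a single element and is indexed with
// an empty subscript list.
func @zero_dim(%buf: memref<f32>) {
  %0 = affine.load %buf[] : memref<f32>
  affine.store %0, %buf[] : memref<f32>
  return
}
```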
Closes tensorflow/mlir#58
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/58 from dcaballe:dcaballe/zero-dim 49bcdcd45c52c48beca776431328e5ce551dfa9e
PiperOrigin-RevId: 262164916

PiperOrigin-RevId: 261962104

via GreedyPatternRewriteDriver::replaceOp.
This fixes a bug where ops inside the parent op are visited even though the parent op has been removed.
PiperOrigin-RevId: 261953580

PiperOrigin-RevId: 261944712

This CL extends the Linalg GenericOp with an alternative way of specifying the body of the computation, based on a single-block region. The "fun" attribute becomes optional.
Either a SymbolRef "fun" attribute or a single-block region must be specified to describe the side-effect-free computation. Upon lowering to loops, the new region body is inlined into the innermost loop.
The parser, verifier and pretty printer are extended.
Appropriate roundtrip, negative and lowering to loop tests are added.
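A hypothetical sketch of the region-based form (assuming a linalg.yield terminator; the exact syntax may differ from the actual CL):
```
// The body is given inline as a single-block region instead of via the
// "fun" symbol; block arguments are the scalar elements of the operands.
linalg.generic #matmul_trait %A, %B, %C {
  ^bb0(%a: f32, %b: f32, %c: f32):
    %d = mulf %a, %b : f32
    %e = addf %c, %d : f32
    linalg.yield %e : f32
}: !linalg.view<?x?xf32>, !linalg.view<?x?xf32>, !linalg.view<?x?xf32>
```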
PiperOrigin-RevId: 261895568

This CL modifies the LowerLinalgToLoopsPass to use RewritePattern.
This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL.
PiperOrigin-RevId: 261894120

Many LLVM transformations benefit from knowing the target. This enables optimizations,
especially in a JIT context where the target is (generally) well-known.
Closes tensorflow/mlir#49
PiperOrigin-RevId: 261840617

This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. It also allows all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.
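For example, population code now reads roughly as follows (the two patterns are hypothetical, declared only to make the sketch self-contained):
```
#include "mlir/IR/PatternMatch.h"
using namespace mlir;

// Hypothetical patterns matching a fictional "my.foo" operation.
struct FooToBar : public RewritePattern {
  FooToBar(MLIRContext *ctx) : RewritePattern("my.foo", /*benefit=*/1, ctx) {}
};
struct EraseDeadFoo : public RewritePattern {
  EraseDeadFoo(MLIRContext *ctx) : RewritePattern("my.foo", /*benefit=*/2, ctx) {}
};

void collectMyPatterns(OwningRewritePatternList &patterns, MLIRContext *ctx) {
  // 'insert' forwards its arguments to each pattern's constructor,
  // replacing the old push_back/RewriteListBuilder idiom.
  patterns.insert<FooToBar, EraseDeadFoo>(ctx);
}
```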
PiperOrigin-RevId: 261816316

This op is not useful.
PiperOrigin-RevId: 261665736

This trait provides the ensureTerminator() utility function and
the checks to make sure a spv.module is indeed terminated with
spv._module_end.
PiperOrigin-RevId: 261664153

Like all LLVM dialect operations, llvm.func needs a custom syntax. Use the generic FunctionLike printer and parser to implement it.
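A sketch of what the custom form might look like, using the !llvm<"..."> type notation that appears elsewhere in this log (details are assumptions):
```
llvm.func @add(%arg0: !llvm<"i64">, %arg1: !llvm<"i64">) -> !llvm<"i64"> {
  %0 = llvm.add %arg0, %arg1 : !llvm<"i64">
  llvm.return %0 : !llvm<"i64">
}
```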
PiperOrigin-RevId: 261641755

The includes related to the LLVM dialect are not used in this file and
introduce an implicit dependency between the two libraries that isn't
reflected in the CMakeLists.txt, causing non-deterministic build failures.
PiperOrigin-RevId: 261576935

LLVM r367686 changed the locking scheme to avoid potential deadlocks, which altered
the llvm::orc::ThreadSafeModule APIs that ExecutionEngine was relying upon,
breaking the MLIR build. Update our use of ThreadSafeModule to unbreak the
build.
PiperOrigin-RevId: 261566571

When inlining the declaration of llvm::DenseSet/DenseMap into the mlir
namespace from a forward declaration, clang does not pick up the default
template arguments if they are declared later:
```
namespace llvm {
template <typename Foo>
class DenseMap;
}
namespace mlir {
using llvm::DenseMap;
}
namespace llvm {
template <typename Foo = int>
class DenseMap {};
}
namespace mlir {
DenseMap<> map; // error: clang does not see the default argument here
}
```
PiperOrigin-RevId: 261495612

This CL introduces a linalg.generic op to represent generic tensor contraction operations on views.
A linalg.generic operation requires a number of attributes that are sufficient to emit the computation in scalar form as well as to compute the appropriate subviews to enable tiling and fusion.
These attributes are very similar to the attributes of existing operations such as linalg.matmul, and existing operations can be implemented with the generic form.
In the future, most existing operations can be implemented using the generic form.
This CL starts by splitting out most of the functionality of the linalg::NInputsAndOutputs trait into a ViewTrait that queries the per-instance properties of the op. This allows using the attribute information.
This exposes an ordering-of-verifiers issue: ViewTrait::verify uses attributes, but the verifiers for those attributes have not yet been run. The desired behavior would be for the verifiers of the attributes specified in the builder to execute first, but that is not the case at the moment. As a consequence, to emit proper error messages and avoid crashing, some of the linalg.generic methods are defensive, as such:
```
unsigned getNumInputs() {
  // This is redundant with the `n_views` attribute verifier, but the ordering
  // of verifiers may exhibit cases where we crash instead of emitting an error
  // message.
  if (!getAttr("n_views") || n_views().getValue().size() != 2)
    return 0;
  ...
}
```
In pretty-printed form, the specific attributes required for linalg.generic are factored out into an independent dictionary named "_". When parsing, its content is flattened and the "_" name is dropped. This allows using aliasing to reduce boilerplate at each linalg.generic invocation while benefiting from the TableGen'd verifier form for each named attribute in the dictionary.
For instance, implementing linalg.matmul in terms of linalg.generic resembles:
```
func @mac(%a: f32, %b: f32, %c: f32) -> f32 {
%d = mulf %a, %b: f32
%e = addf %c, %d: f32
return %e: f32
}
#matmul_accesses = [
(m, n, k) -> (m, k),
(m, n, k) -> (k, n),
(m, n, k) -> (m, n)
]
#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",
fun = @mac,
indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",
n_views = [2, 1],
n_loop_types = [2, 1, 0]
}
```
And can be used in multiple places as:
```
linalg.generic #matmul_trait %A, %B, %C [other-attributes] :
!linalg.view<?x?xf32>, !linalg.view<?x?xf32>, !linalg.view<?x?xf32>
```
In the future it would be great to have a mechanism to alias / register a new
linalg.op as a pair of linalg.generic, #trait.
Also, note that one could theoretically specify only the `doc` string and parse all the attributes from it.
PiperOrigin-RevId: 261338740

PiperOrigin-RevId: 261325481

The AffineDataCopyGeneration pass relied on command-line flags for internal logic
in several places, which made it unusable in a library context (i.e., outside a
standalone mlir-opt binary that does the command-line parsing). Define the
configuration in the constructor instead, and default it to the command-line
values to maintain the original behavior.
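A hypothetical construction sketch (the factory name and parameter list are assumptions, not taken from the CL):
```
#include "mlir/Transforms/Passes.h"
using namespace mlir;

// Configure the pass explicitly instead of through global command-line
// flags; the values here mirror the command-line defaults.
std::unique_ptr<FunctionPassBase> pass =
    createAffineDataCopyGenerationPass(/*slowMemorySpace=*/0,
                                       /*fastMemorySpace=*/2);
```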
PiperOrigin-RevId: 261322364

This fixes the MLIR build on macOS when built within TensorFlow.
PiperOrigin-RevId: 261223250

Explicit copying to contiguous buffers is a standard technique to avoid
conflict misses and TLB misses, and improve hardware prefetching
performance. When done in conjunction with cache tiling, it nearly
eliminates all cache conflict and TLB misses, and a single hardware
prefetch stream is needed per data tile.
- generalize/extend DMA generation pass (renamed data copying pass) to
perform either point-wise explicit copies to fast memory buffers or
DMAs (depending on a cmd line option). All logic is the same as
erstwhile -dma-generate.
- -affine-dma-generate is now renamed -affine-data-copy; when the -dma flag is
provided, DMAs are generated; otherwise, explicit (point-wise) copy loops are
generated by default.
- point-wise copying could be used for CPUs (or GPUs); some indicative
performance numbers for a "C" version of the MLIR code compiled with
and without this optimization (about a 2x improvement here).
With a matmul on 4096^2 matrices on a single core of an Intel Skylake
Core i7-8700K with clang 8.0.0:
clang -O3: 518s
clang -O3 with MLIR tiling (128x128): 24.5s
clang -O3 with MLIR tiling + data copying: 12.4s
(code equivalent to test/Transforms/data-copy.mlir func @matmul)
- fix some misleading comments.
- change default fast-mem space to 0 (more intuitive now with the
default copy generation using point-wise copies instead of DMAs)
On a simple 3-d matmul loop nest, code generated with -affine-data-copy:
```
affine.for %arg3 = 0 to 4096 step 128 {
affine.for %arg4 = 0 to 4096 step 128 {
%0 = affine.apply #map0(%arg3, %arg4)
%1 = affine.apply #map1(%arg3, %arg4)
%2 = alloc() : memref<128x128xf32, 2>
// Copy-in Out matrix.
affine.for %arg5 = 0 to 128 {
%5 = affine.apply #map2(%arg3, %arg5)
affine.for %arg6 = 0 to 128 {
%6 = affine.apply #map2(%arg4, %arg6)
%7 = load %arg2[%5, %6] : memref<4096x4096xf32>
affine.store %7, %2[%arg5, %arg6] : memref<128x128xf32, 2>
}
}
affine.for %arg5 = 0 to 4096 step 128 {
%5 = affine.apply #map0(%arg3, %arg5)
%6 = affine.apply #map1(%arg3, %arg5)
%7 = alloc() : memref<128x128xf32, 2>
// Copy-in LHS.
affine.for %arg6 = 0 to 128 {
%11 = affine.apply #map2(%arg3, %arg6)
affine.for %arg7 = 0 to 128 {
%12 = affine.apply #map2(%arg5, %arg7)
%13 = load %arg0[%11, %12] : memref<4096x4096xf32>
affine.store %13, %7[%arg6, %arg7] : memref<128x128xf32, 2>
}
}
%8 = affine.apply #map0(%arg5, %arg4)
%9 = affine.apply #map1(%arg5, %arg4)
%10 = alloc() : memref<128x128xf32, 2>
// Copy-in RHS.
affine.for %arg6 = 0 to 128 {
%11 = affine.apply #map2(%arg5, %arg6)
affine.for %arg7 = 0 to 128 {
%12 = affine.apply #map2(%arg4, %arg7)
%13 = load %arg1[%11, %12] : memref<4096x4096xf32>
affine.store %13, %10[%arg6, %arg7] : memref<128x128xf32, 2>
}
}
// Compute.
affine.for %arg6 = #map7(%arg3) to #map8(%arg3) {
affine.for %arg7 = #map7(%arg4) to #map8(%arg4) {
affine.for %arg8 = #map7(%arg5) to #map8(%arg5) {
%11 = affine.load %7[-%arg3 + %arg6, -%arg5 + %arg8] : memref<128x128xf32, 2>
%12 = affine.load %10[-%arg5 + %arg8, -%arg4 + %arg7] : memref<128x128xf32, 2>
%13 = affine.load %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
%14 = mulf %11, %12 : f32
%15 = addf %13, %14 : f32
affine.store %15, %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
}
}
}
dealloc %10 : memref<128x128xf32, 2>
dealloc %7 : memref<128x128xf32, 2>
}
%3 = affine.apply #map0(%arg3, %arg4)
%4 = affine.apply #map1(%arg3, %arg4)
// Copy out result matrix.
affine.for %arg5 = 0 to 128 {
%5 = affine.apply #map2(%arg3, %arg5)
affine.for %arg6 = 0 to 128 {
%6 = affine.apply #map2(%arg4, %arg6)
%7 = affine.load %2[%arg5, %arg6] : memref<128x128xf32, 2>
store %7, %arg2[%5, %6] : memref<4096x4096xf32>
}
}
dealloc %2 : memref<128x128xf32, 2>
}
}
```
With -affine-data-copy -dma:
```
affine.for %arg3 = 0 to 4096 step 128 {
%0 = affine.apply #map3(%arg3)
%1 = alloc() : memref<128xf32, 2>
%2 = alloc() : memref<1xi32>
affine.dma_start %arg2[%arg3], %1[%c0], %2[%c0], %c128_0 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
affine.dma_wait %2[%c0], %c128_0 : memref<1xi32>
%3 = alloc() : memref<1xi32>
affine.for %arg4 = 0 to 4096 step 128 {
%5 = affine.apply #map0(%arg3, %arg4)
%6 = affine.apply #map1(%arg3, %arg4)
%7 = alloc() : memref<128x128xf32, 2>
%8 = alloc() : memref<1xi32>
affine.dma_start %arg0[%arg3, %arg4], %7[%c0, %c0], %8[%c0], %c16384, %c4096, %c128_2 : memref<4096x4096xf32>, memref<128x128xf32, 2>, memref<1xi32>
affine.dma_wait %8[%c0], %c16384 : memref<1xi32>
%9 = affine.apply #map3(%arg4)
%10 = alloc() : memref<128xf32, 2>
%11 = alloc() : memref<1xi32>
affine.dma_start %arg1[%arg4], %10[%c0], %11[%c0], %c128_1 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
affine.dma_wait %11[%c0], %c128_1 : memref<1xi32>
affine.for %arg5 = #map3(%arg3) to #map5(%arg3) {
affine.for %arg6 = #map3(%arg4) to #map5(%arg4) {
%12 = affine.load %7[-%arg3 + %arg5, -%arg4 + %arg6] : memref<128x128xf32, 2>
%13 = affine.load %10[-%arg4 + %arg6] : memref<128xf32, 2>
%14 = affine.load %1[-%arg3 + %arg5] : memref<128xf32, 2>
%15 = mulf %12, %13 : f32
%16 = addf %14, %15 : f32
affine.store %16, %1[-%arg3 + %arg5] : memref<128xf32, 2>
}
}
dealloc %11 : memref<1xi32>
dealloc %10 : memref<128xf32, 2>
dealloc %8 : memref<1xi32>
dealloc %7 : memref<128x128xf32, 2>
}
%4 = affine.apply #map3(%arg3)
affine.dma_start %1[%c0], %arg2[%arg3], %3[%c0], %c128 : memref<128xf32, 2>, memref<4096xf32>, memref<1xi32>
affine.dma_wait %3[%c0], %c128 : memref<1xi32>
dealloc %3 : memref<1xi32>
dealloc %2 : memref<1xi32>
dealloc %1 : memref<128xf32, 2>
}
```
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#50
PiperOrigin-RevId: 261221903

PiperOrigin-RevId: 261195069

This CL extends the existing spv.constant op to also support
specialization constants by adding an extra unit attribute on it.
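A hypothetical before/after sketch (the exact keyword and attribute spelling are assumptions):
```
// Regular constant.
%0 = spv.constant 1.0 : f32
// Specialization constant; the extra unit attribute is surfaced in the
// custom syntax, shown here as a `spec` keyword.
%1 = spv.constant spec 1.0 : f32
```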
PiperOrigin-RevId: 261194869

Add binary logical operations per spec section 3.32.15:
OpIEqual, OpINotEqual, OpUGreaterThan, OpSGreaterThan,
OpUGreaterThanEqual, OpSGreaterThanEqual, OpULessThan, OpSLessThan,
OpULessThanEqual, OpSLessThanEqual.
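For instance, two of the new comparisons might be written as follows (result form assumed by analogy with other spv dialect ops):
```
// Integer comparisons produce boolean values.
%0 = spv.IEqual %a, %b : i32
%1 = spv.SLessThan %a, %b : i32
```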
Closes tensorflow/mlir#61
PiperOrigin-RevId: 261181281

verifyUnusedValue is a bit strange given that it is specified in a
result pattern but used to generate match statements. Now that we are
able to support multi-result ops better, we can retire it and replace
it with a HasNoUseOf constraint. This reduces the number of mechanisms.
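A hedged sketch of the replacement mechanism (op names are illustrative; `$res__2` refers to the third result of the op bound to `$res`):
```tblgen
// Replace the first two results and require the third to be unused,
// instead of emitting verifyUnusedValue from the result pattern.
def : Pattern<(ThreeResultOp:$res $input),
              [(OneResultOp $input), (OneResultOp $input)],
              [(HasNoUseOf:$res__2)]>;
```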
PiperOrigin-RevId: 261166863

We allow generating more ops than are needed for replacing
the matched root op. Only the last N static values generated are
used as replacements; the others serve as auxiliary ops/values for
building the replacement.
With the introduction of multi-result op support, an op, if used
as a whole, may be used to replace multiple static values of
the matched root op. We need to consider this when calculating
the result range a generated op is to replace.
For example, we can have the following patterns:
```tblgen
// Three ops to replace all three results
def : Pattern<(ThreeResultOp ...),
              [(OneResultOp ...), (OneResultOp ...), (OneResultOp ...)]>;
// Two ops to replace all three results
def : Pattern<(ThreeResultOp ...),
              [(TwoResultOp ...), (OneResultOp ...)]>;
// One op to replace all three results
def : Pat<(ThreeResultOp ...), (ThreeResultOp ...)>;
// One op to replace all three results, plus an auxiliary op
def : Pattern<(ThreeResultOp ...),
              [(AuxiliaryOp ...), (ThreeResultOp ...)]>;
```
PiperOrigin-RevId: 261017235

Extend the recently introduced support for hexadecimal float literals to tensor
literals, which may also contain special floating point values such as
infinities and NaNs.
Modify TensorLiteralParser to store the list of tokens representing values
until the type is parsed instead of trying to guess the tensor element type
from the token kinds (hexadecimal values can be either integers or floats, and
can be mixed with both). Maintain the error reports as close as possible to
the existing implementation to avoid disturbing the tests. They can be
improved in a separate clean-up if deemed necessary.
PiperOrigin-RevId: 260794716

All non-argument attributes specified for an operation are treated as
decorations on the result value and (de)serialized using the OpDecorate
instruction. An error is generated if an attribute is not an argument and
its name doesn't correspond to a Decoration enum value. The names of the
attributes that represent decorations are the snake-case-ified version of
the Decoration name.
Add utility methods to convert to snake-case and camel-case.
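A minimal sketch of such a conversion (behavior assumed from this description; the actual utility may differ):
```
#include <cctype>
#include <string>

// Convert a camel-case Decoration name like "RelaxedPrecision" into the
// attribute spelling "relaxed_precision".
std::string convertToSnakeCase(const std::string &input) {
  std::string result;
  for (unsigned char c : input) {
    if (std::isupper(c)) {
      if (!result.empty())
        result.push_back('_');
      result.push_back(static_cast<char>(std::tolower(c)));
    } else {
      result.push_back(static_cast<char>(c));
    }
  }
  return result;
}
```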
PiperOrigin-RevId: 260792638

MLIR does not have support for parsing special floating point values such as
infinities and NaNs. If programmatically constructed, these values are printed
as NaN and (+-)Inf and cannot be parsed back. Add parser support for
hexadecimal literals in float attributes, following LLVM IR. The literal
corresponds to the in-memory representation of the floating point value.
IEEE 754 defines a range of possible values for NaNs, storing the bitwise
representation allows MLIR to properly roundtrip NaNs with different bit values
of significands.
The initial version of this commit was missing support for float literals that
used to be printed in decimal notation as a fallback, but ended up being
printed in the hexadecimal format that became the fallback for special values.
The decimal fallback behavior was not exercised by tests. It is currently
reinstated and tested by the newly added test @f32_potential_precision_loss in
parser.mlir.
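For example (bit patterns are for f32; the hexadecimal literal is the in-memory representation described above):
```
%inf = constant 0x7F800000 : f32  // +infinity
%nan = constant 0x7FC00000 : f32  // one of the quiet NaNs
```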
PiperOrigin-RevId: 260790900

This CL adds an initial implementation for translating a kernel
function in the GPU dialect (used with a gpu.launch_kernel op) to a
spv.module. The original function is translated into an entry
function.
Most of the heavy lifting is done by adding TypeConversion and other
utility functions/classes that provide most of the functionality to
translate from Standard Dialect to SPIR-V Dialect. These are intended
to be reusable in implementation of different dialect conversion
pipelines.
Note: Some of the files have been renamed to be consistent with
the convention used by the other Conversion frameworks.
PiperOrigin-RevId: 260759165

We are relying on the serializer to construct positive cases to drive
the test for the deserializer. This leaves negative cases untested.
This CL adds a basic test fixture for covering the negative
corner cases to enforce a more robust deserializer.
Refactored common SPIR-V building methods out of the serializer to
share them with the deserialization test.
PiperOrigin-RevId: 260742733

PiperOrigin-RevId: 260585594

operation (NFC)
PiperOrigin-RevId: 260532592

Reported by clang-6.
PiperOrigin-RevId: 260311814

dimension or symbol identifiers.
PiperOrigin-RevId: 260197567

It's quite common that we want to put further constraints on the matched
multi-result op's specific results. This CL enables referencing symbols
bound to the source op with the `__N` syntax.
PiperOrigin-RevId: 260122401

In the backward slice computation, BlockArguments coming from function arguments represent a natural boundary for the traversal and should not trigger llvm_unreachable.
This CL also improves the error message and adds a relevant test.
PiperOrigin-RevId: 260118630

Clipping creates non-affine memory accesses, so use std_load and std_store instead of affine_load and affine_store.
In the future we may also want a fill with the neutral element rather than clip; this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to point-wise copies.
PiperOrigin-RevId: 260110503

PiperOrigin-RevId: 260037115

AccessChainOp creates a pointer into a composite object that can be used with
OpLoad and OpStore.
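A hypothetical usage sketch (the composite type and indices are illustrative; the exact syntax may differ):
```
// %base points to a struct holding an f32 and an array of 4 f32s; the
// access chain selects element %i of the array member (%one selects the
// array field itself).
%ptr = spv.AccessChain %base[%one, %i] : !spv.ptr<!spv.struct<f32, !spv.array<4 x f32>>, Function>
```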
Closes tensorflow/mlir#52
PiperOrigin-RevId: 260035676

Function-like operations are likely to have similar custom syntax, in
particular they all need to print function signature with argument attributes.
Transform function printer and parser so that they can be applied to any
operation with the FunctionLike trait. Move them to the trait itself. To
avoid large member functions in the class template, define a concrete base
class for the trait and implement common functionality in it. This allows
printer and parser to be implemented in a source file without templating.
PiperOrigin-RevId: 260020893

MLIR does not have support for parsing special floating point values such as
infinities and NaNs. If programmatically constructed, these values are printed
as NaN and (+-)Inf and cannot be parsed back. Add parser support for
hexadecimal literals in float attributes, following LLVM IR. The literal
corresponds to the in-memory representation of the floating point value.
IEEE 754 defines a range of possible values for NaNs, storing the bitwise
representation allows MLIR to properly roundtrip NaNs with different bit values
of significands.
PiperOrigin-RevId: 260018802

This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it.
The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary.
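A hedged usage sketch (the entry-point name and signature are assumed from this description, not verified):
```
// Determine which ops could be legalized to `target` without performing
// any rewrites.
llvm::DenseSet<Operation *> legalizable;
if (failed(applyAnalysisConversion(module, target, std::move(patterns),
                                   legalizable)))
  llvm::errs() << "analysis conversion failed\n";
// `legalizable` can now drive, e.g., device-partitioning decisions.
```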
PiperOrigin-RevId: 259987105

This CL fixes an oversight in dealing with loops in slicing analysis.
The forward slice computation properly propagates through loops, but the backward slice computation did not.
Add relevant unit tests.
PiperOrigin-RevId: 259903396

Per tacit agreement, individual dialects should now live in lib/Dialect/Name
with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.
PiperOrigin-RevId: 259896851

This CL adds support for SubViewOp in the alias analysis to permit multiple Linalg fusion passes to compose. The debugging messages are also improved for better readability. The readability benefits came in handy when tracking this issue.
A 2-level fusion test is added to capture the new behavior.
PiperOrigin-RevId: 259720246

The function populateStdOpsToSPIRVPatterns appends the conversion
patterns automatically generated from StdOpsToSPIRVConversion.td to a
list of patterns.
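Typical usage would then read roughly as follows (the exact signature is an assumption):
```
OwningRewritePatternList patterns;
// Append the TableGen-generated Std->SPIR-V conversion patterns.
populateStdOpsToSPIRVPatterns(context, patterns);
```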
PiperOrigin-RevId: 259677890

Conversion from integers (window or input size, padding, etc.) to floating point is required to express many ML kernels, for example average pooling.
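If this refers to a standard-dialect cast along the lines of sitofp (an assumption based on the description), usage would look like:
```
// Signed integer to floating point, e.g. to divide a pooled sum by the
// window size in average pooling.
%divisor = sitofp %window_size : i32 to f32
```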
PiperOrigin-RevId: 259575284

The loop parallelism detection utility only collects the affine.load and
affine.store operations appearing inside the loop to analyze the access
patterns for the absence of dependences. However, any operation, including
unregistered operations, can appear in a body of an affine loop. If such
operation has side effects, the result of parallelism analysis is incorrect.
Conservatively assume affine loops are not parallel in the presence of operations
other than affine.load, affine.store, affine.for and affine.terminator, since such
operations may have side effects.
This required updating the loop-fusion unit test that relies on parallelism
analysis and was exercising loop fusion in the presence of an unregistered
operation.
PiperOrigin-RevId: 259560935

Originally, MLIR only supported functions of the built-in FunctionType. On the
conversion path to LLVM IR, we were creating MLIR functions that contained LLVM
dialect operations and used LLVM IR types for everything except top-level
functions (e.g., a second-order function would have a FunctionType that consumes
or produces a wrapped LLVM function pointer type). With MLIR functions
becoming operations, it is now possible to introduce non-built-in function
operations. This will let us use conversion patterns for function conversion,
simplify the MLIR-to-LLVM translation by removing the knowledge of the MLIR
built-in function types, and provide stronger correctness verifications (e.g.
LLVM functions only accept LLVM types).
Furthermore, we can currently construct a situation where the same function is
used with two different types: () -> () when it's specified and called directly,
and !llvm<"void ()"> when it's passed somewhere or called indirectly. Having a
special function-op that is always of !llvm<"void ()"> type makes the function
model and the llvm dialect type system more consistent.
Introduce LLVMFuncOp to represent a function in the LLVM dialect. Unlike
standard FuncOp, this function has an LLVMType wrapping an LLVM IR function
type. Generalize the common behavior of function-defining operations
(functions live in a symbol table of a module, contain a single region, are
iterable as a list of blocks, and support argument attributes).
This only defines the operation. Custom syntax, conversion and translation
rules will be added in follow-ups.
The operation name mentions LLVM explicitly to avoid confusion with standard
FuncOp, especially in multiple files that use both `mlir` and `mlir::LLVM`
namespaces.
PiperOrigin-RevId: 259550940