bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Add CMakeLists.txt for AffineToStandard conversion	Alex Zinenko	2019-11-14	1	-0/+24
\| \| \| \|	PiperOrigin-RevId: 280470142
*	Move Affine to Standard conversion to lib/Conversion	Alex Zinenko	2019-11-14	3	-1/+539
\| \| \| \| \| \| \|	This is essentially a dialect conversion and conceptually belongs to conversions. PiperOrigin-RevId: 280460034
*	Make positions of elements in MemRef descriptor private	Alex Zinenko	2019-11-14	1	-25/+17
\| \| \| \| \| \| \| \|	Previous commits removed all uses of LLVMTypeConverter::k*PosInMemRefDescriptor outside of the MemRefDescriptor class. These numbers are an implementation detail and can be hidden under a layer of more semantic APIs. PiperOrigin-RevId: 280442444
*	Use MemRefDescriptor in Vector-to-LLVM conversion	Alex Zinenko	2019-11-14	2	-38/+19
\| \| \| \| \| \| \| \| \| \|	Following up on the consolidation of MemRef descriptor conversion, update Vector-to-LLVM conversion to use the helper class that abstracts away the implementation details of the MemRef descriptor. This also makes the types of the attributes in emitted llvm.insert/extractelement operations consistently i64 instead of a mix of index and i64. PiperOrigin-RevId: 280441451
*	Adapt code to LLVM API updates.	MLIR Team	2019-11-14	1	-1/+1
\| \| \| \|	PiperOrigin-RevId: 280431812
*	Move VectorOps to Tablegen - (almost) NFC	Nicolas Vasilache	2019-11-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL moves VectorOps to Tablegen and cleans up the implementation. This is almost NFC but 2 changes occur: 1. an interface change occurs in the padding value specification in vector_transfer_read: the value becomes non-optional. As a shortcut we currently use %f0 for all paddings. This should become an OpInterface for vectorization in the future. 2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>` Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect. The op documentation is moved to the .td file. PiperOrigin-RevId: 280430869
*	Use MemRefDescriptor in Linalg-to-LLVM conversion	Alex Zinenko	2019-11-14	1	-105/+96
\| \| \| \| \| \| \| \| \| \| \|	Following up on the consolidation of MemRef descriptor conversion, update Linalg-to-LLVM conversion to use the helper class that abstracts away the implementation details of the MemRef descriptor. This required MemRefDescriptor to become publicly visible. Since this conversion is heavily EDSC-based, introduce locally an additional wrapper that uses builder and location pointed to by the EDSC context while emitting descriptor manipulation operations. PiperOrigin-RevId: 280429228
*	Concentrate memref descriptor manipulation logic in one place	Alex Zinenko	2019-11-14	1	-165/+174
\| \| \| \| \| \| \| \| \| \| \|	Memref descriptor is becoming increasingly complex. Memrefs are manipulated by multiple standard instructions, each of which has a non-trivial lowering to the LLVM dialect. This leads to verbose code that manipulates the descriptors, exposing the internals of insert/extractelement operations. Implement a wrapper class that contains a memref descriptor and provides semantically named methods that build the primitive IR operations instead. PiperOrigin-RevId: 280371225
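The idea of hiding positional fields behind semantically named methods can be sketched in plain C++. This is an illustration only: `DescriptorView`, its field layout, and `linearIndex` are hypothetical names made up for this sketch, and it is not the MLIR `MemRefDescriptor` class, which builds IR operations rather than storing integers.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a descriptor wrapper (not the MLIR class).
// Callers use named accessors; the field positions stay private.
class DescriptorView {
public:
  explicit DescriptorView(unsigned rank)
      : rank_(rank), fields_(kSizesPos + 2 * rank, std::int64_t{0}) {}

  void setOffset(std::int64_t v) { fields_[kOffsetPos] = v; }
  std::int64_t offset() const { return fields_[kOffsetPos]; }

  void setSize(unsigned dim, std::int64_t v) { fields_[sizePos(dim)] = v; }
  std::int64_t size(unsigned dim) const { return fields_[sizePos(dim)]; }

  void setStride(unsigned dim, std::int64_t v) { fields_[stridePos(dim)] = v; }
  std::int64_t stride(unsigned dim) const { return fields_[stridePos(dim)]; }

  // Linear element position: offset + sum(indices[d] * stride(d)).
  std::int64_t linearIndex(const std::vector<std::int64_t>& indices) const {
    std::int64_t result = offset();
    for (unsigned d = 0; d < rank_ && d < indices.size(); ++d)
      result += indices[d] * stride(d);
    return result;
  }

private:
  // Assumed layout for this sketch: [offset, sizes..., strides...].
  enum : unsigned { kOffsetPos = 0, kSizesPos = 1 };
  unsigned sizePos(unsigned dim) const { return kSizesPos + dim; }
  unsigned stridePos(unsigned dim) const { return kSizesPos + rank_ + dim; }

  unsigned rank_;
  std::vector<std::int64_t> fields_;
};
```

Because the positions are private, the layout could change (for example, to add another pointer field) without touching any caller.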
*	NFC: Refactor block signature conversion to not erase the original arguments.	River Riddle	2019-11-13	1	-2/+2
\| \| \| \| \| \|	This refactors the implementation of block signature(type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion. PiperOrigin-RevId: 280226936
*	Add operations needed to support lowering of AffineExpr to SPIR-V.	Mahesh Ravishankar	2019-11-12	1	-14/+72
\| \| \| \| \| \| \| \|	Lowering of CmpIOp, DivISOp, RemISOp, SubIOp and SelectOp to SPIR-V dialect enables the lowering of operations generated by AffineExpr -> StandardOps conversion into the SPIR-V dialect. PiperOrigin-RevId: 280039204
*	Make legality check in GPU->SPIR-V lowering of FuncOp kernel specific.	Mahesh Ravishankar	2019-11-12	1	-2/+7
\| \| \| \| \| \| \| \| \| \|	Existing check that sets FuncOp to be dynamically legal was just checking that the types of the argument are SPIR-V compatible. Since the current conversion from GPU to SPIR-V does not handle lowering non-kernel functions, change the legality check to verify that the FuncOp has the gpu.kernel attribute and has void(void) return type. PiperOrigin-RevId: 280032782
*	Add Conversion to lower loop::ForOp to spirv::LoopOp.	Mahesh Ravishankar	2019-11-12	1	-1/+83
\| \| \| \| \| \| \| \| \|	loop::ForOp can be lowered to the structured control flow represented by spirv::LoopOp by making the continue block of the spirv::LoopOp the loop latch and the merge block the exit block. The resulting spirv::LoopOp has a single back edge from the continue to header block, and a single exit from header to merge. PiperOrigin-RevId: 280015614
*	Add LLVM lowering of std.subview	Nicolas Vasilache	2019-11-12	1	-0/+113
\| \| \| \| \| \|	A followup CL will replace usage of linalg.subview by std.subview. PiperOrigin-RevId: 279961981
*	Add support for alignment attribute in std.alloc.	Nicolas Vasilache	2019-11-12	2	-67/+138
\| \| \| \| \| \| \| \| \| \| \| \|	This CL adds an extra pointer to the memref descriptor to allow specifying alignment. In a previous implementation, we used 2 types: `linalg.buffer` and `view` where the buffer type was the unit of allocation/deallocation/alignment and `view` was the unit of indexing. After multiple discussions it was decided to use a single type, which conflates both, so the memref descriptor now needs to carry both pointers. This is consistent with the [RFC-Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!searchin/mlir/std.view%7Csort:date/mlir/-wKHANzDNTg/4K6nUAp8AAAJ). PiperOrigin-RevId: 279959463
*	Update Linalg to use std.view	Nicolas Vasilache	2019-11-07	1	-4/+8
\| \| \| \| \| \|	Now that a view op has graduated to the std dialect, we can update Linalg to use it and remove ops that have become obsolete. As a byproduct, the linalg buffer and associated ops can also disappear. PiperOrigin-RevId: 279073591
*	Add lowering of std.view to LLVM	Nicolas Vasilache	2019-11-06	1	-0/+126
\| \| \| \| \| \| \| \| \| \| \| \| \|	This CL ports the lowering of linalg.view to the newly introduced std.view. Differences in implementation relate to std.view having slightly different semantics: 1. a static or dynamic offset can be specified. 2. the size of the (contiguous) shape is passed instead of a range. 3. static size and stride information is extracted from the memref type rather than the range. Besides these differences, lowering behaves the same. A future CL will update Linalg to use this unified infrastructure. PiperOrigin-RevId: 278948853
*	Support lowering of imperfectly nested loops into GPU dialect.	Mahesh Ravishankar	2019-11-01	2	-26/+304
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current lowering of loops to GPU only supports lowering of loop nests where the loops mapped to workgroups and workitems are perfectly nested. Here a new lowering is added to handle imperfectly nested loop bodies with the following properties: 1) The loops partitioned to workgroups are perfectly nested. 2) The loop body of the innermost loop partitioned to workgroups can contain one or more loop nests that are to be partitioned across workitems. Each individual loop nest partitioned to workitems should also be perfectly nested. 3) The number of workgroups and workitems are not deduced from the loop bounds but are passed in by the caller of the lowering as values. 4) For statements within the perfectly nested loop nest partitioned across workgroups that are not loops, it is valid to have all threads execute that statement. This is NOT verified. PiperOrigin-RevId: 277958868
*	NFC: Use #ifndef in various .td files instead of #ifdef and #else	Lei Zhang	2019-10-31	2	-12/+6
\| \| \| \| \| \| \| \| \| \|	Upstream LLVM gained support for #ifndef with https://reviews.llvm.org/D61888 This is changed mechanically via the following command: find . -name "*.td" -exec sed -i -e ':a' -e 'N' -e '$!ba' -e 's/#ifdef \([A-Z_]*\)\n#else/#ifndef \1/g' {} \; PiperOrigin-RevId: 277789427
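The mechanical rewrite described above, folding an `#ifdef NAME` line immediately followed by `#else` into a single `#ifndef NAME` line, can be sketched with `std::regex`. This is an equivalent illustration under the assumption that guard names consist of uppercase letters and underscores; `foldIfdefElse` is a hypothetical helper name, not part of the commit.

```cpp
#include <regex>
#include <string>

// Replace every "#ifdef NAME\n#else" pair with "#ifndef NAME".
// The whole file content is passed as one string so that the pattern
// can span the line break, which is what the sed loop arranges.
std::string foldIfdefElse(const std::string& text) {
  static const std::regex pattern("#ifdef ([A-Z_]*)\n#else");
  return std::regex_replace(text, pattern, "#ifndef $1");
}
```

For example, `foldIfdefElse("#ifdef FOO_BAR\n#else\ndef X;\n#endif\n")` yields `"#ifndef FOO_BAR\ndef X;\n#endif\n"`.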
*	Lookup function declaration in SymbolTable not ModuleOp.	Alexander Belyaev	2019-10-28	1	-4/+2
\| \| \| \|	PiperOrigin-RevId: 277033167
*	Fix include guards and add tests for OpToFuncCallLowering.	Alexander Belyaev	2019-10-26	1	-6/+14
\| \| \| \|	PiperOrigin-RevId: 276859463
*	Add custom lowering of ExpOp for NVVM and ROCM.	Alexander Belyaev	2019-10-24	3	-1/+111
\| \| \| \|	PiperOrigin-RevId: 276440911
*	Unify GPU op definition names with other dialects.	Christian Sigg	2019-10-21	5	-20/+22
\| \| \| \| \| \|	Rename GPU op names from gpu_Foo to GPU_FooOp. PiperOrigin-RevId: 275882232
*	Fix minor spelling tweaks (NFC)	Kazuaki Ishizaki	2019-10-20	2	-5/+5
\| \| \| \| \| \|	Closes tensorflow/mlir#177 PiperOrigin-RevId: 275692653
*	Use new eraseOp instead of replaceOp with empty values	Geoffrey Martin-Noble	2019-10-19	1	-2/+2
\| \| \| \|	PiperOrigin-RevId: 275631166
*	Get active source lane predicate from shuffle instruction.	Christian Sigg	2019-10-19	1	-18/+20
\| \| \| \| \| \| \| \|	nvvm.shfl.sync.bfly optionally returns a predicate indicating whether the source lane was active. Support for this was added to clang in https://reviews.llvm.org/D68892. Add an optional 'pred' unit attribute to the instruction to return this predicate. Specify this attribute in the partial warp reduction so we don't need to manually compute the predicate. PiperOrigin-RevId: 275616564
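How a source-lane predicate helps a partial warp reduction can be sketched with a host-side simulation. Everything here is an assumption made for illustration: `shuffleXor` is a CPU stand-in for a butterfly shuffle, lanes `0..activeLanes-1` are taken to be the active ones, and the ascending offset order is this sketch's choice rather than something stated in the commit.

```cpp
#include <utility>
#include <vector>

constexpr int kWarpSize = 32;

// CPU stand-in for a butterfly shuffle: lane `lane` reads from lane
// `lane ^ offset`. Returns the value read and a predicate telling
// whether that source lane is active (index below `activeLanes`).
std::pair<int, bool> shuffleXor(const std::vector<int>& values, int lane,
                                int offset, int activeLanes) {
  int source = lane ^ offset;
  bool active = source < activeLanes;
  return std::make_pair(active ? values[source] : 0, active);
}

// Sum the values held by the first `activeLanes` lanes. Each step reads
// the previous step's values, and a lane accumulates only when the
// predicate says the source lane was active. Lane 0 ends up with the sum.
int partialWarpSum(std::vector<int> values, int activeLanes) {
  for (int offset = 1; offset < kWarpSize; offset <<= 1) {
    std::vector<int> next = values;
    for (int lane = 0; lane < activeLanes; ++lane) {
      std::pair<int, bool> read = shuffleXor(values, lane, offset, activeLanes);
      if (read.second)
        next[lane] += read.first;
    }
    values = std::move(next);
  }
  return values[0];
}
```

In this simulation only lane 0 is guaranteed to hold the complete sum; other active lanes may miss contributions when their shuffle partner is inactive.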
*	Implement lowering of VectorTypeCastOp to LLVM	Nicolas Vasilache	2019-10-18	2	-15/+126
\| \| \| \| \| \| \| \| \| \|	A VectorTypeCastOp can only be used to lower between statically sized contiguous memrefs of scalar and matching vector type. The sizes and strides are thus fully static and easy to determine. A relevant test is added. This is a step towards solving tensorflow/mlir#189. PiperOrigin-RevId: 275538981
*	Fix OSS target name GPUtoNVVMTransforms -> MLIRGPUtoNVVMTransforms	Nicolas Vasilache	2019-10-18	1	-1/+1
\| \| \| \| \| \| \|	This unbreaks the `cmake -G Ninja ../llvm -DLLVM_BUILD_EXAMPLES=ON -DLLVM_TARGETS_TO_BUILD="host"` in my local OSS build PiperOrigin-RevId: 275452330
*	Add gpu.barrier op to synchronize invocations of a local workgroup.	Christian Sigg	2019-10-18	3	-0/+50
\| \| \| \| \| \| \| \|	Adding gen table for rewrite patterns from GPU to NVVM dialect. Copy missing op documentation from GPUOps.td to GPU.md. PiperOrigin-RevId: 275419588
*	NFC: Remove trivial builder get methods.	River Riddle	2019-10-17	1	-1/+1
\| \| \| \| \| \|	These don't add any value, and some are even more restrictive than the respective static 'get' method. PiperOrigin-RevId: 275391240
*	Makes spv.module generated by GPU->SPIRV conversion spec compliant	Mahesh Ravishankar	2019-10-16	2	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Makes the spv.module generated by the GPU to SPIR-V conversion SPIR-V spec compliant (validated using spirv-val from Vulkan tools). 1) Separate out the VulkanLayoutUtils from DecorateSPIRVCompositeTypeLayoutPass to make it reusable within the Type converter in SPIR-V lowering infrastructure. This is used to compute the layout of the !spv.struct used in global variable type description. 2) Set the capabilities of the spv.module to Shader (needed for use of Logical Memory Model, and the extensions to SPV_KHR_storage_buffer_storage_class for use of Storage Buffer) PiperOrigin-RevId: 275081486
*	Support custom accumulator provided as region to gpu.all_reduce.	Christian Sigg	2019-10-16	1	-8/+105
\| \| \| \| \| \| \| \| \| \|	In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as arbitrary code region. Adds a gpu.yield op to specify the result of the accumulation. Also support more types (integers) and accumulations (mul). PiperOrigin-RevId: 275065447
*	Allow for remapping argument to a Value in SignatureConversion.	Mahesh Ravishankar	2019-10-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current SignatureConversion framework (part of DialectConversion) allows remapping input arguments to a function from 1->0, 1->1 or 1->many arguments during conversion. Another case is where the argument itself is dropped, but its uses are remapped to another Value*. An example of this is: The Vulkan/SPIR-V spec requires entry functions to be of type void(void). The GPU -> SPIR-V conversion implemented this without having the DialectConversion framework track the remapping, which led to some undefined behavior. The changes here address that. PiperOrigin-RevId: 275059656
*	Add support for PatternRewriter::eraseOp.	River Riddle	2019-10-16	2	-3/+3
\| \| \| \| \| \|	This hook is useful when an operation is known to be dead, and no replacement values make sense. PiperOrigin-RevId: 275052756
*	Add conversion for splat of vectors of 2+D	Nicolas Vasilache	2019-10-15	1	-48/+142
\| \| \| \| \| \| \|	This CL adds a missing lowering for splat of multi-dimensional vectors. Additional support is also added to the runtime utils library to allow printing memrefs with such vectors. PiperOrigin-RevId: 274794723
*	Emit LLVM IR equivalent of sizeof when lowering alloc operations	Alex Zinenko	2019-10-11	1	-12/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, the lowering of `alloc` operations computed the number of bytes to allocate based on the properties of the MLIR type. This does not take into account type legalization that happens when compiling LLVM IR down to target assembly. This legalization can widen the type, potentially leading to out-of-bounds accesses to `alloc`ed data due to mismatches between address computation that takes the widening into account and allocation that does not. Use the LLVM IR's equivalent of `sizeof` to compute the number of bytes to be allocated: `%0 = getelementptr %type* null, %indexType 1` followed by `%1 = ptrtoint %type* %0 to %indexType`, adapted from http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt PiperOrigin-RevId: 274159900
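The mismatch between a size computed from a type's declared parts and the size the target actually uses can be illustrated with a C++ analogy. This is not the MLIR lowering: `Padded`, `strideOf`, and `naiveSizeOfPadded` are hypothetical names, and struct padding stands in for the type widening mentioned above. The element stride is measured as the byte distance from element 0 to element 1, which mirrors indexing one element past a base pointer and converting the result to an integer.

```cpp
#include <cstddef>

// A type whose in-memory size usually exceeds the sum of its fields.
struct Padded {
  char tag;
  double value;
};

// Byte distance between consecutive array elements of type T.
template <typename T>
std::size_t strideOf() {
  T elements[2];
  const char* first = reinterpret_cast<const char*>(&elements[0]);
  const char* second = reinterpret_cast<const char*>(&elements[1]);
  return static_cast<std::size_t>(second - first);
}

// Naive estimate that ignores layout decisions made by the compiler.
std::size_t naiveSizeOfPadded() { return sizeof(char) + sizeof(double); }
```

Allocating `n * naiveSizeOfPadded()` bytes and then indexing with the real stride would run past the end of the buffer on platforms where padding is inserted, which is the kind of out-of-bounds access the commit guards against.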
*	Drop obsolete code from std to llvm memref lowering	Uday Bondhugula	2019-10-11	1	-22/+11
\| \| \| \| \| \| \| \| \| \| \| \|	- dropping what looks like outdated code post some of the previous updates Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#179 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/179 from bondhugula:llfix 2a72ea441fe1b3924802273ffbe9870afeb90f91 PiperOrigin-RevId: 274158273
*	Rename LLVM::exp and LLVM::fmuladd to LLVM::ExpOp and LLVM::FMulAddOp.	Alexander Belyaev	2019-10-11	2	-2/+2
\| \| \| \|	PiperOrigin-RevId: 274154655
*	Add unary ops and ExpOp to Standard Dialect.	Alexander Belyaev	2019-10-11	1	-25/+79
\| \| \| \|	PiperOrigin-RevId: 274152154
*	Standard-to-LLVM conversion: check that operands have LLVM types	Alex Zinenko	2019-10-10	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	In Standard to LLVM dialect conversion, the binary op conversion pattern implicitly assumed some operands were of LLVM IR dialect type. This is not necessarily true, for example if the Ops that produce those operands did not match the existing conversion patterns. Check if all operands are of LLVM IR dialect type and if not, fail to match the binary op pattern. Closes tensorflow/mlir#168 PiperOrigin-RevId: 274063207
*	Add lowering of constant ops to SPIR-V.	Mahesh Ravishankar	2019-10-10	2	-6/+67
\| \| \| \| \| \| \| \| \| \| \|	The lowering is specified as a pattern and is done only if the result is a SPIR-V scalar type or vector type. Handling ConstantOp with index return type needs special handling since SPIR-V dialect does not have index types. Based on the bitwidth of the attribute value, either i32 or i64 is chosen. Other constant lowerings are left as a TODO. PiperOrigin-RevId: 274056805
*	Mark GPU dialect as illegal when lowering to NVVM.	Christian Sigg	2019-10-10	1	-0/+1
\| \| \| \|	PiperOrigin-RevId: 273948293
*	Use llvm.func to define functions with wrapped LLVM IR function type	Alex Zinenko	2019-10-10	3	-93/+128
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function-like operation allows one to define functions that have wrapped LLVM IR function type, in particular variadic functions. The operation was added in parallel to the existing lowering flow, this commit only switches the flow to use it. Using a custom function type makes the LLVM IR dialect type system more consistent and avoids complex conversion rules for functions that previously had to use the built-in function type instead of a wrapped LLVM IR dialect type and perform conversions during the analysis. PiperOrigin-RevId: 273910855
*	Make SPIR-V lowering infrastructure follow Vulkan SPIR-V validation.	Mahesh Ravishankar	2019-10-09	2	-102/+191
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The lowering infrastructure needs to be enhanced to lower into a spv.Module that is consistent with the SPIR-V spec. The following changes are needed: 1) The Vulkan/SPIR-V validation rules dictate that entry functions have a signature of void(void). This requires changes to the function signature conversion infrastructure within the dialect conversion framework. When an argument is dropped from the original function signature, a function can be specified that when invoked will return the value to use as a replacement for the argument from the original function. 2) Some changes to the type converter to make the converted type consistent with the Vulkan/SPIR-V validation rules: a) Add support for converting dynamically shaped tensors to spv.rtarray type. b) Make the global variable of type !spv.ptr<!spv.struct<...>> 3) Generate the entry point operation for the kernel functions and automatically compute all the interface variables needed PiperOrigin-RevId: 273784229
*	Change to doxygen comments. NFC.	Christian Sigg	2019-10-09	1	-63/+63
\| \| \| \|	PiperOrigin-RevId: 273707610
*	GPUToCUDA: attach CUBIN to the nested module rather than to the function	Alex Zinenko	2019-10-08	2	-46/+38
\| \| \| \| \| \| \| \| \| \| \|	Originally, we were attaching attributes containing CUBIN blobs to the kernel function called by `gpu.launch_func`. This kernel is now contained in a nested module that is used as a compilation unit. Attach compiled CUBIN blobs to the module rather than to the function since we were compiling the module. This also avoids duplication of the attribute on multiple kernels within the same module. PiperOrigin-RevId: 273497303
*	GPUToCUDA: emit addressof directly instead of wrapping it into a getter function	Alex Zinenko	2019-10-08	1	-65/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, the CUBIN getter function was introduced as a mechanism to circumvent the absence of globals in the LLVM dialect. It would allocate memory and populate it with the CUBIN data. LLVM dialect now supports globals and they are already used to store CUBIN data, making the getter function a trivial address computation of a global. Emit the address computation directly at the place of `gpu.launch_func` instead of putting it in a function and calling it. This simplifies the conversion flow and prepares it for using the DialectConversion infrastructure. PiperOrigin-RevId: 273496221
*	Fuse GenerateCubinAccessors pass into LaunchFunctToCuda	Alex Zinenko	2019-10-08	3	-141/+58
\| \| \| \| \| \| \| \| \| \| \|	Now that the accessor function is a trivial getter of the global variable, it makes less sense to have the getter generation as a separate pass. Move the getter generation into the lowering of `gpu.launch_func` to CUDA calls. This change is mostly code motion, but the process can be simplified further by generating the addressof in place instead of using a call. This will be done in a follow-up. PiperOrigin-RevId: 273492517
*	Use named modules for gpu.launch_func	Alex Zinenko	2019-10-08	3	-33/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kernel function called by gpu.launch_func is now placed into an isolated nested module during the outlining stage to simplify separate compilation. Until recently, modules did not have names and could not be referenced. This limitation was circumvented by introducing a stub kernel with the same name at the same nesting level as the module containing the actual kernel. This relation is only effective in one direction: from actual kernel function to its launch_func "caller". Leverage the recently introduced symbol name attributes on modules to refer to a specific nested module from `gpu.launch_func`. This removes the implicit connection between the identically named stub and kernel functions. It also enables support for `gpu.launch_func`s to call different kernels located in the same module. PiperOrigin-RevId: 273491891
*	Support reduction of partial warps.	Christian Sigg	2019-10-07	1	-88/+159
\| \| \| \| \| \|	gpu.all_reduce now supports block sizes that are not a multiple of 32. PiperOrigin-RevId: 273255204
*	Replace constexpr MemRefType::kDynamicStrideOrOffset by a MemRefType::getDynamicStrideOrOffset() method - NFC	Nicolas Vasilache	2019-10-04	1	-5/+5
\| \| \| \| \| \| \| \| \| \|	This fixes global ODR-use issues, some of which manifest in Parser.cpp. Fixes tensorflow/mlir#167. PiperOrigin-RevId: 272886347