path: root/parallel-libs/streamexecutor/lib
Commit message | Author | Age | Files | Lines
* [SE] Remove StreamExecutor | Jason Henline | 2016-10-25 | 16 | -885/+0
  Summary: The project has been renamed to Acxxel, so this old directory needs to be deleted.
  Reviewers: jlebar, jprice
  Subscribers: beanz, mgorny, parallel_libs-commits, modocache
  Differential Revision: https://reviews.llvm.org/D25964
  llvm-svn: 285115
* [SE] Support CUDA dynamic shared memory | Jason Henline | 2016-09-15 | 1 | -7/+34
  Summary: Add proper handling for shared memory arguments in the CUDA platform. Also add in unit tests for CUDA.
  Reviewers: jlebar
  Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24596
  llvm-svn: 281635
* [SE] Let users specify CUDA path | Jason Henline | 2016-09-15 | 2 | -39/+0
  Summary: Add logic to allow users to specify the CUDA path at configuration time.
  Reviewers: jlebar
  Subscribers: beanz, mgorny, jlebar, jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24580
  llvm-svn: 281626
* [SE] Add CUDA platform | Jason Henline | 2016-09-14 | 7 | -1/+407
  Summary: Basic CUDA platform implementation and cmake infrastructure to control whether it's used.
  A few important TODOs will be handled in later patches:
    * Log some error messages that can't easily be returned as Errors.
    * Cache modules and kernels to prevent reloading them if someone tries to reload a kernel that's already loaded.
    * Tolerate shared memory arguments for kernel launches.
  Reviewers: jlebar
  Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24538
  llvm-svn: 281524
* [SE] KernelSpec return best PTX | Jason Henline | 2016-09-13 | 1 | -4/+5
  Summary: Before, the kernel spec would only return PTX for exactly the requested compute capability. With this patch it will now return the PTX with the largest compute capability that does not exceed that requested compute capability.
  Reviewers: jlebar
  Subscribers: jprice, jlebar, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24531
  llvm-svn: 281417
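  The selection rule described in the summary above can be sketched as follows. This is a minimal illustration, not the code from the patch: the `ComputeCapability` alias, the `selectBestPTX` function, and the use of `std::map` are all assumptions made for the example.

```cpp
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a compute capability, ordered as (major, minor).
using ComputeCapability = std::pair<int, int>;

// Hypothetical helper: returns the PTX registered for the largest compute
// capability that does not exceed `requested`, or std::nullopt if every
// registered entry is newer than the request.
std::optional<std::string>
selectBestPTX(const std::map<ComputeCapability, std::string> &ptxByCapability,
              ComputeCapability requested) {
  // upper_bound yields the first entry strictly greater than `requested`,
  // so the entry just before it (if any) is the best match.
  auto it = ptxByCapability.upper_bound(requested);
  if (it == ptxByCapability.begin())
    return std::nullopt;
  return std::prev(it)->second;
}
```

  With entries registered for 2.0, 3.5, and 5.0, a request for 3.7 would select the 3.5 entry under this sketch, whereas an exact-match lookup would find nothing.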
* [SE] Host platform implementation | Jason Henline | 2016-09-13 | 1 | -0/+3
  Summary: This implementation does not currently support multiple concurrent streams, and it won't allow kernels to be launched with grids larger than one block or blocks larger than one thread. These limitations could be removed in the future by launching new threads on the host, but that is not done in this implementation.
  Reviewers: jlebar
  Subscribers: beanz, mgorny, jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24473
  llvm-svn: 281377
* [SE] Add .clang-tidy | Jason Henline | 2016-09-13 | 7 | -20/+13
  Summary: The .clang-tidy file is copied from the top-level LLVM source directory.
  Also fix warnings generated by clang-tidy:
    * Moved SimpleHostPlatformDevice.h so its header include guard could have the right format.
    * Changed signatures of methods taking llvm::Twine by value to take it by const ref instead.
    * Add "noexcept" to some move constructors and assignment operators.
    * Removed a bunch of places where single-statement loops and conditionals were surrounded with braces. (This was not found by the current clang-tidy, but with a local patch that I hope to upstream soon.)
  Reviewers: jlebar, jprice
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24468
  llvm-svn: 281374
* [SE] RegisteredHostMemory for async device copies | Jason Henline | 2016-09-12 | 2 | -0/+30
  Summary: Improve the error-prone interface that allows users to pass host pointers that haven't been registered to asynchronous copy methods. In CUDA, this is an extremely easy error to make, and instead of failing at runtime, it succeeds and gives the right answers by turning the async copy into a sync copy. So, you silently get a huge performance degradation if you misuse the old interface. This new interface should prevent that.
  Reviewers: jlebar
  Subscribers: jprice, beanz, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24353
  llvm-svn: 281225
* [SE] Remove Utils directory | Jason Henline | 2016-09-09 | 3 | -9/+2
  Summary: There is no purpose in splitting out the Error class from the rest of the StreamExecutor code. This organization was just a vestige of an old failed design. Plus, this change fixes a bug in the build where the utilities library was not being statically linked in with libstreamexecutor.
  Reviewers: jlebar, jprice
  Subscribers: beanz, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24434
  llvm-svn: 281118
* [StreamExecutor] Make SE work with an in-tree LLVM build. | Justin Lebar | 2016-09-09 | 8 | -1259/+15
  Summary: With these changes, we can put parallel-libs within llvm/projects and build as normal. This is kind of the minimal change I could figure out how to make while still making us compatible with llvm's build system.
  Some things I'm not thrilled about include:
    * The creation of a CoreTests directory (the macros really seemed to want this)
    * Pulling SimpleHostPlatformDevice.h into CoreTests. It seems to me this should live inside unittests/include, or maybe tests/include, but I didn't want to make that change in this patch.
  One important piece of work that remains to be done is to make
    $ ninja check-streamexecutor
  run all the tests. Right now the only way I've figured out to run the tests is
    $ ninja projects/parallel-libs/streamexecutor/unittests/StreamExecutorUnitTests
    $ projects/parallel-libs/streamexecutor/unittests/CoreTests/CoreTests
  Reviewers: jhen
  Subscribers: beanz, parallel_libs-commits, jprice
  Differential Revision: https://reviews.llvm.org/D24368
  llvm-svn: 281091
* [SE] Add getName method to Device class | Jason Henline | 2016-09-07 | 1 | -0/+4
  Reviewers: jhen
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24240
  llvm-svn: 280872
* [SE] Rename PlatformInterfaces to PlatformDevice | Jason Henline | 2016-09-06 | 8 | -10/+10
  Summary: The only interface that we ever plan to have in this file is PlatformDevice, so it makes sense to rename the file to reflect that.
  Reviewers: jprice
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24269
  llvm-svn: 280737
* [SE] Remove Platform*Handle classes | Jason Henline | 2016-09-06 | 6 | -27/+84
  Summary: As pointed out by jprice, these classes don't serve a purpose. Instead, we stay consistent with the way memory is managed and let the Stream and Kernel classes directly hold opaque handles to device Stream and Kernel instances, respectively.
  Reviewers: jprice, jlebar
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24213
  llvm-svn: 280719
* [SE] Add getByteCount methods for device memory | Jason Henline | 2016-09-03 | 1 | -12/+12
  Summary: Simple utility methods will prevent users from making mistakes when converting element counts to byte counts.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24197
  llvm-svn: 280563
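  As a rough illustration of the kind of utility the summary above describes, the sketch below pairs an element count with a byte count derived from the element type. The `TypedDeviceArray` class is a hypothetical stand-in invented for this example; only the `getByteCount` name comes from the commit title, and `getElementCount` is an assumed companion.

```cpp
#include <cstddef>

// Hypothetical typed view over device memory. Only the size bookkeeping is
// modeled here; no device allocation takes place.
template <typename T> class TypedDeviceArray {
public:
  explicit TypedDeviceArray(std::size_t elementCount)
      : ElementCount(elementCount) {}

  // Number of elements of type T.
  std::size_t getElementCount() const { return ElementCount; }

  // Number of bytes, computed in one place so callers never multiply by
  // sizeof(T) by hand.
  std::size_t getByteCount() const { return ElementCount * sizeof(T); }

private:
  std::size_t ElementCount;
};
```

  Keeping the multiplication inside the type means a caller that needs a byte count for a copy cannot accidentally pass an element count instead.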
* [SE] GlobalDeviceMemory owns its handle | Jason Henline | 2016-09-02 | 3 | -1/+29
  Summary: Final step in getting GlobalDeviceMemory to own its handle.
    * Make GlobalDeviceMemory movable, but no longer copyable.
    * Make Device::freeDeviceMemory function private and make GlobalDeviceMemoryBase a friend of Device so GlobalDeviceMemoryBase can free its memory in its destructor.
    * Make GlobalDeviceMemory constructor private and make Device a friend so it can construct GlobalDeviceMemory.
    * Remove SharedDeviceMemoryBase class because it is never used.
    * Remove explicit memory freeing from example code.
  This change just consumes any errors generated during device memory freeing. The real error handling will be added in a future patch.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24195
  llvm-svn: 280509
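  The ownership pattern listed above (movable but not copyable, private constructor, a friend that allocates, freeing in the destructor) can be sketched in general terms as follows. `OwnedBuffer` and `HostDevice` are hypothetical names for this example, and host `malloc`/`free` stand in for the platform's device allocation calls; this is not the StreamExecutor code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>

class HostDevice; // Hypothetical allocator class, defined below.

// Hypothetical move-only owner of an opaque memory handle.
class OwnedBuffer {
public:
  OwnedBuffer(const OwnedBuffer &) = delete;
  OwnedBuffer &operator=(const OwnedBuffer &) = delete;

  // Moving transfers the handle and leaves the source empty.
  OwnedBuffer(OwnedBuffer &&other) noexcept
      : Handle(std::exchange(other.Handle, nullptr)),
        ByteCount(std::exchange(other.ByteCount, 0)) {}

  OwnedBuffer &operator=(OwnedBuffer &&other) noexcept {
    if (this != &other) {
      std::free(Handle);
      Handle = std::exchange(other.Handle, nullptr);
      ByteCount = std::exchange(other.ByteCount, 0);
    }
    return *this;
  }

  // The owner releases its memory; free(nullptr) is a no-op for moved-from
  // objects.
  ~OwnedBuffer() { std::free(Handle); }

  const void *getHandle() const { return Handle; }
  std::size_t getByteCount() const { return ByteCount; }

private:
  friend class HostDevice; // Only the device may construct buffers.
  explicit OwnedBuffer(std::size_t byteCount)
      : Handle(std::malloc(byteCount)), ByteCount(byteCount) {}

  void *Handle;
  std::size_t ByteCount;
};

// Hypothetical device that hands out owned buffers.
class HostDevice {
public:
  OwnedBuffer allocate(std::size_t byteCount) { return OwnedBuffer(byteCount); }
};
```

  Because the destructor does the freeing, calling code in this sketch has no explicit free step to forget, which matches the removal of explicit memory freeing from the example code mentioned in the summary.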
* [SE] Add "install" actions to cmake build | Jason Henline | 2016-09-02 | 1 | -0/+2
  The "install" build target will now copy the StreamExecutor library and headers to the appropriate subdirectories of CMAKE_INSTALL_PREFIX.
  llvm-svn: 280506
* [SE] Don't pack raw device mem args | Jason Henline | 2016-09-02 | 1 | -89/+37
  Summary: Step 4 of getting GlobalDeviceMemory to own its handle. Take out code to pack untyped device memory types as kernel arguments. When GlobalDeviceMemory owns its handle, users will never touch untyped device memory types, so they will never pass them as kernel args.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24177
  llvm-svn: 280496
* [StreamExecutor] Read dev array directly in test | Jason Henline | 2016-09-01 | 3 | -63/+97
  Summary: Step 2 of getting GlobalDeviceMemory to own its handle. Use the SimpleHostPlatformDevice allocate methods to create device arrays for tests, and check for successful copies by dereferencing the device array handle directly because we know it is really a host pointer.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24148
  llvm-svn: 280428
* [StreamExecutor] Dev handles in platform interface | Jason Henline | 2016-09-01 | 3 | -121/+139
  Summary: This is the first in a series of patches that will convert GlobalDeviceMemory to own its device memory handle. The first step is to remove GlobalDeviceMemoryBase from the PlatformInterface interfaces and use raw handles there instead. This is useful because GlobalDeviceMemoryBase is going to lose its importance in this process.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24114
  llvm-svn: 280401
* [SE] Make Stream movable | Jason Henline | 2016-09-01 | 2 | -3/+4
  Summary: The example code makes it clear that this is a much better design decision.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24142
  llvm-svn: 280397
* [StreamExecutor] getOrDie and dieIfError utils | Jason Henline | 2016-08-31 | 1 | -0/+8
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24107
  llvm-svn: 280312
* [StreamExecutor] Simplify Kernel classes | Jason Henline | 2016-08-30 | 3 | -124/+3
  Summary: Make the Kernel class follow the pattern of the other classes. It now has a type-safe user wrapper and a typeless, platform-specific handle.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D24043
  llvm-svn: 280176
* [StreamExecutor] Add Platform and PlatformManager | Jason Henline | 2016-08-25 | 3 | -0/+59
  Summary: Abstractions for a StreamExecutor platform
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23857
  llvm-svn: 279779
* [StreamExecutor] Rename Executor to Device | Jason Henline | 2016-08-24 | 10 | -517/+514
  Summary: This more clearly describes what the class is.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23851
  llvm-svn: 279669
* [StreamExecutor] Fix allocateDeviceMemory | Jason Henline | 2016-08-24 | 1 | -0/+27
  Summary: The return value from PlatformExecutor::allocateDeviceMemory needs to be converted from Expected<GlobalDeviceMemoryBase> to Expected<GlobalDeviceMemory<T>> in Executor::allocateDeviceMemory. A similar bug is also fixed for Executor::allocateHostMemory. Thanks to jprice for identifying this bug.
  Reviewers: jprice, jlebar
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23849
  llvm-svn: 279658
* [StreamExecutor] Executor add synchronous methods | Jason Henline | 2016-08-24 | 4 | -34/+773
  Summary: Add Executor methods that block the host until completion. Since these methods are host-synchronous, they don't require Stream arguments.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23577
  llvm-svn: 279640
* [StreamExecutor] Rename StreamExecutor to Executor | Jason Henline | 2016-08-16 | 7 | -34/+32
  Summary: No functional changes, just renaming this class for better readability.
  Reviewers: jlebar
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23574
  llvm-svn: 278833
* [StreamExecutor] Add basic Stream operations | Jason Henline | 2016-08-16 | 9 | -4/+248
  Summary: Add the Stream class and a few of the operations it supports.
  Reviewers: jlebar, tra
  Subscribers: jprice, parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23333
  llvm-svn: 278829
* [StreamExecutor] Add DeviceMemory and kernel arg packing | Jason Henline | 2016-08-08 | 2 | -0/+212
  Summary: Add types for device memory and add the code that knows how to pack these device memory types if they are passed as arguments to kernel launches.
  Reviewers: jlebar, tra
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23211
  llvm-svn: 278021
* [StreamExecutor] Add kernel types | Jason Henline | 2016-08-05 | 4 | -0/+149
  Summary: Add StreamExecutor kernel types.
  Reviewers: jlebar, tra
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23138
  llvm-svn: 277827
* [StreamExecutor] Add KernelLoaderSpec | Jason Henline | 2016-08-03 | 4 | -0/+244
  Summary: Add definitions for the KernelLoaderSpec and MultiKernelLoaderSpec classes to StreamExecutor. Instances of these classes are generated by the compiler in order to provide host code with a handle to device code.
  Reviewers: jlebar, tra
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D23038
  llvm-svn: 277615
* [StreamExecutor] Add error handling library | Jason Henline | 2016-07-29 | 2 | -0/+67
  Summary: Error handling in StreamExecutor is based on llvm::Error and llvm::Expected. This CL sets up the StreamExecutor wrapper classes in the streamexecutor namespace. All the other StreamExecutor code makes use of this error handling code, so this is the first CL for checking in StreamExecutor.
  Reviewers: jlebar, tra
  Subscribers: parallel_libs-commits
  Differential Revision: https://reviews.llvm.org/D22687
  llvm-svn: 277210