summaryrefslogtreecommitdiffstats
path: root/parallel-libs
Commit message (Collapse)AuthorAgeFilesLines
* Fix typos throughout the license files that somehow I and my reviewersChandler Carruth2019-01-211-1/+1
| | | | | | | | | | | all missed! Thanks to Alex Bradbury for pointing this out, and the fact that I never added the intended `legacy` anchor to the developer policy. Add that anchor too. With hope, this will cause the links to all resolve successfully. llvm-svn: 351731
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-1913-52/+39
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* Install new LLVM license structure and new developer policy.Chandler Carruth2019-01-191-21/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This installs the new developer policy and moves all of the license files across all LLVM projects in the monorepo to the new license structure. The remaining projects will be moved independently. Note that I've left odd formatting and other idiosyncracies of the legacy license structure text alone to make the diff easier to read. Critically, note that we do not in any case *remove* the old license notice or terms, as that remains necessary until we finish the relicensing process. I've updated a few license files that refer to the LLVM license to instead simply refer generically to whatever license the LLVM project is under, basically trying to minimize confusion. This is really the culmination of so many people. Chris led the community discussions, drafted the policy update and organized the multi-year string of meeting between lawyers across the community to figure out the strategy. Numerous lawyers at companies in the community spent their time figuring out initial answers, and then the Foundation's lawyer Heather Meeker has done *so* much to help refine and get us ready here. I could keep going on, but I just want to make sure everyone realizes what a huge community effort this has been from the begining. Differential Revision: https://reviews.llvm.org/D56897 llvm-svn: 351631
* Update year in license filesHans Wennborg2019-01-151-1/+1
| | | | | | | In last year's update (D48219) it was suggested that the release manager might want to do this, so here we go. llvm-svn: 351194
* Update copyright year to 2018.Paul Robinson2018-06-181-1/+1
| | | | llvm-svn: 334936
* [Axccel] Remove -Wno-missing-braces in buildJason Henline2016-12-193-6/+6
| | | | | | | | | | | | | | | | Summary: I originally added the -Wno-missing-braces flag because I thought it was erroneously flagging std::array initializations. Now I realize the extra braces really are desired for these initializations, so I'm turning the warning flag back on. Reviewers: jlebar Subscribers: mgorny, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D27941 llvm-svn: 290137
* [Acxxel] Remove setActiveDeviceForThreadJason Henline2016-10-287-249/+232
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: After experimenting with CUDA, I realized that we really only need to set the active context right before creating an object such as a stream or a device memory allocation. When we go on to use these objects later, it is fine if the context that created them is no longer active, operations with those objects will succeed anyway. Since it turns out that we don't have to check the active context for every operation, it makes sense to hide this active context from users (by removing the "ActiveDeviceForThread" setter and getter) and to change the Acxxel API to explicitly pass in the device ID to create objects. This change improves the Acxxel API and greatly simplifies the CUDA and OpenCL implementations because they no longer require thread_local data. Reviewers: jlebar, jprice Subscribers: mgorny, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D26050 llvm-svn: 285372
* [SE] Remove StreamExecutorJason Henline2016-10-2551-7668/+0
| | | | | | | | | | | | | | Summary: The project has been renamed to Acxxel, so this old directory needs to be deleted. Reviewers: jlebar, jprice Subscribers: beanz, mgorny, parallel_libs-commits, modocache Differential Revision: https://reviews.llvm.org/D25964 llvm-svn: 285115
* Initial check-in of Acxxel (StreamExecutor renamed)Jason Henline2016-10-2521-0/+6679
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Acxxel is basically a simplified redesign of StreamExecutor. Here are the major points where Acxxel differs from the current StreamExecutor design: * Acxxel doesn't support the kernel and kernel loader types designed for emission by the compiler to support type-safe kernel launches. For CUDA, kernels in Acxxel can be seamlessly launched using the standard CUDA triple-chevron kernel launch syntax that is available with clang and nvcc. For CUDA and OpenCL, kernel arguments can be passed in the old-fashioned way, as one array of pointers to arguments and another array of argument sizes. Although OpenCL doesn't get a type-safe kernel launch method, it does still get the benefit of all the memory management wrappers. In the future, clang may add support for triple-chevron OpenCL kernel launchs, or some other type-safe OpenCL kernel launch method. * Acxxel does not depend on any other code in LLVM, so it builds completely independently from LLVM. The goal will be to check in Acxxel and remove StreamExecutor, or perhaps to remove the old StreamExecutor and rename Acxxel to StreamExecutor, so I think Acxxel should be thought of as a new version of StreamExecutor, not as a separate project. Reviewers: jlebar, jprice Subscribers: beanz, mgorny, modocache, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D25701 llvm-svn: 285111
* [SE] Change CoreTests target nameJason Henline2016-09-271-1/+1
| | | | | | | | | | | | | | Summary: Call it StreamExecutorCoreTests in order to prevent collision with targets from other modules. Reviewers: jlebar, jprice Subscribers: beanz, mgorny, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24949 llvm-svn: 282491
* [SE] Fix config bug with CUDA testsJason Henline2016-09-152-1/+1
| | | | | | | | | | | | | | | | | Summary: It turns out CMake errors out if a processed directory contains source files that are not used. This was causing an error with the CUDATest.cpp file when configuring StreamExecutor with the CUDA platform disabled. Moving CUDATest.cpp to its own directory fixes this problem. Reviewers: jlebar, jprice Subscribers: beanz, mgorny, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24618 llvm-svn: 281654
* [SE] Support CUDA dynamic shared memoryJason Henline2016-09-153-7/+254
| | | | | | | | | | | | | | Summary: Add proper handling for shared memory arguments in the CUDA platform. Also add in unit tests for CUDA. Reviewers: jlebar Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24596 llvm-svn: 281635
* [SE] Let users specify CUDA pathJason Henline2016-09-154-51/+63
| | | | | | | | | | | | Summary: Add logic to allow users to specify the CUDA path at configuration time. Reviewers: jlebar Subscribers: beanz, mgorny, jlebar, jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24580 llvm-svn: 281626
* [SE] Add CUDA platformJason Henline2016-09-1413-17/+596
| | | | | | | | | | | | | | | | | | | | Summary: Basic CUDA platform implementation and cmake infrastructure to control whether it's used. A few important TODOs will be handled in later patches: * Log some error messages that can't easily be returned as Errors. * Cache modules and kernels to prevent reloading them if someone tries to reload a kernel that's already loaded. * Tolerate shared memory arguments for kernel launches. Reviewers: jlebar Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24538 llvm-svn: 281524
* [SE] Pack global dev handle addressesJason Henline2016-09-134-32/+14
| | | | | | | | | | | | | | | | | Summary: We were packing global device memory handles in `PackedKernelArgumentArray`, but as I was implementing the CUDA platform, I realized that CUDA wants the address of the handle, not the handle itself. So this patch switches to packing the address of the handle. Reviewers: jlebar Subscribers: jprice, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24528 llvm-svn: 281424
* Device doc says device is smallJason Henline2016-09-131-0/+4
| | | | llvm-svn: 281423
* [SE] Platforms return Device valuesJason Henline2016-09-134-24/+19
| | | | | | | | | | | | | | | Summary: Platforms were returning Device pointers, but a Device is now basically just a pointer to an underlying PlatformDevice, so we will now just pass it around as a value. Reviewers: jlebar Subscribers: jprice, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24537 llvm-svn: 281422
* [SE] KernelSpec return best PTXJason Henline2016-09-133-12/+15
| | | | | | | | | | | | | | | | Summary: Before, the kernel spec would only return PTX for exactly the requested compute capability. With this patch it will now return the PTX with the largest compute capability that does not exceed that requested compute capability. Reviewers: jlebar Subscribers: jprice, jlebar, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24531 llvm-svn: 281417
* [SE] Use real HostPlatformDevice for testingJason Henline2016-09-135-151/+17
| | | | | | | | | | | | | | Summary: Replace uses of SimpleHostPlatformDevice in tests with HostPlatformDevice. Reviewers: jlebar Subscribers: jlebar, jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24519 llvm-svn: 281384
* [SE] Host platform implementationJason Henline2016-09-138-5/+329
| | | | | | | | | | | | | | | | | Summary: This implementation does not currently support multiple concurrent streams, and it won't allow kernels to be launched with grids larger than one block or blocks larger than one thread. These limitations could be removed in the future by launching new threads on the host, but that is not done in this implementation. Reviewers: jlebar Subscribers: beanz, mgorny, jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24473 llvm-svn: 281377
* [SE] Add .clang-formatJason Henline2016-09-1319-153/+120
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The .clang-tidy file is copied from the top-level LLVM source directory. Also fix warnings generated by clang-format: * Moved SimpleHostPlatformDevice.h so its header include guard could have the right format. * Changed signatures of methods taking llvm::Twine by value to take it by const ref instead. * Add "noexcept" to some move constructors and assignment operators. * Removed a bunch of places where single-statement loops and conditionals were surrounded with braces. (This was not found by the current clang-tidy, but with a local patch that I hope to upstream soon.) Reviewers: jlebar, jprice Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24468 llvm-svn: 281374
* [SE] Stop using llvm-config --cxxflagsJason Henline2016-09-131-11/+4
| | | | | | | | | | | | | | | | | | | | | | | Summary: Build configuration was adding $(llvm-config --cxxflags) to the StreamExecutor CXXFLAGS, but this was causing "-O3" to be passed even for debug builds, and was making debugging difficult. The llvm-config call was originally introduced to handle the -fno-rtti flag because an RTTI StreamExecutor could not link with a no-RTTI LLVM. This patch converts to using LLVM_ENABLE_RTTI and only adding the `-fno-rtti` flag if needed, not all the rest of the LLVM CXXFLAGS. I have tested this with clang-4.0 and gcc-4.8 on Ubuntu. Some work will probably have to be done to support MSVC. Reviewers: jlebar Subscribers: beanz, jprice, parallel_libs-commits, mgorny Differential Revision: https://reviews.llvm.org/D24474 llvm-svn: 281347
* [SE] Clean up device and host memory slicesJason Henline2016-09-124-22/+35
| | | | | | | | | | | | | | | | | | | Summary: * Add LLVM_ATTRIBUTE_UNUSED_RESULT used to slicing methods in order to emphasize that the slicing is not done in place. * Change device memory slice function name from `drop_front` to `slice` in order to match the naming convention of `llvm::ArrayRef` and host memory slice. * Change the parameter names of host memory slice functions to `DropCount` and `TakeCount` to match device memory slice declarations. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24464 llvm-svn: 281239
* [SE] RegisteredHostMemory for async device copiesJason Henline2016-09-1211-302/+393
| | | | | | | | | | | | | | | | | | | Summary: Improve the error-prone interface that allows users to pass host pointers that haven't been registered to asynchronous copy methods. In CUDA, this is an extremely easy error to make, and instead of failing at runtime, it succeeds and gives the right answers by turning the async copy into a sync copy. So, you silently get a huge performance degradation if you misuse the old interface. This new interface should prevent that. Reviewers: jlebar Subscribers: jprice, beanz, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24353 llvm-svn: 281225
* [SE] Remove Utils directoryJason Henline2016-09-0911-19/+12
| | | | | | | | | | | | | | | | | | Summary: There is no purpose in splitting out the Error class from the rest of the StreamExecutor code. This organization was just a vestige of an old failed design. Plus, this change fixes a bug in the build where the utilites library was not being statically linked in with libstreamexecutor. Reviewers: jlebar, jprice Subscribers: beanz, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24434 llvm-svn: 281118
* [StreamExecutor] Make SE work with an in-tree LLVM build.Justin Lebar2016-09-0912-56/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With these changes, we can put parallel-libs within llvm/projects and build as normal. This is kind of the minimal change I could figure out how to make while still making us compatible with llvm's build system. Some things I'm not thrilled about include: * The creation of a CoreTests directory (the macros really seemed to want this) * Pulling SimpleHostPlatformDevice.h into CoreTests. It seems to me this should live inside unittests/include, or maybe tests/include, but I didn't want to make that change in this patch. One important piece of work that remains to be done is to make $ ninja check-streamexecutor run all the tests. Right now the only way I've figured out to run the tests is $ ninja projects/parallel-libs/streamexecutor/unittests/StreamExecutorUnitTests $ projects/parallel-libs/streamexecutor/unittests/CoreTests/CoreTests Reviewers: jhen Subscribers: beanz, parallel_libs-commits, jprice Differential Revision: https://reviews.llvm.org/D24368 llvm-svn: 281091
* Add streamexecutor-configJason Henline2016-09-084-1/+215
| | | | | | | | | | | | | | Summary: Similar to llvm-config, gets command-line flags that are needed to build applications linking against StreamExecutor. Reviewers: jprice, jlebar Subscribers: beanz, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24302 llvm-svn: 280955
* [SE] Add getName method to Device classJason Henline2016-09-072-0/+7
| | | | | | | | | | Reviewers: jhen Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24240 llvm-svn: 280872
* [SE] Rename PlatformInterfaces to PlatformDeviceJason Henline2016-09-0611-27/+20
| | | | | | | | | | | | | | Summary: The only interface that we ever plan to have in this file is PlatformDevice, so it makes sense to rename the file to reflect that. Reviewers: jprice Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24269 llvm-svn: 280737
* [SE] Remove Platform*Handle classesJason Henline2016-09-0610-95/+142
| | | | | | | | | | | | | | | | Summary: As pointed out by jprice, these classes don't serve a purpose. Instead, we stay consistent with the way memory is managed and let the Stream and Kernel classes directly hold opaque handles to device Stream and Kernel instances, respectively. Reviewers: jprice, jlebar Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24213 llvm-svn: 280719
* [SE] Add getByteCount methods for device memoryJason Henline2016-09-032-13/+22
| | | | | | | | | | | | | | Summary: Simple utility methods will prevent users from making mistakes when converting element counts to byte counts. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24197 llvm-svn: 280563
* [SE] Remove broken doc refJason Henline2016-09-021-3/+0
| | | | llvm-svn: 280512
* [SE] Doc tweaksJason Henline2016-09-025-14/+41
| | | | | | | | | | | | | | | | | | | | | Summary: * Sections on main page. * Use std algorithm for equality check in example. * Add tree view on left side. * Add extra CSS sheet to restrict content width. * Add mild background color. * Restrict alphabetic indexes to 1 column. * Round corners of content boxes. * Rename example to CUDASaxpy.cpp. * Add CUDASaxpy.cpp to "Examples" section. Reviewers: jprice Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24198 llvm-svn: 280511
* [SE] GlobalDeviceMemory owns its handleJason Henline2016-09-026-143/+123
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Final step in getting GlobalDeviceMemory to own its handle. * Make GlobalDeviceMemory movable, but no longer copyable. * Make Device::freeDeviceMemory function private and make GlobalDeviceMemoryBase a friend of Device so GlobalDeviceMemoryBase can free its memory in its destructor. * Make GlobalDeviceMemory constructor private and make Device a friend so it can construct GlobalDeviceMemory. * Remove SharedDeviceMemoryBase class because it is never used. * Remove explicit memory freeing from example code. This change just consumes any errors generated during device memory freeing. The real error handling will be added in a future patch. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24195 llvm-svn: 280509
* [SE] Add "install" actions to cmake buildJason Henline2016-09-022-0/+4
| | | | | | | The "install" build target will now copy the StreamExecutor library and headers to the appropriate subdirectories of CMAKE_INSTALL_PREFIX. llvm-svn: 280506
* [SE] Don't pack raw device mem argsJason Henline2016-09-022-116/+40
| | | | | | | | | | | | | | | | | Summary: Step 4 of getting GlobalDeviceMemory to own its handle. Take out code to pack untyped device memory types as kernel arguments. When GlobalDeviceMemory owns its handle, users will never touch untyped device memory types, so they will never pass them as kernel args. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24177 llvm-svn: 280496
* [StreamExecutor] Pass device memory by refJason Henline2016-09-023-31/+34
| | | | | | | | | | | | | | | | Summary: Step 3 of getting GlobalDeviceMemory to own its handle. Since GlobalDeviceMemory will no longer by copy-constructible, we must pass instances by reference rather than by value. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24172 llvm-svn: 280439
* [SE] Make Kernel movableJason Henline2016-09-023-72/+12
| | | | | | | | | | | | | | | Summary: Kernel is basically just a smart pointer to the underlying implementation, so making it movable prevents having to store a std::unique_ptr to it. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24150 llvm-svn: 280437
* [StreamExecutor] Read dev array directly in testJason Henline2016-09-013-63/+97
| | | | | | | | | | | | | | | | | | Summary: Step 2 of getting GlobalDeviceMemory to own its handle. Use the SimpleHostPlatformDevice allocate methods to create device arrays for tests, and check for successful copies by dereferncing the device array handle directly because we know it is really a host pointer. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24148 llvm-svn: 280428
* [StreamExecutor] Dev handles in platform interfaceJason Henline2016-09-016-153/+171
| | | | | | | | | | | | | | | | | Summary: This is the first in a series of patches that will convert GlobalDeviceMemory to own its device memory handle. The first step is to remove GlobalDeviceMemoryBase from the PlatformInterface interfaces and use raw handles there instead. This is useful because GlobalDeviceMemoryBase is going to lose its importance in this process. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24114 llvm-svn: 280401
* [SE] Make Stream movableJason Henline2016-09-015-14/+17
| | | | | | | | | | | | | | Summary: The example code makes it clear that this is a much better design decision. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24142 llvm-svn: 280397
* [SE] Docs use JAVADOC_AUTOBRIEFJason Henline2016-09-011-1/+1
| | | | | | | That way we don't have to explicitly annotate each brief description as \brief. llvm-svn: 280384
* [StreamExecutor] getOrDie and dieIfError utilsJason Henline2016-08-314-37/+52
| | | | | | | | | | Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24107 llvm-svn: 280312
* Exclude examples, unittests from doc genJason Henline2016-08-311-1/+1
| | | | | | | Public documentation shouldn't be generated for unit test code and code that is only meant to be used as snippets in other documentation. llvm-svn: 280278
* [StreamExecutor] Add Doxygen main pageJason Henline2016-08-317-5/+243
| | | | | | | | | | Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24066 llvm-svn: 280277
* [StreamExecutor] Add Stream::blockHostUntilDoneJason Henline2016-08-311-1/+10
| | | | | | | | | | | | Summary: Add the type-safe wrapper to the platform-specific implementation. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24063 llvm-svn: 280182
* [StreamExecutor] Simplify Kernel classesJason Henline2016-08-307-212/+87
| | | | | | | | | | | | | | Summary: Make the Kernel class follow the pattern of the other classes. It now has a type-safe user wrapper and a typeless, platform-specific handle. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D24043 llvm-svn: 280176
* [StreamExecutor] Fix KernelSpec DoxygenJason Henline2016-08-261-5/+5
| | | | | | | | | | | | | | | Summary: There was a typo where \endcode was spelled as \encode and it was keeping the whole file document from rendering. I also added in some \c annotations for inline code stuff to make it look nicer. Reviewers: jprice Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23941 llvm-svn: 279855
* [StreamExecutor] Add Platform and PlatformManagerJason Henline2016-08-255-0/+154
| | | | | | | | | | | | Summary: Abstractions for a StreamExecutor platform Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23857 llvm-svn: 279779
* [StreamExecutor] Rename Executor to DeviceJason Henline2016-08-2414-580/+575
| | | | | | | | | | | | Summary: This more clearly describes what the class is. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23851 llvm-svn: 279669
OpenPOWER on IntegriCloud