| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The project has been renamed to Acxxel, so this old directory needs to
be deleted.
Reviewers: jlebar, jprice
Subscribers: beanz, mgorny, parallel_libs-commits, modocache
Differential Revision: https://reviews.llvm.org/D25964
llvm-svn: 285115
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add proper handling for shared memory arguments in the CUDA platform. Also add
in unit tests for CUDA.
Reviewers: jlebar
Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24596
llvm-svn: 281635
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add logic to allow users to specify the CUDA path at configuration time.
Reviewers: jlebar
Subscribers: beanz, mgorny, jlebar, jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24580
llvm-svn: 281626
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Basic CUDA platform implementation and cmake infrastructure to control
whether it's used. A few important TODOs will be handled in later
patches:
* Log some error messages that can't easily be returned as Errors.
* Cache modules and kernels to prevent reloading them if someone tries to
reload a kernel that's already loaded.
* Tolerate shared memory arguments for kernel launches.
Reviewers: jlebar
Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24538
llvm-svn: 281524
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before, the kernel spec would only return PTX for exactly the requested
compute capability. With this patch it will now return the PTX with the
largest compute capability that does not exceed that requested compute
capability.
Reviewers: jlebar
Subscribers: jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24531
llvm-svn: 281417
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This implementation does not currently support multiple concurrent streams, and
it won't allow kernels to be launched with grids larger than one block or
blocks larger than one thread. These limitations could be removed in the future
by launching new threads on the host, but that is not done in this
implementation.
Reviewers: jlebar
Subscribers: beanz, mgorny, jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24473
llvm-svn: 281377
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The .clang-tidy file is copied from the top-level LLVM source directory.
Also fix warnings generated by clang-format:
* Moved SimpleHostPlatformDevice.h so its header include guard could
have the right format.
* Changed signatures of methods taking llvm::Twine by value to take it
by const ref instead.
* Add "noexcept" to some move constructors and assignment operators.
* Removed a bunch of places where single-statement loops and
conditionals were surrounded with braces. (This was not found by the
current clang-tidy, but with a local patch that I hope to upstream
soon.)
Reviewers: jlebar, jprice
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24468
llvm-svn: 281374
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Improve the error-prone interface that allows users to pass host
pointers that haven't been registered to asynchronous copy methods. In
CUDA, this is an extremely easy error to make, and instead of failing at
runtime, it succeeds and gives the right answers by turning the async
copy into a sync copy. So, you silently get a huge performance
degradation if you misuse the old interface. This new interface should
prevent that.
Reviewers: jlebar
Subscribers: jprice, beanz, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24353
llvm-svn: 281225
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There is no purpose in splitting out the Error class from the rest of
the StreamExecutor code. This organization was just a vestige of an old
failed design.
Plus, this change fixes a bug in the build where the utilites library
was not being statically linked in with libstreamexecutor.
Reviewers: jlebar, jprice
Subscribers: beanz, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24434
llvm-svn: 281118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With these changes, we can put parallel-libs within llvm/projects and
build as normal.
This is kind of the minimal change I could figure out how to make while
still making us compatible with llvm's build system. Some things I'm
not thrilled about include:
* The creation of a CoreTests directory (the macros really seemed to
want this)
* Pulling SimpleHostPlatformDevice.h into CoreTests. It seems to me
this should live inside unittests/include, or maybe tests/include,
but I didn't want to make that change in this patch.
One important piece of work that remains to be done is to make
$ ninja check-streamexecutor
run all the tests. Right now the only way I've figured out to run the
tests is
$ ninja projects/parallel-libs/streamexecutor/unittests/StreamExecutorUnitTests
$ projects/parallel-libs/streamexecutor/unittests/CoreTests/CoreTests
Reviewers: jhen
Subscribers: beanz, parallel_libs-commits, jprice
Differential Revision: https://reviews.llvm.org/D24368
llvm-svn: 281091
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: jhen
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24240
llvm-svn: 280872
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The only interface that we ever plan to have in this file is
PlatformDevice, so it makes sense to rename the file to reflect that.
Reviewers: jprice
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24269
llvm-svn: 280737
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As pointed out by jprice, these classes don't serve a purpose. Instead,
we stay consistent with the way memory is managed and let the Stream and
Kernel classes directly hold opaque handles to device Stream and Kernel
instances, respectively.
Reviewers: jprice, jlebar
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24213
llvm-svn: 280719
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Simple utility methods will prevent users from making mistakes when
converting element counts to byte counts.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24197
llvm-svn: 280563
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Final step in getting GlobalDeviceMemory to own its handle.
* Make GlobalDeviceMemory movable, but no longer copyable.
* Make Device::freeDeviceMemory function private and make
GlobalDeviceMemoryBase a friend of Device so GlobalDeviceMemoryBase
can free its memory in its destructor.
* Make GlobalDeviceMemory constructor private and make Device a friend
so it can construct GlobalDeviceMemory.
* Remove SharedDeviceMemoryBase class because it is never used.
* Remove explicit memory freeing from example code.
This change just consumes any errors generated during device memory freeing.
The real error handling will be added in a future patch.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24195
llvm-svn: 280509
|
|
|
|
|
|
|
| |
The "install" build target will now copy the StreamExecutor library and
headers to the appropriate subdirectories of CMAKE_INSTALL_PREFIX.
llvm-svn: 280506
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Step 4 of getting GlobalDeviceMemory to own its handle.
Take out code to pack untyped device memory types as kernel arguments.
When GlobalDeviceMemory owns its handle, users will never touch untyped
device memory types, so they will never pass them as kernel args.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24177
llvm-svn: 280496
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Step 2 of getting GlobalDeviceMemory to own its handle.
Use the SimpleHostPlatformDevice allocate methods to create device
arrays for tests, and check for successful copies by dereferncing the
device array handle directly because we know it is really a host
pointer.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24148
llvm-svn: 280428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is the first in a series of patches that will convert
GlobalDeviceMemory to own its device memory handle. The first step is to
remove GlobalDeviceMemoryBase from the PlatformInterface interfaces and
use raw handles there instead. This is useful because
GlobalDeviceMemoryBase is going to lose its importance in this process.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24114
llvm-svn: 280401
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The example code makes it clear that this is a much better design
decision.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24142
llvm-svn: 280397
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24107
llvm-svn: 280312
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Make the Kernel class follow the pattern of the other classes. It now
has a type-safe user wrapper and a typeless, platform-specific handle.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24043
llvm-svn: 280176
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Abstractions for a StreamExecutor platform
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23857
llvm-svn: 279779
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This more clearly describes what the class is.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23851
llvm-svn: 279669
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The return value from PlatformExecutor::allocateDeviceMemory needs to be
converted from Expected<GlobalDeviceMemoryBase> to
Expected<GlobalDeviceMemory<T>> in Executor::allocateDeviceMemory.
A similar bug is also fixed for Executor::allocateHostMemory.
Thanks to jprice for identifying this bug.
Reviewers: jprice, jlebar
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23849
llvm-svn: 279658
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add Executor methods that block the host until completion. Since these
methods are host-synchronous, they don't require Stream arguments.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23577
llvm-svn: 279640
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: No functional changes just renaming this class for better readability.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23574
llvm-svn: 278833
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add the Stream class and a few of the operations it supports.
Reviewers: jlebar, tra
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23333
llvm-svn: 278829
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add types for device memory and add the code that knows how to pack these
device memory types if they are passed as arguments to kernel launches.
Reviewers: jlebar, tra
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23211
llvm-svn: 278021
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add StreamExecutor kernel types.
Reviewers: jlebar, tra
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23138
llvm-svn: 277827
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add definitions for the KernelLoaderSpec and MultiKernelLoaderSpec
classes to StreamExecutor. Instances of these classes are generated by the
compiler in order to provide host code with a handle to device code.
Reviewers: jlebar, tra
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23038
llvm-svn: 277615
|
|
Summary:
Error handling in StreamExecutor is based on llvm::Error and
llvm::Expected. This CL sets up the StreamExecutor wrapper classes in
the streamexecutor namespace.
All the other StreamExecutor code makes use of this error handling code,
so this is the first CL for checking in StreamExecutor.
Reviewers: jlebar, tra
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D22687
llvm-svn: 277210
|