| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This fixes an issue with clang issuing a warning about unknown CUDA SDK if it's
detected during non-CUDA compilation.
Differential Revision: https://reviews.llvm.org/D76030
(cherry picked from commit eb2ba2ea953b5ea73cdbb598f77470bde1c6a011)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
(cherry picked from commit 12fefeef203ab4ef52d19bcdbd4180608a4deae1)
|
|
|
|
|
|
| |
Skip distro detection when we're not running on Linux, or when the target triple is not Linux. This saves a few OS calls for each invocation of clang.exe.
Differential Revision: https://reviews.llvm.org/D70467
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the remaining part of the OpenMP offload linker scripts which was used for inserting device binaries into the output linked binary. Device binaries are now inserted into the host binary with a help of the wrapper bit-code file which contains device binaries as data. Wrapper bit-code file is dynamically created by the clang driver with a help of new tool clang-offload-wrapper which takes device binaries as input and produces bit-code file with required contents. Wrapper bit-code is then compiled to an object and resulting object is appended to the host linking by the clang driver.
This is the second part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).
Differential Revision: https://reviews.llvm.org/D68166
llvm-svn: 374219
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D68300
llvm-svn: 373649
|
|
|
|
|
|
|
|
|
|
| |
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The bug was reported on the OpenMP-dev list:
.../obj-release/lib/clang/9.0.0/include/__clang_cuda_intrinsics.h:173:35: error: '__nvvm_shfl_sync_idx_i32' needs target feature ptx60|ptx61|ptx63|ptx64
__MAKE_SYNC_SHUFFLES(__shfl_sync, __nvvm_shfl_sync_idx_i32,
This problem occurs when trying to compile a .cu file that requires a newer ptx version (>ptx60 in this case) than ptx42.
Reviewers: tra, ABataev, caomhin
Reviewed By: tra
Subscribers: jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61474
llvm-svn: 359910
|
|
|
|
|
|
|
|
|
| |
CUDA 10.1 tools deprecated some command line options.
fatbinary no longer needs --cuda.
Differential Revision: https://reviews.llvm.org/D61470
llvm-svn: 359838
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.
Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)
Differential Revision: https://reviews.llvm.org/D60279
llvm-svn: 359248
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57771
llvm-svn: 353232
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
..and use it to control that parts of CUDA compilation
that depend on the specific version of CUDA SDK.
This patch has a placeholder for a 'new launch API' support
which is in a separate patch. The list will be further
extended in the upcoming patch to support CUDA-10.1.
Differential Revision: https://reviews.llvm.org/D57487
llvm-svn: 352798
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added support for the -gline-directives-only option + fixed logic of the
debug info for CUDA devices. If optimization level is O0, then options
--[no-]cuda-noopt-device-debug do not affect the debug info level. If
the optimization level is >O0, debug info options are used +
--no-cuda-noopt-device-debug is used or no --cuda-noopt-device-debug is
used, the optimization level for the device code is kept and the
emission of the debug directives is used.
If the opt level is > O0, debug info is requested +
--cuda-noopt-device-debug option is used, the optimization is disabled
for the device code + required debug info is emitted.
Reviewers: tra, echristo
Subscribers: aprantl, guansong, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51554
llvm-svn: 348930
|
|
|
|
|
|
|
|
|
|
| |
This just extends D40453 (r319317) to Ubuntu.
Reviewed By: Hahnfeld, tra
Differential Revision: https://reviews.llvm.org/D55269
llvm-svn: 348504
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moves the virtual file system form clang to llvm so it can be
used by more projects.
Concretely the patch:
- Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
- Moves the corresponding unit test from clang to llvm.
- Moves the vfs namespace from clang::vfs to llvm::vfs.
- Formats the lines affected by this change, mostly this is the result of
the added llvm namespace.
RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html
Differential revision: https://reviews.llvm.org/D52783
llvm-svn: 344140
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.
The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.
This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.
Differential Revision: https://reviews.llvm.org/D52377
llvm-svn: 343611
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When looking for the bclib Clang considered the default library
path first while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.
Differential Revision: https://reviews.llvm.org/D51686
llvm-svn: 343230
|
|
|
|
| |
llvm-svn: 342924
|
|
|
|
|
|
|
| |
The same semantics work for OpenCL, and probably any offload
language. Keep the old name around as an alias.
llvm-svn: 340193
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some targets support only default set of the debug options and do not
support additional debug options, like NVPTX target. Patch introduced
virtual function supportsDebugInfoOptions() that can be overloaded
by the toolchain, checks if the target supports some debug
options and emits warning when an unsupported debug option is
found.
Reviewers: echristo
Subscribers: aprantl, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D49148
llvm-svn: 338155
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D47268
llvm-svn: 333098
|
|
|
|
|
|
|
|
|
| |
The option enables use of 32-bit pointers for accessing
const/local/shared memory. The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46148
llvm-svn: 331938
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D45827
llvm-svn: 330753
|
|
|
|
|
|
|
|
|
|
| |
instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.
Differential Revision: https://reviews.llvm.org/D45068
llvm-svn: 330296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
NVPTX target supports debug info in DWARF-2 format. Patch adds emission
of debug info in DWARF-2 by default.
Reviewers: tra, jlebar
Subscribers: aprantl, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D42581
llvm-svn: 330272
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we always include PTX into the fatbin along
with the GPU code.It about doubles the size of the GPU binary
we need to carry in the executable. These options allow control
inclusion of PTX into GPU binary.
This patch does not change the defaults, though we may consider
making no-PTX the default in the future.
Differential Revision: https://reviews.llvm.org/D45495
llvm-svn: 329737
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel
Reviewed By: ABataev, grokos
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43197
llvm-svn: 327460
|
|
|
|
| |
llvm-svn: 327447
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel
Reviewed By: ABataev, grokos
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43197
llvm-svn: 327438
|
|
|
|
|
|
|
|
|
|
|
| |
As a first step, pass '-c/--compile-only' to ptxas so that it
doesn't complain about references to external function. This
will successfully generate object files, but they won't work
at runtime because the registration routines need to adapted.
Differential Revision: https://reviews.llvm.org/D42921
llvm-svn: 324878
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the CUDA toolkit is not installed to its default locations
in /usr/local/cuda, the user is forced to specify --cuda-path.
This is tedious and the driver can be smarter if well-known tools
(like ptxas) can already be found in the PATH environment variable.
Add option --cuda-path-ignore-env if the user wants to ignore
set environment variables. Also use it in the tests to make sure
the driver always finds the same CUDA installation, regardless
of the user's environment.
Differential Revision: https://reviews.llvm.org/D42642
llvm-svn: 323848
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang can use CUDA-9.1 now, though new APIs (are not implemented yet.
The major change is that headers in CUDA-9.1 went through substantial
changes that started in CUDA-9.0 which required substantial changes
in the cuda compatibility headers provided by clang.
There are two major issues:
* CUDA SDK no longer provides declarations for libdevice functions.
* A lot of device-side functions have become nvcc's builtins and
CUDA headers no longer contain their implementations.
This patch changes the way CUDA headers are handled if we compile
with CUDA 9.x. Both 9.0 and 9.1 are affected.
* Clang provides its own declarations of libdevice functions.
* For CUDA-9.x clang now provides implementation of device-side
'standard library' functions using libdevice.
This patch should not affect compilation with CUDA-8. There may be
some observable differences for CUDA-9.0, though they are not expected
to affect functionality.
Tested: CUDA test-suite tests for all supported combinations of:
CUDA: 7.0,7.5,8.0,9.0,9.1
GPU: sm_20, sm_35, sm_60, sm_70
Differential Revision: https://reviews.llvm.org/D42513
llvm-svn: 323713
|
|
|
|
|
|
|
|
|
| |
../tools/clang/lib/Driver/ToolChains/Cuda.cpp:80:18: error: reference to non-static member function must be called; did you mean to call it with no arguments?
if (Distro(D.getVFS).IsDebian())
~~^~~~~~
()
llvm-svn: 319322
|
|
|
|
| |
llvm-svn: 319319
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reported here:
http://bugs.debian.org/882505
Patch by Andreas Beckmann
Reviewers: Hahnfeld, tra
Reviewed By: tra
Subscribers: jlebar, cfe-commits
Differential Revision: https://reviews.llvm.org/D40453
llvm-svn: 319317
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was previously done in some places, but for example not for
bundling so that single object compilation with -c failed. In
addition cubin was used for all file types during unbundling which
is incorrect for assembly files that are passed to ptxas.
Tighten up the tests so that we can't regress in that area.
Differential Revision: https://reviews.llvm.org/D40250
llvm-svn: 318763
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
CUDA 9's minimum sm is sm_30.
Ideally we should also make sm_30 the default when compiling with CUDA
9, but that seems harder than it should be.
Subscribers: sanjoy
Differential Revision: https://reviews.llvm.org/D39109
llvm-svn: 316611
|
|
|
|
|
|
|
|
|
| |
For the shuffle instructions in reductions we need at least sm_30
but the user may want to customize the default architecture.
Differential Revision: https://reviews.llvm.org/D38883
llvm-svn: 315996
|
|
|
|
|
|
|
|
|
| |
If the user passes -nocudalib, we can live without it being present.
Simplify the code by just checking whether LibDeviceMap is empty.
Differential Revision: https://reviews.llvm.org/D38901
llvm-svn: 315902
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37914
llvm-svn: 314217
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Enable the -nocudalib flag for the OpenMP device offloading toolchain as well. Currently it can only be used for the CUDA toolchain.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, hfinkel, tra
Reviewed By: tra
Subscribers: hfinkel, cfe-commits
Differential Revision: https://reviews.llvm.org/D37913
llvm-svn: 314164
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
needed.
Summary: When composing the output file name, the path to the file is being dropped. The full path is required.
Reviewers: Hahnfeld, ABataev, caomhin, carlo.bertolli, hfinkel, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37912
llvm-svn: 314156
|
|
|
|
| |
llvm-svn: 314154
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37914
llvm-svn: 314150
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38090
llvm-svn: 313820
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
llvm-svn: 312734
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create a separate test file to contain all tests for OpenMP
offloading to GPUs.
Make libdevice checking more robust by accounting for
the case in which no libdevice is found.
This changes are in connrection with diff: D29660
llvm-svn: 310718
|
|
|
|
|
|
|
|
| |
until a better way to perform these tests is figured out.
Change connected to diff: D29654
llvm-svn: 310625
|
|
|
|
|
|
|
|
|
|
| |
Commit r310489 caused 'openmp-offload.c' test failures on Darwin and other
platforms:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/39230/testReport/junit/Clang/Driver/openmp_offload_c/
The follow-up commits tried to fix the test, but the test is still failing.
llvm-svn: 310580
|