path: root/clang/lib/Driver/ToolChains/Cuda.cpp
Commit message | Author | Age | Files | Lines
...
* [OpenMP] Provide a default GPU arch that is supported by the underlying hardware. | Gheorghe-Teodor Bercea | 2017-08-10 | 1 | -3/+7
  This fixes a bug triggered by diff: D29660
  llvm-svn: 310549
* [OpenMP] Enable executable lookup into driver directory. | Gheorghe-Teodor Bercea | 2017-08-09 | 1 | -0/+3
  Summary: Invoking the compiler inside a script causes the clang-offload-bundler executable to not be found. This patch enables the lookup for executables in the driver directory where the clang-offload-bundler resides.
  Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev, caomhin
  Reviewed By: hfinkel
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D36537
  llvm-svn: 310513
* [OpenMP] Add flag for overwriting default PTX version for OpenMP targets | Gheorghe-Teodor Bercea | 2017-08-09 | 1 | -1/+7
  Summary: This flag "--fopenmp-ptx=" enables the overwriting of the default PTX version used for GPU offloaded OpenMP target regions: "+ptx42".
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
  Reviewed By: ABataev
  Subscribers: rengolin, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29660
  llvm-svn: 310489
* [OpenMP] Add flag for disabling the default generation of relocatable OpenMP target code for NVIDIA GPUs. | Gheorghe-Teodor Bercea | 2017-08-09 | 1 | -1/+4
  Summary: Previously we have added the "-c" flag which gets passed to PTXAS by default to generate relocatable OpenMP target code by default. This set of flags exposes control over this behaviour.
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
  Reviewed By: ABataev
  Subscribers: Hahnfeld, rengolin, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29659
  llvm-svn: 310484
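As a rough illustration of the default-with-opt-out behaviour this commit describes, the sketch below assembles a ptxas argument list in which "-c" is present unless the caller turns it off. The function `buildPtxasArgs` and its `Relocatable` parameter are hypothetical names chosen for the example. They are not the driver's actual code or flag spelling.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not the actual Clang driver code): build a ptxas
// command line for an OpenMP target region.
std::vector<std::string> buildPtxasArgs(const std::string &GpuArch,
                                        const std::string &Input,
                                        const std::string &Output,
                                        bool Relocatable = true) {
  std::vector<std::string> Args = {"--gpu-name", GpuArch,
                                   "--output-file", Output};
  // "-c" asks ptxas for relocatable code; it is on by default here and
  // omitted only when the caller opts out.
  if (Relocatable)
    Args.push_back("-c");
  Args.push_back(Input);
  return Args;
}
```

Calling `buildPtxasArgs("sm_35", "in.s", "out.cubin")` would include "-c", while passing `false` as the last argument leaves it out.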
* [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default | Gheorghe-Teodor Bercea | 2017-08-09 | 1 | -0/+4
  Original Diff: D29642
  This patch was previously reverted due to an error with patch D29654 that this depends on.
  llvm-svn: 310479
* [OpenMP] OpenMP device offloading code generation produces a cubin file which is then integrated in the host binary using the host linker. | Gheorghe-Teodor Bercea | 2017-08-08 | 1 | -3/+88
  Diff: D29654
  llvm-svn: 310362
* Revert r310291, r310300 and r310332 because of test failure on Darwin | Alex Lorenz | 2017-08-08 | 1 | -92/+3
  The commit r310291 introduced the failure. r310332 was a test fix commit and r310300 was a followup commit. I reverted these two to avoid merge conflicts when reverting.
  The 'openmp-offload.c' test is failing on Darwin because the following run lines:
    // RUN: touch %t1.o
    // RUN: touch %t2.o
    // RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %t1.o %t2.o 2>&1 \
    // RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s
  trigger the following assertion:
    Driver.cpp:3418: assert(CachedResults.find(ActionTC) != CachedResults.end() && "Result does not exist??");
  llvm-svn: 310345
* [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default | Gheorghe-Teodor Bercea | 2017-08-07 | 1 | -0/+4
  Summary: When device offloading is enabled and the device is an NVIDIA GPU, OpenMP target regions must be compiled with relocation enabled by passing the "-c" flag to the PTXAS invocation.
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
  Reviewed By: Hahnfeld
  Subscribers: Hahnfeld, rengolin, mkuron, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29642
  llvm-svn: 310300
* [OpenMP] Pass -v to PTXAS if it was passed to the driver. | Gheorghe-Teodor Bercea | 2017-08-07 | 1 | -0/+4
  Summary: When compiling code being offloaded by OpenMP to an NVIDIA GPU, pass the -v to PTXAS if it was passed to the CLANG driver.
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar
  Reviewed By: jlebar
  Subscribers: Hahnfeld, rengolin, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29644
  llvm-svn: 310295
* [OpenMP] Integrate OpenMP target region cubin into host binary | Gheorghe-Teodor Bercea | 2017-08-07 | 1 | -3/+88
  Summary: OpenMP device offloading code generation produces a cubin file which is then integrated in the host binary using the host linker.
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, rnk, hfinkel, tstellar
  Reviewed By: hfinkel
  Subscribers: sfantao, rnk, rengolin, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29654
  llvm-svn: 310291
* [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading | Gheorghe-Teodor Bercea | 2017-08-07 | 1 | -2/+19
  Summary: OpenMP has the ability to offload target regions to devices which may have different architectures. A new -fopenmp-target-arch flag is introduced to specify the device architecture. In this patch I use the new flag to specify the compute capability of the underlying NVIDIA architecture for the OpenMP offloading CUDA tool chain.
  Only a host-offloading test is provided since full device offloading capability will only be available when D29654 (https://reviews.llvm.org/D29654) lands.
  Reviewers: hfinkel, Hahnfeld, carlo.bertolli, caomhin, ABataev
  Reviewed By: hfinkel
  Subscribers: guansong, cfe-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D34784
  llvm-svn: 310263
* [OpenMP] Extend CLANG target options with device offloading kind. | Gheorghe-Teodor Bercea | 2017-07-06 | 1 | -13/+38
  Summary: Pass the type of the device offloading when building the tool chain for a particular target architecture. This is required when supporting multiple tool chains that target a single device type. In our particular use case, the OpenMP and CUDA tool chains will use the same addClangTargetOptions method. This enables the reuse of common options and ensures control over options only supported by a particular tool chain.
  Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar, Hahnfeld
  Reviewed By: hfinkel
  Subscribers: jgravelle-google, aheejin, rengolin, jfb, dschuff, sbc100, cfe-commits
  Differential Revision: https://reviews.llvm.org/D29647
  llvm-svn: 307272
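The idea of one option-building method shared by two tool chains, with the offloading kind selecting the tool-chain-specific options, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `OffloadKind` enum, the free function `addTargetOptions`, and the placeholder option strings are hypothetical and do not reproduce the real addClangTargetOptions signature or any real cc1 option.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's notion of an offloading kind.
enum class OffloadKind { Cuda, OpenMP };

// Hypothetical sketch: a single routine used for both tool chains.
// All option strings are placeholders, not real cc1 options.
std::vector<std::string> addTargetOptions(OffloadKind Kind) {
  std::vector<std::string> CC1Args;
  // Options common to every tool chain that targets this device type.
  CC1Args.push_back("<shared-option>");
  // Options supported by only one of the tool chains.
  switch (Kind) {
  case OffloadKind::Cuda:
    CC1Args.push_back("<cuda-only-option>");
    break;
  case OffloadKind::OpenMP:
    CC1Args.push_back("<openmp-only-option>");
    break;
  }
  return CC1Args;
}
```

Keeping the shared options in one place and branching only for the kind-specific ones is the reuse the summary refers to.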
* [Driver] Consolidate tools and toolchains by target platform. (NFC) | David L. Jones | 2017-03-08 | 1 | -0/+488
  Summary: (This is a move-only refactoring patch. There are no functionality changes.)
  This patch splits apart the Clang driver's tool and toolchain implementation files. Each target platform toolchain is moved to its own file, along with the closest-related tools. Each target platform toolchain has separate headers and implementation files, so the hierarchy of classes is unchanged.
  There are some remaining shared free functions, mostly from Tools.cpp. Several of these move to their own architecture-specific files, similar to r296056. Some of them are only used by a single target platform; since the tools and toolchains are now together, some helpers now live in a platform-specific file. The balance are helpers related to manipulating argument lists, so they are now in a new file pair, CommonArgs.h and .cpp.
  I've tried to cluster the code logically, which is fairly straightforward for most of the target platforms and shared architectures. I think I've made reasonable choices for these, as well as the various shared helpers; but of course, I'm happy to hear feedback in the review.
  There are some particular things I don't like about this patch, but haven't been able to find a better overall solution. The first is the proliferation of files: there are several files that are tiny because the toolchain is not very different from its base (usually the Gnu tools/toolchain). I think this is mostly a reflection of the true complexity, though, so it may not be "fixable" in any reasonable sense. The second thing I don't like are the includes like "../Something.h". I've avoided this largely by clustering into the current file structure. However, a few of these includes remain, and in those cases it doesn't make sense to me to sink an existing file any deeper.
  Reviewers: rsmith, mehdi_amini, compnerd, rnk, javed.absar
  Subscribers: emaste, jfb, danalbert, srhines, dschuff, jyknight, nemanjai, nhaehnle, mgorny, cfe-commits
  Differential Revision: https://reviews.llvm.org/D30372
  llvm-svn: 297250