summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenCUDA
Commit message (Collapse)AuthorAgeFilesLines
...
* [CUDA] Make printf work.Justin Lebar2016-01-232-0/+55
| | | | | | | | | | | | | | | | | | Summary: The code in CGCUDACall is largely based on a patch written by Eli Bendersky: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html That patch implemented an LLVM pass lowering printf to vprintf; this one does something similar, but in Clang codegen. Reviewers: echristo Subscribers: cfe-commits, jhen, tra, majnemer Differential Revision: http://reviews.llvm.org/D16372 llvm-svn: 258642
* [CUDA] Make vtable construction aware of host/device side of CUDA compilation.Artem Belevich2015-12-171-0/+61
| | | | | | | | | | | | | | | | | | | C++ emits vtables for classes that have key function present in the current TU. While we compile CUDA the fact that key function was found in this TU does not mean that we are going to generate code for it. E.g. vtable for a class with host-only methods should not (and can not) be generated on device side, because we'll never generate code for them during device-side compilation. This patch adds an extra CUDA-specific check during key method computation and filters out potential key methods that are not suitable for this side of CUDA compilation. When we codegen vtable, entries for unsuitable methods are set to null. Differential Revision: http://reviews.llvm.org/D15309 llvm-svn: 255911
* Allow linking multiple bitcode files.Artem Belevich2015-10-272-2/+32
| | | | | | | | | | | | | | | | | | Linking options for particular file depend on the option that specifies the file. Currently there are two: * -mlink-bitcode-file links in complete content of the specified file. * -mlink-cuda-bitcode links in only the symbols needed by current TU. Linked symbols are internalized. This bitcode linking mode is used to link device-specific bitcode provided by CUDA. Files are linked in order they are specified on command line. -mlink-cuda-bitcode replaces -fcuda-uses-libdevice flag. Differential Revision: http://reviews.llvm.org/D13913 llvm-svn: 251427
* [CUDA] __global__ functions should always be visible externally.Artem Belevich2015-09-231-6/+1
| | | | | | | | | Adjust __global__ functions with DiscardableODR linkage to use StrongODR linkage instead, so they are visible externally. Differential Revision: http://reviews.llvm.org/D13067 llvm-svn: 248400
* [CUDA] Allow function overloads in CUDA based on host/device attributes.Artem Belevich2015-09-221-0/+214
| | | | | | | | | | | | | | | | | | | The patch makes it possible to parse CUDA files that contain host/device functions with identical signatures, but different attributes without having to physically split source into host-only and device-only parts. This change is needed in order to parse CUDA header files that have a lot of name clashes with standard include files. Gory details are in design doc here: https://goo.gl/EXnymm Feel free to leave comments there or in this review thread. This feature is controlled with CC1 option -fcuda-target-overloads and is disabled by default. Differential Revision: http://reviews.llvm.org/D12453 llvm-svn: 248295
* [CUDA] Add implicit __attribute__((used)) to all __global__ functions.Artem Belevich2015-09-221-0/+15
| | | | | | | | | | This makes sure that we emit kernels that were instantiated from the host code and which would never be explicitly referenced by anything else on device side. Differential Revision: http://reviews.llvm.org/D11666 llvm-svn: 248293
* [CUDA] Postprocess bitcode linked in during device-side CUDA compilation.Artem Belevich2015-09-102-0/+94
| | | | | | | | Link in and internalize the symbols we need from supplied bitcode library. Differential Revision: http://reviews.llvm.org/D11664 llvm-svn: 247317
* [CUDA] Allow trivial constructors as initializer for __shared__ variables.Artem Belevich2015-09-101-0/+14
| | | | | | Differential Revision: http://reviews.llvm.org/D12739 llvm-svn: 247307
* [CUDA] Change initializer for CUDA device code based on CUDA documentation.Jingyue Wu2015-08-222-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to CUDA documentation, global variables declared with __device__, __constant__ can be initialized from host code, so mark them as externally initialized. Because __shared__ variables cannot have an initialization as part of their declaration and since the value maybe kept across different kernel invocation, the value of __shared__ is effectively undefined instead of zero initialized. Wrongly using zero initializer may cause illegitimate optimization, e.g. removing unused __constant__ variable because it's not updated in the device code and the value is initialized with zero. Test Plan: test/CodeGenCUDA/address-spaces.cu Patch by Xuetian Weng Reviewers: jholewinski, eliben, tra, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12241 llvm-svn: 245786
* Revert "[CUDA] Add implicit __attribute__((used)) to all __global__ functions."Daniel Jasper2015-08-111-15/+0
| | | | | | This is breaking internal test. I'll provide a reproduction. llvm-svn: 244583
* [CUDA] Add implicit __attribute__((used)) to all __global__ functions.Artem Belevich2015-08-101-0/+15
| | | | | | | | | This allows emitting kernels that were instantiated from the host code and which would never be explicitly referenced otherwise. Differential Revision: http://reviews.llvm.org/D11666 llvm-svn: 244501
* [cuda] Fixed test case failure on s390xArtem Belevich2015-05-111-2/+2
| | | | llvm-svn: 237007
* Fixed test failure on machines with 32-bit size_t.Artem Belevich2015-05-071-1/+1
| | | | llvm-svn: 236773
* [cuda] Include GPU binary into host object file and generate init/deinit code.Artem Belevich2015-05-071-1/+40
| | | | | | | | | | | | - added -fcuda-include-gpubinary option to incorporate results of device-side compilation into host-side one. - generate code to register GPU binaries and associated kernels with CUDA runtime and clean-up on exit. - added test case for init/deinit code generation. Differential Revision: http://reviews.llvm.org/D9507 llvm-svn: 236765
* [cuda] treat file scope __asm as __host__ and ignore it during device-side ↵Artem Belevich2015-04-271-0/+6
| | | | | | | | | | | | | | | | compilation. Currently clang emits file-scope asm during *both* host and device compilation modes which is usually a wrong thing to do. There's no way to attach any attribute to an __asm statement, so there's no way to differentiate between host-side and device-side file-scope asm. This patch makes clang to match nvcc behavior and emit file-scope-asm only during host-side compilation. Differential Revision: http://reviews.llvm.org/D9270 llvm-svn: 235905
* [cuda] Allow using integral non-type template parameters as launch_bounds ↵Artem Belevich2015-04-211-0/+51
| | | | | | | | | | | | | | | | attribute arguments. - Changed CUDALaunchBounds arguments from integers to Expr* so they can be saved in AST for instantiation. - Added support for template instantiation of launch_bounds attrubute. - Moved evaluation of launch_bounds arguments to NVPTXTargetCodeGenInfo:: SetTargetAttributes() where it can be done after template instantiation. - Added a warning on negative launch_bounds arguments. - Amended test cases. Differential Revision: http://reviews.llvm.org/D8985 llvm-svn: 235452
* [cuda] Added support for CUDA built-in variables.Artem Belevich2015-04-211-0/+28
| | | | | | | | | | | | | | | | | Added cuda_builtin_vars.h which implements built-in CUDA variables using __declattr(property). Fields of built-in variables (except for warpSize) are implemented using __declattr(property) which replaces read/write of a member field with a call to a getter/setter member function, in this case with appropriate NVPTX builtin. Added a test case to check diagnostics on attempt to construct or improperly access a built-in variable. Differential Revision: http://reviews.llvm.org/D9064 llvm-svn: 235448
* Revert r235398 "[cuda] Added support for CUDA built-in variables."Artem Belevich2015-04-211-28/+0
| | | | | | r235398 was causing buildbot break due to missing Makefile changes. llvm-svn: 235401
* [cuda] Added support for CUDA built-in variables.Artem Belevich2015-04-211-0/+28
| | | | | | | | | | | | | | | | | Added cuda_builtin_vars.h which implements built-in CUDA variables using __declattr(property). Fields of built-in variables (except for warpSize) are implemented using __declattr(property) which replaces read/write of a member field with a call to a getter/setter member function, in this case with appropriate NVPTX builtin. Added a test case to check diagnostics on attempt to construct or improperly access a built-in variable. Differential Revision: http://reviews.llvm.org/D9064 llvm-svn: 235398
* Fix addrspace when emitting constructors of static local variablesJingyue Wu2015-03-251-0/+17
| | | | | | | | | | | | | | | | | | | | Summary: Due to CUDA's implicit address space casting, the type of a static local variable may be more specific (i.e. with address space qualifiers) than the type expected by the constructor. Emit an addrspacecast in that case. Test Plan: Clang used to crash on the added test. Reviewers: nlewycky, pcc, eliben, rsmith Reviewed By: eliben, rsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8575 llvm-svn: 233208
* Test case updates for explicit type parameter to the gep operatorDavid Blaikie2015-03-131-1/+1
| | | | llvm-svn: 232187
* Update Clang tests to handle explicitly typed load changes in LLVM.David Blaikie2015-02-271-6/+6
| | | | llvm-svn: 230795
* CUDA: Add option to allow host device functions to call host functionsJacques Pienaar2015-02-241-0/+32
| | | | | | Commiting code from review http://reviews.llvm.org/D7841 llvm-svn: 230385
* When generating llvm.used, we may need an addrspacecast instead of a bitcast.Justin Holewinski2015-02-021-0/+8
| | | | | | | | | | | | | | | | | | Summary: This is especially important for targets that use multiple address spaces, and commonly place global variables in address spaces other than zero. Fixes PR22383 Test Plan: New test case added: llvm-used.cu Reviewers: jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7345 llvm-svn: 227861
* IR: Make metadata typeless in assembly, clang sideDuncan P. N. Exon Smith2014-12-152-4/+4
| | | | | | Match LLVM changes from r224257. llvm-svn: 224259
* Move all CUDA testing inputs to Inputs/ subdirectory inside the tests.Eli Bendersky2014-04-287-6/+26
| | | | llvm-svn: 207453
* Add test case for r206302Eli Bendersky2014-04-151-0/+30
| | | | llvm-svn: 206303
* Proper handling of static local variables with address space qualifiers.Eli Bendersky2014-03-241-3/+69
| | | | | | | | Similar to the implementation for globals in r157167. Patch by Jingyue Wu. llvm-svn: 204677
* Remove the -cxx-abi command-line flag.Hans Wennborg2014-01-141-2/+2
| | | | | | | | | | | | | | | This makes the C++ ABI depend entirely on the target: MS ABI for -win32 triples, Itanium otherwise. It's no longer possible to do weird combinations. To be able to run a test with a specific ABI without constraining it to a specific triple, new substitutions are added to lit: %itanium_abi_triple and %ms_abi_triple can be used to get the current target triple adjusted to the desired ABI. For example, if the test suite is running with the i686-pc-win32 target, %itanium_abi_triple will expand to i686-pc-mingw32. Differential Revision: http://llvm-reviews.chandlerc.com/D2545 llvm-svn: 199250
* Update tests in preparation for using the MS ABI for Win32 targetsHans Wennborg2014-01-131-2/+2
| | | | | | | | | | In preparation for making the Win32 triple imply MS ABI mode, make all tests pass in this mode, or make them use the Itanium mode explicitly. Differential Revision: http://llvm-reviews.chandlerc.com/D2401 llvm-svn: 199130
* Fix test failures after addrspacecast added.Matt Arsenault2013-11-151-3/+3
| | | | | | Bitcasts between address spaces are no longer allowed. llvm-svn: 194765
* CHECK-LABEL-ify some code gen tests to improve diagnostic experience when ↵Stephen Lin2013-08-151-2/+2
| | | | | | tests fail. llvm-svn: 188447
* Use kernel metadata to differentiate between kernel and deviceJustin Holewinski2013-03-301-3/+7
| | | | | | functions for the NVPTX target. llvm-svn: 178418
* CUDA: give static storage class to __shared__ and __constant__Peter Collingbourne2012-08-281-0/+8
| | | | | | | | variables without a storage class within a function, to implement CUDA B.2.5: "__shared__ and __constant__ variables have implied static storage [duration]." llvm-svn: 162788
* CUDA: give correct address space to globals declared in functionsPeter Collingbourne2012-08-281-0/+4
| | | | llvm-svn: 162787
* Replace PTX back-end with NVPTX back-end in all places where Clang caresJustin Holewinski2012-05-242-8/+8
| | | | | | NV_CONTRIB llvm-svn: 157403
* CUDA: add CodeGen support for global variable address spaces.Peter Collingbourne2012-05-201-0/+24
| | | | | | | | | Because in CUDA types do not have associated address spaces, globals are declared in their "native" address space, and accessed by bitcasting the pointer to address space 0. This relies on address space 0 being a unified address space. llvm-svn: 157167
* CUDA: IR generation support for device stubsPeter Collingbourne2011-10-061-0/+13
| | | | llvm-svn: 141304
* CUDA: add -fcuda-is-device flagPeter Collingbourne2011-10-062-1/+41
| | | | | | | This frontend-only flag is used by the IR generator to determine whether to filter CUDA declarations for the host or for the device. llvm-svn: 141301
* CUDA: IR generation support for kernel call expressionsPeter Collingbourne2011-10-061-0/+13
| | | | llvm-svn: 141300
* CUDA: set proper calling conventions for PTXPeter Collingbourne2011-10-061-0/+12
llvm-svn: 141296
OpenPOWER on IntegriCloud