summaryrefslogtreecommitdiffstats
path: root/clang/test/SemaCUDA
Commit message (Collapse)AuthorAgeFilesLines
* [CUDA][HIP} Add a test for constexpr default ctorYaxun Liu2019-10-111-0/+33
| | | | | | Differential Revision: https://reviews.llvm.org/D68753 llvm-svn: 374502
* [CUDA][HIP] Fix host/device check with -fopenmpYaxun Liu2019-10-094-1/+65
| | | | | | | | | | | | | | | | CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is only passed to host compilation to take advantages of multi-threads computation. CUDA/HIP and OpenMP both use Sema::DeviceCallGraph to store functions to be analyzed and remove them once they decide the function is sure to be emitted. CUDA/HIP and OpenMP have different functions to determine if a function is sure to be emitted. To check host/device correctly for CUDA/HIP when -fopenmp is enabled, there needs a unified logic to determine whether a function is to be emitted. The logic needs to be aware of both CUDA and OpenMP logic. Differential Revision: https://reviews.llvm.org/D67837 llvm-svn: 374263
* [clang] [AST] Treat "inline gnu_inline" the same way as "extern inline ↵Martin Storsjo2019-09-271-1/+1
| | | | | | | | | | | | | | | | | gnu_inline" in C++ mode This matches how GCC handles it, see e.g. https://gcc.godbolt.org/z/HPplnl. GCC documents the gnu_inline attribute with "In C++, this attribute does not depend on extern in any way, but it still requires the inline keyword to enable its special behavior." The previous behaviour of gnu_inline in C++, without the extern keyword, can be traced back to the original commit that added support for gnu_inline, SVN r69045. Differential Revision: https://reviews.llvm.org/D67414 llvm-svn: 373078
* [CUDA][HIP] Enable kernel function return type deduction.Michael Liao2019-09-251-0/+44
| | | | | | | | | | | | | | | | | Summary: - Even though only `void` is still accepted as the deduced return type, enabling deduction/instantiation on the return type allows more consistent coding. Reviewers: tra, jlebar Subscribers: cfe-commits, yaxunl Tags: #clang Differential Revision: https://reviews.llvm.org/D68031 llvm-svn: 372898
* [CUDA][HIP] Fix hostness of defaulted constructorYaxun Liu2019-09-201-0/+43
| | | | | | | | | | | Clang does not respect the explicit device host attributes of defaulted special members. Also clang does not respect the hostness of special members determined by their first declarations. Clang also adds duplicate implicit device or host attributes in certain cases. This patch fixes that. Differential Revision: https://reviews.llvm.org/D67509 llvm-svn: 372394
* [CUDA][HIP] Re-apply part of r372318.Michael Liao2019-09-192-2/+21
| | | | | | | | | | - r372318 causes violation of `use-of-uninitialized-value` detected by MemorySanitizer. Once `Viable` field is set to false, `FailureKind` needs setting as well as it will be checked during destruction if `Viable` is not true. - Revert the part trying to skip `std::vector` erasing. llvm-svn: 372356
* Revert "[CUDA][HIP] Fix typo in `BestViableFunction`"Mitch Phillips2019-09-192-21/+2
| | | | | | | | Broke the msan buildbots (see comments on rL372318 for more details). This reverts commit eb231d15825ac345b546f4c99372d1cac8f14f02. llvm-svn: 372353
* [CUDA][HIP] Fix typo in `BestViableFunction`Michael Liao2019-09-192-2/+21
| | | | | | | | | | | | | | | | | | | Summary: - Should consider viable ones only when checking SameSide candidates. - Replace erasing with clearing viable flag to reduce data moving/copying. - Add one and revise another one as the diagnostic message are more relevant compared to previous one. Reviewers: tra Subscribers: cfe-commits, yaxunl Tags: #clang Differential Revision: https://reviews.llvm.org/D67730 llvm-svn: 372318
* Renamed and changed the wording of warn_cconv_ignoredSunil Srivastava2019-07-171-1/+1
| | | | | | | | As discussed in D64780 the wording of this warning message is being changed to say 'is not supported' instead of 'ignored', and the diag ID itself is being changed to warn_cconv_not_supported. llvm-svn: 366368
* [HIP] Support attribute hip_pinned_shadowYaxun Liu2019-06-261-0/+25
| | | | | | | | | | | | | | | This patch introduces support of hip_pinned_shadow variable for HIP. A hip_pinned_shadow variable is a global variable with attribute hip_pinned_shadow. It has external linkage on device side and has no initializer. It has internal linkage on host side and has initializer or static constructor. It can be accessed in both device code and host code. This allows HIP runtime to implement support of HIP texture reference. Differential Revision: https://reviews.llvm.org/D62738 llvm-svn: 364381
* Permit redeclarations of a builtin to specify calling convention.Erich Keane2019-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | After https://reviews.llvm.org/rL355317 we noticed that quite a decent amount of code redeclares builtins (memcpy in particular, I believe reduced from an MSVC header) with a calling convention specified. This gets particularly troublesome when the user specifies a new 'default' calling convention on the command line. When looking to add a diagnostic for this case, it was noticed that we had 3 other diagnostics that differed only slightly. This patch ALSO unifies those under a 'select'. Unfortunately, the order of words in ONE of these diagnostics was reversed ("'thiscall' calling convention" vs "calling convention 'thiscall'"), so this patch also standardizes on the former. Differential Revision: https://reviews.llvm.org/D59560 Change-Id: I79f99fe7c2301640755ffdd774b46eb44526bb22 llvm-svn: 356663
* [CUDA][HIP][Sema] Fix template kernel with function as template parameterYaxun Liu2019-03-052-3/+8
| | | | | | | | | | | | | | | | | | | | If a kernel template has a function as its template parameter, a device function should be allowed as template argument since a kernel can call a device function. However, currently if the kernel template is instantiated in a host function, clang will emit an error message saying the device function is an invalid candidate for the template parameter. This happens because clang checks the reference to the device function during parsing the template arguments. At this point, the template is not instantiated yet. Clang incorrectly assumes the device function is called by the host function and emits the error message. This patch fixes the issue by disabling checking of device function during parsing template arguments and deferring the check to the instantion of the template. At that point, the template decl is already available, therefore the check can be done against the instantiated function template decl. Differential Revision: https://reviews.llvm.org/D56411 llvm-svn: 355421
* [CUDA][HIP] Check calling convention based on function targetYaxun Liu2019-02-261-0/+4
| | | | | | | | | | | | MSVC header files using vectorcall to differentiate overloaded functions, which causes failure for AMDGPU target. This is because clang does not check function calling convention based on function target. This patch checks calling convention using the proper target info. Differential Revision: https://reviews.llvm.org/D57716 llvm-svn: 354929
* [OPENMP][CUDA]Do not emit warnings for variables in late-reported asmAlexey Bataev2019-02-261-5/+5
| | | | | | | | | | | | statements. If the assembler instruction is not generated and the delayed diagnostic is emitted, we may end up with extra warning message for variables used in the asm statement. Since the asm statement is not built, the variables may be left non-referenced and it may produce a warning about a use of the non-initialized variables. llvm-svn: 354928
* [AMDGPU] Allow using integral non-type template parametersMichael Liao2019-02-261-1/+117
| | | | | | | | | | | | | | | | | | | Summary: - Allow using integral non-type template parameters in the following attributes __attribute__((amdgpu_flat_work_group_size(<min>, <max>))) __attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) Reviewers: kzhuravl, yaxunl Subscribers: jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58623 llvm-svn: 354909
* [OPENMP] Delayed diagnostics for VLA support.Alexey Bataev2019-02-221-2/+9
| | | | | | Generalized processing of the deferred diagnostics for OpenMP/CUDA code. llvm-svn: 354690
* Revert "[OPENMP] Delayed diagnostics for VLA support."Alexey Bataev2019-02-221-9/+2
| | | | | | | This reverts commit r354679 to fix the problem with the Windows buildbots llvm-svn: 354680
* [OPENMP] Delayed diagnostics for VLA support.Alexey Bataev2019-02-221-2/+9
| | | | | | Generalized processing of the deferred diagnostics for OpenMP/CUDA code. llvm-svn: 354679
* [CUDA]Delayed diagnostics for the asm instructions.Alexey Bataev2019-02-221-0/+118
| | | | | | | | | Adapted targetDiag for the CUDA and used for the delayed diagnostics in asm constructs. Works for both host and device compilation sides. Differential Revision: https://reviews.llvm.org/D58463 llvm-svn: 354671
* Revert "[CUDA]Delayed diagnostics for the asm instructions."Alexey Bataev2019-02-211-118/+0
| | | | | | | This reverts commit r354593 to fix the problem with the crash on windows. llvm-svn: 354596
* [CUDA]Delayed diagnostics for the asm instructions.Alexey Bataev2019-02-211-0/+118
| | | | | | | | | | | | | | | | Summary: Adapted targetDiag for the CUDA and used for the delayed diagnostics in asm constructs. Works for both host and device compilation sides. Reviewers: tra, jlebar Subscribers: jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58463 llvm-svn: 354593
* [CUDA] add support for the new kernel launch API in CUDA-9.2+.Artem Belevich2019-01-312-5/+17
| | | | | | | | | | | | | Instead of calling CUDA runtime to arrange function arguments, the new API constructs arguments in a local array and the kernels are launched with __cudaLaunchKernel(). The old API has been deprecated and is expected to go away in the next CUDA release. Differential Revision: https://reviews.llvm.org/D57488 llvm-svn: 352799
* [HIP] Fix size_t for MSVC environmentYaxun Liu2019-01-301-0/+7
| | | | | | | | | | | In 64 bit MSVC environment size_t is defined as unsigned long long. In single source language like HIP, data layout should be consistent in device and host compilation, therefore copy data layout controlling fields from Aux target for AMDGPU target. Differential Revision: https://reviews.llvm.org/D56318 llvm-svn: 352620
* [CUDA][HIP] Do not diagnose use of _Float16Yaxun Liu2019-01-291-0/+7
| | | | | | | | | | | r352221 caused regressions in CUDA/HIP since device function may use _Float16 whereas host does not support it. In this case host compilation should not diagnose usage of _Float16 in device functions or variables. For now just do not diagnose _Float16 for CUDA/HIP. In the future we should have more precise check. Differential Revision: https://reviews.llvm.org/D57369 llvm-svn: 352488
* [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructorsYaxun Liu2018-10-092-0/+294
| | | | | | | | | | | | | | | | | | | | | ShouldDeleteSpecialMember is called upon inherited constructors. It calls inferCUDATargetForImplicitSpecialMember. Normally the special member enum passed to ShouldDeleteSpecialMember matches the constructor. However this is not true when inherited constructor is passed, where DefaultConstructor is passed to treat the inherited constructor as DefaultConstructor. However inferCUDATargetForImplicitSpecialMember expects the special member enum argument to match the constructor, which results in assertion when this expection is not satisfied. This patch checks whether the constructor is inherited. If true it will get the real special member enum for the constructor and pass it to inferCUDATargetForImplicitSpecialMember. Differential Revision: https://reviews.llvm.org/D51809 llvm-svn: 344057
* [HIP] Support early finalization of device code for -fno-gpu-rdcYaxun Liu2018-10-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original options as aliases. When -fgpu-rdc is off, clang will assume the device code in each translation unit does not call external functions except those in the device library, therefore it is possible to compile the device code in each translation unit to self-contained kernels and embed them in the host object, so that the host object behaves like usual host object which can be linked by lld. The benefits of this feature is: 1. allow users to create static libraries which can be linked by host linker; 2. amortized device code linking time. This patch modifies HIP action builder to insert actions for linking device code and generating HIP fatbin, and pass HIP fatbin to host backend action. It extracts code for constructing command for generating HIP fatbin as a function so that it can be reused by early finalization. It also modifies codegen of HIP host constructor functions to embed the device fatbin when it is available. Differential Revision: https://reviews.llvm.org/D52377 llvm-svn: 343611
* [cxx2a] P0641R2: (Some) type mismatches on defaulted functions onlyRichard Smith2018-09-281-1/+1
| | | | | | | | | | | | | | | | render the function deleted instead of rendering the program ill-formed. This change also adds an enabled-by-default warning for the case where an explicitly-defaulted special member function of a non-template class is implicitly deleted by the type checking rules. (This fires either due to this language change or due to pre-C++20 reasons for the member being implicitly deleted). I've tested this on a large codebase and found only bugs (where the program means something that's clearly different from what the programmer intended), so this is enabled by default, but we should revisit this if there are problems with this being enabled by default. llvm-svn: 343285
* [CUDA] Ignore uncallable functions when we check for usual deallocators.Artem Belevich2018-09-212-3/+98
| | | | | | | | | Previously clang considered function variants from both sides of compilation and that resulted in picking up wrong deallocation function. Differential Revision: https://reviews.llvm.org/D51808 llvm-svn: 342749
* Revert the tests that should've been reverted in rL341115Artem Belevich2018-08-301-2/+2
| | | | llvm-svn: 341118
* [CUDA/OpenMP] Define only some host macros during device compilationJonas Hahnfeld2018-08-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When compiling CUDA or OpenMP device code Clang parses header files that expect certain predefined macros from the host architecture. To make this work the compiler passes the host triple via the -aux-triple argument and (until now) pulls in all macros for that "auxiliary triple" unconditionally. However this results in defines like __SSE_MATH__ that will trigger inline assembly making use of the "advertised" target features. See the discussion of D47849 and PR38464 for a detailed explanation of the encountered problems. Instead of blacklisting "known bad" examples this patch starts adding defines that are needed for certain headers like bits/wordsize.h and bits/mathinline.h. The disadvantage of this approach is that it decouples the definitions from their target toolchain. However in my opinion it's more important to keep definitions for one header close together. For one this will include a clear documentation why these particular defines are needed. Furthermore it simplifies maintenance because adding defines for a new header or support for a new aux-triple only needs to touch one piece of code. Differential Revision: https://reviews.llvm.org/D50845 llvm-svn: 340681
* [CUDA][HIP] Allow function-scope static const variableYaxun Liu2018-07-281-4/+9
| | | | | | | | | | | | | | | | | | | | | | CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__ function, only __shared__ variables or variables without any device memory qualifiers may be declared with static storage class. It is unclear how a function-scope non-const static variable without device memory qualifier is implemented, therefore only static const variable without device memory qualifier is allowed, which can be emitted as a global variable in constant address space. Currently clang only allows function-scope static variable with __shared__ qualifier. This patch also allows function-scope static const variable without device memory qualifier and emits it as a global variable in constant address space. Differential Revision: https://reviews.llvm.org/D49931 llvm-svn: 338188
* [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributesYaxun Liu2018-06-121-46/+16
| | | | | | | | | | | | There are HIP applications e.g. Tensorflow 1.3 using amdgpu kernel attributes, however currently they are only allowed on OpenCL kernel functions. This patch will allow amdgpu kernel attributes to be applied to CUDA/HIP __global__ functions. Differential Revision: https://reviews.llvm.org/D47958 llvm-svn: 334561
* [CUDA] Check initializers of instantiated template variables.Artem Belevich2018-06-061-0/+17
| | | | | | | | | We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 334143
* [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.Justin Lebar2018-05-171-6/+21
| | | | | | | | | | | | | | | Summary: Previously this triggered a -Wundefined-internal warning. But it's not an undefined variable -- any variable of this form is a pointer to the base of GPU core's shared memory. Reviewers: tra Subscribers: sanjoy, rsmith Differential Revision: https://reviews.llvm.org/D46782 llvm-svn: 332621
* Fix typos in clangAlexander Kornienko2018-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
* Revert "[CUDA] Check initializers of instantiated template variables."Artem Belevich2018-04-041-17/+0
| | | | | | | This (temporarily) reverts commit r329127 due to the problems it exposed in TensorFlow. llvm-svn: 329229
* [CUDA] Check initializers of instantiated template variables.Artem Belevich2018-04-031-0/+17
| | | | | | | | | We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 329127
* Fix some handling of AST nodes with diagnostics.Richard Trieu2018-03-282-4/+4
| | | | | | | | | The diagnostic system for Clang can already handle many AST nodes. Instead of converting them to strings first, just hand the AST node directly to the diagnostic system and let it handle the output. Minor changes in some diagnostic output. llvm-svn: 328688
* [CUDA] Fixed false error reporting in case of calling H->G->HD->D.Artem Belevich2018-03-231-0/+7
| | | | | | | | | | Launching a kernel from the host code does not generate code for the kernel itself. This fixes an issue with clang erroneously reporting an error for a HD->D call from within the kernel. Differential Revision: https://reviews.llvm.org/D44837 llvm-svn: 328362
* [CUDA] Allow external variables in separate compilationJonas Hahnfeld2018-02-141-0/+5
| | | | | | | | | | | | According to the CUDA Programming Guide this is prohibited in whole program compilation mode. This makes sense because external references cannot be satisfied in that mode anyway. However, such variables are allowed in separate compilation mode which is a valid use case. Differential Revision: https://reviews.llvm.org/D42923 llvm-svn: 325136
* [CUDA] Report "unsupported VLA" errors only on device side.Artem Belevich2017-11-283-4/+4
| | | | | | | | | | | | This fixes erroneously reported CUDA compilation errors in host-side code during device-side compilation. I've also restricted OpenMP-specific checks to trigger only if we're compiling with OpenMP enabled. Differential Revision: https://reviews.llvm.org/D40275 llvm-svn: 319201
* Determine the attribute subject for diagnostics based on declarative ↵Aaron Ballman2017-11-261-1/+1
| | | | | | | | | | information in DeclNodes.td. This greatly reduces the number of enumerated values used for more complex diagnostics; these are now only required when the "attribute only applies to" diagnostic needs to be generated manually as part of semantic processing. This also clarifies some terminology used by the diagnostic (methods -> Objective-C methods, fields -> non-static data members, etc). Many of the tests needed to be updated in multiple places for the diagnostic wording tweaks. The first instance of the diagnostic for that attribute is fully specified and subsequent instances cut off the complete list (to make it easier if additional subjects are added in the future for the attribute). llvm-svn: 319002
* [CUDA] When compilation fails, print the compilation mode.Justin Lebar2017-09-071-0/+7
| | | | | | | | | | | | | | | | | | Summary: That is, instead of "1 error generated", we now say "1 error generated when compiling for sm_35". This (partially) solves a usability foogtun wherein e.g. users call a function that's only defined on sm_60 when compiling for sm_35, and they get an unhelpful error message. Reviewers: tra Subscribers: sanjoy, cfe-commits Differential Revision: https://reviews.llvm.org/D37548 llvm-svn: 312736
* PR34163: Don't cache an incorrect key function for a class if queried betweenRichard Smith2017-08-122-6/+6
| | | | | | | | | | | the class becoming complete and its inline methods being parsed. This replaces the hack of using the "late parsed template" flag to track member functions with bodies we've not parsed yet; instead we now use the "will have body" flag, which carries the desired implication that the function declaration *is* a definition, and that we've just not parsed its body yet. llvm-svn: 310776
* [CUDA] Let NVPTX inherit the host's calling conventions.Justin Lebar2017-01-051-0/+30
| | | | | | | | | | | | | | | | | | | | Summary: When compiling device code, we may still see host code with explicit calling conventions. NVPTX needs to claim that it supports these CCs, so that (a) we don't raise noisy warnings, and (b) we don't break existing code which relies on the existence of these CCs when specializing templates. (If a CC doesn't exist, clang ignores it, so two template specializations which are different only insofar as one specifies a CC are considered identical and therefore are an error if that CC is not supported.) Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28323 llvm-svn: 291136
* [CUDA] Add __declspec spellings for CUDA attributes.Justin Lebar2017-01-051-0/+34
| | | | | | | | | | | | Summary: CUDA attributes are spelled __declspec(__foo__) on Windows. Reviewers: tra Subscribers: cfe-commits, rnk Differential Revision: https://reviews.llvm.org/D28321 llvm-svn: 291134
* [CUDA] Ignore implicit target attributes during function template instantiation.Artem Belevich2016-12-081-5/+36
| | | | | | | | | | | | | | | | | | | | | | | Some functions and templates are treated as __host__ __device__ even when they don't have explicitly specified target attributes. What's worse, this treatment may change depending on command line options (-fno-cuda-host-device-constexpr) or #pragma clang force_cuda_host_device. Combined with strict checking for matching function target that comes with D25809(r288962), it makes it hard to write code which would explicitly instantiate or specialize some functions regardless of pragmas or command line options in effect. This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered. This makes base template selection behave consistently regardless of pragma of command line options that may affect CUDA target. Differential Revision: https://reviews.llvm.org/D25845 llvm-svn: 289091
* [CUDA] Improve target attribute checking for function templates.Artem Belevich2016-12-073-55/+114
| | | | | | | | | | | | * __host__ __device__ functions are no longer considered to be redeclarations of __host__ or __device__ functions. This prevents unintentional merging of target attributes across them. * Function target attributes are not considered (and must match) during explicit instantiation and specialization of function templates. Differential Revision: https://reviews.llvm.org/D25809 llvm-svn: 288962
* [CUDA] Use only the GVALinkage on function definitions.Justin Lebar2016-11-081-0/+52
| | | | | | | | | | | | | | | | | | | Summary: Previously we'd look at the GVALinkage of whatever FunctionDecl you happened to be calling. This is not right. In the absence of the gnu_inline attribute, to be handled separately, the function definition determines the function's linkage. So we need to wait until we get a def before we can know whether something is known-emitted. Reviewers: tra Subscribers: cfe-commits, rsmith Differential Revision: https://reviews.llvm.org/D26268 llvm-svn: 286313
* [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on ↵Justin Lebar2016-10-281-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | functions without bodies. Summary: In CUDA compilation, we call isInlineDefinitionExternallyVisible (via getGVALinkageForFunction) on functions while parsing their definitions. At the point in time when we call getGVALinkageForFunction, we haven't yet added the body to the function, so we trip this assert. But as far as I can tell, this is harmless. To work around this, we add a new flag to FunctionDecl, "WillHaveBody". There was other code that was working around the existing assert with a really awful hack -- this change lets us get rid of that hack. Reviewers: rsmith, tra Subscribers: aemerson, cfe-commits Differential Revision: https://reviews.llvm.org/D25640 llvm-svn: 285410
OpenPOWER on IntegriCloud