summaryrefslogtreecommitdiffstats
path: root/clang/test/SemaCUDA
Commit message (Collapse)AuthorAgeFilesLines
* [CUDA][HIP] Allow function-scope static const variableYaxun Liu2018-07-281-4/+9
| | | | | | | | | | | | | | | | | | | | | | CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__ function, only __shared__ variables or variables without any device memory qualifiers may be declared with static storage class. It is unclear how a function-scope non-const static variable without device memory qualifier is implemented, therefore only static const variable without device memory qualifier is allowed, which can be emitted as a global variable in constant address space. Currently clang only allows function-scope static variable with __shared__ qualifier. This patch also allows function-scope static const variable without device memory qualifier and emits it as a global variable in constant address space. Differential Revision: https://reviews.llvm.org/D49931 llvm-svn: 338188
* [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributesYaxun Liu2018-06-121-46/+16
| | | | | | | | | | | | There are HIP applications e.g. Tensorflow 1.3 using amdgpu kernel attributes, however currently they are only allowed on OpenCL kernel functions. This patch will allow amdgpu kernel attributes to be applied to CUDA/HIP __global__ functions. Differential Revision: https://reviews.llvm.org/D47958 llvm-svn: 334561
* [CUDA] Check initializers of instantiated template variables.Artem Belevich2018-06-061-0/+17
| | | | | | | | | We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 334143
* [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.Justin Lebar2018-05-171-6/+21
| | | | | | | | | | | | | | | Summary: Previously this triggered a -Wundefined-internal warning. But it's not an undefined variable -- any variable of this form is a pointer to the base of GPU core's shared memory. Reviewers: tra Subscribers: sanjoy, rsmith Differential Revision: https://reviews.llvm.org/D46782 llvm-svn: 332621
* Fix typos in clangAlexander Kornienko2018-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
* Revert "[CUDA] Check initializers of instantiated template variables."Artem Belevich2018-04-041-17/+0
| | | | | | | This (temporarily) reverts commit r329127 due to the problems it exposed in TensorFlow. llvm-svn: 329229
* [CUDA] Check initializers of instantiated template variables.Artem Belevich2018-04-031-0/+17
| | | | | | | | | We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 329127
* Fix some handling of AST nodes with diagnostics.Richard Trieu2018-03-282-4/+4
| | | | | | | | | The diagnostic system for Clang can already handle many AST nodes. Instead of converting them to strings first, just hand the AST node directly to the diagnostic system and let it handle the output. Minor changes in some diagnostic output. llvm-svn: 328688
* [CUDA] Fixed false error reporting in case of calling H->G->HD->D.Artem Belevich2018-03-231-0/+7
| | | | | | | | | | Launching a kernel from the host code does not generate code for the kernel itself. This fixes an issue with clang erroneously reporting an error for a HD->D call from within the kernel. Differential Revision: https://reviews.llvm.org/D44837 llvm-svn: 328362
* [CUDA] Allow external variables in separate compilationJonas Hahnfeld2018-02-141-0/+5
| | | | | | | | | | | | According to the CUDA Programming Guide this is prohibited in whole program compilation mode. This makes sense because external references cannot be satisfied in that mode anyway. However, such variables are allowed in separate compilation mode which is a valid use case. Differential Revision: https://reviews.llvm.org/D42923 llvm-svn: 325136
* [CUDA] Report "unsupported VLA" errors only on device side.Artem Belevich2017-11-283-4/+4
| | | | | | | | | | | | This fixes erroneously reported CUDA compilation errors in host-side code during device-side compilation. I've also restricted OpenMP-specific checks to trigger only if we're compiling with OpenMP enabled. Differential Revision: https://reviews.llvm.org/D40275 llvm-svn: 319201
* Determine the attribute subject for diagnostics based on declarative ↵Aaron Ballman2017-11-261-1/+1
| | | | | | | | | | information in DeclNodes.td. This greatly reduces the number of enumerated values used for more complex diagnostics; these are now only required when the "attribute only applies to" diagnostic needs to be generated manually as part of semantic processing. This also clarifies some terminology used by the diagnostic (methods -> Objective-C methods, fields -> non-static data members, etc). Many of the tests needed to be updated in multiple places for the diagnostic wording tweaks. The first instance of the diagnostic for that attribute is fully specified and subsequent instances cut off the complete list (to make it easier if additional subjects are added in the future for the attribute). llvm-svn: 319002
* [CUDA] When compilation fails, print the compilation mode.Justin Lebar2017-09-071-0/+7
| | | | | | | | | | | | | | | | | | Summary: That is, instead of "1 error generated", we now say "1 error generated when compiling for sm_35". This (partially) solves a usability foogtun wherein e.g. users call a function that's only defined on sm_60 when compiling for sm_35, and they get an unhelpful error message. Reviewers: tra Subscribers: sanjoy, cfe-commits Differential Revision: https://reviews.llvm.org/D37548 llvm-svn: 312736
* PR34163: Don't cache an incorrect key function for a class if queried betweenRichard Smith2017-08-122-6/+6
| | | | | | | | | | | the class becoming complete and its inline methods being parsed. This replaces the hack of using the "late parsed template" flag to track member functions with bodies we've not parsed yet; instead we now use the "will have body" flag, which carries the desired implication that the function declaration *is* a definition, and that we've just not parsed its body yet. llvm-svn: 310776
* [CUDA] Let NVPTX inherit the host's calling conventions.Justin Lebar2017-01-051-0/+30
| | | | | | | | | | | | | | | | | | | | Summary: When compiling device code, we may still see host code with explicit calling conventions. NVPTX needs to claim that it supports these CCs, so that (a) we don't raise noisy warnings, and (b) we don't break existing code which relies on the existence of these CCs when specializing templates. (If a CC doesn't exist, clang ignores it, so two template specializations which are different only insofar as one specifies a CC are considered identical and therefore are an error if that CC is not supported.) Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28323 llvm-svn: 291136
* [CUDA] Add __declspec spellings for CUDA attributes.Justin Lebar2017-01-051-0/+34
| | | | | | | | | | | | Summary: CUDA attributes are spelled __declspec(__foo__) on Windows. Reviewers: tra Subscribers: cfe-commits, rnk Differential Revision: https://reviews.llvm.org/D28321 llvm-svn: 291134
* [CUDA] Ignore implicit target attributes during function template instantiation.Artem Belevich2016-12-081-5/+36
| | | | | | | | | | | | | | | | | | | | | | | Some functions and templates are treated as __host__ __device__ even when they don't have explicitly specified target attributes. What's worse, this treatment may change depending on command line options (-fno-cuda-host-device-constexpr) or #pragma clang force_cuda_host_device. Combined with strict checking for matching function target that comes with D25809(r288962), it makes it hard to write code which would explicitly instantiate or specialize some functions regardless of pragmas or command line options in effect. This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered. This makes base template selection behave consistently regardless of pragma of command line options that may affect CUDA target. Differential Revision: https://reviews.llvm.org/D25845 llvm-svn: 289091
* [CUDA] Improve target attribute checking for function templates.Artem Belevich2016-12-073-55/+114
| | | | | | | | | | | | * __host__ __device__ functions are no longer considered to be redeclarations of __host__ or __device__ functions. This prevents unintentional merging of target attributes across them. * Function target attributes are not considered (and must match) during explicit instantiation and specialization of function templates. Differential Revision: https://reviews.llvm.org/D25809 llvm-svn: 288962
* [CUDA] Use only the GVALinkage on function definitions.Justin Lebar2016-11-081-0/+52
| | | | | | | | | | | | | | | | | | | Summary: Previously we'd look at the GVALinkage of whatever FunctionDecl you happened to be calling. This is not right. In the absence of the gnu_inline attribute, to be handled separately, the function definition determines the function's linkage. So we need to wait until we get a def before we can know whether something is known-emitted. Reviewers: tra Subscribers: cfe-commits, rsmith Differential Revision: https://reviews.llvm.org/D26268 llvm-svn: 286313
* [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on ↵Justin Lebar2016-10-281-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | functions without bodies. Summary: In CUDA compilation, we call isInlineDefinitionExternallyVisible (via getGVALinkageForFunction) on functions while parsing their definitions. At the point in time when we call getGVALinkageForFunction, we haven't yet added the body to the function, so we trip this assert. But as far as I can tell, this is harmless. To work around this, we add a new flag to FunctionDecl, "WillHaveBody". There was other code that was working around the existing assert with a really awful hack -- this change lets us get rid of that hack. Reviewers: rsmith, tra Subscribers: aemerson, cfe-commits Differential Revision: https://reviews.llvm.org/D25640 llvm-svn: 285410
* [CUDA] Simplify some repeated diagnostic expectations in CUDA tests.Justin Lebar2016-10-214-22/+7
| | | | | | | | Instead of repeating the diagnostic, use "expected-note N". Test-only change. llvm-svn: 284882
* Declare H and H new/delete.Artem Belevich2016-10-211-1/+45
| | | | llvm-svn: 284879
* [CUDA] When we emit an error that might have been deferred, also print a ↵Justin Lebar2016-10-197-4/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | callstack. Summary: Previously, when you did something not allowed in a host+device function and then caused it to be codegen'ed, we would print out an error telling you that you did something bad, but we wouldn't tell you how we decided that the function needed to be codegen'ed. This change causes us to print out a callstack when emitting deferred errors. This is immensely helpful when debugging highly-templated code, where it's often unclear how a function became known-emitted. We only print the callstack once per function, after we print the all deferred errors. This patch also switches all of our hashtables to using canonical FunctionDecls instead of regular FunctionDecls. This prevents a number of bugs, some of which are caught by tests added here, in which we assume that two FDs for the same function have the same pointer value. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25704 llvm-svn: 284647
* [CUDA] Emit errors for wrong-side calls made on the same line as ↵Justin Lebar2016-10-191-0/+39
| | | | | | | | | | | | | | | | | | | | | | | non-wrong-side calls. Summary: This fixes two related bugs: 1) Previously, if you had a non-wrong side call at some source code location L, we wouldn't emit errors for wrong-side calls that appeared at L. 2) We'd only emit one wrong-side error per source code location, when we actually want to emit it twice if we hit this line more than once due to e.g. template instantiation. Reviewers: tra Subscribers: rnk, cfe-commits Differential Revision: https://reviews.llvm.org/D25702 llvm-svn: 284643
* [CUDA] Rework tests now that we emit deferred diagnostics during sema. ↵Justin Lebar2016-10-1910-137/+86
| | | | | | | | | | | | | | | | | | | | | | Test-only change. Summary: Previously we had to split out a lot of our tests into a test that checked only immediate errors and a test that checked only deferred errors. This was because, if you emitted any immediate errors, we wouldn't run codegen, where the deferred errors were emitted. We've fixed this, and now emit deferred errors during sema. This lets us merge a bunch of tests, and lets us convert some other tests to -fsyntax-only. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25755 llvm-svn: 284553
* [CUDA] Fix false-positive in known-emitted handling.Justin Lebar2016-10-171-0/+44
| | | | | | | | | | | | | | | | | Previously: When compiling for host, our constructed call graph went *through* kernel calls. This meant that if we had host calls kernel calls HD we would incorrectly mark the HD function as known-emitted on the host side, and thus perform host-side checks on it. Fixing this exposed another issue, wherein when marking a function as known-emitted, we also need to traverse the callgraph of its template, because non-dependent calls are attached to a function's template, not its instantiation. llvm-svn: 284355
* [CUDA] Emit deferred diagnostics during Sema rather than during codegen.Justin Lebar2016-10-134-10/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Emitting deferred diagnostics during codegen was a hack. It did work, but usability was poor, both for us as compiler devs and for users. We don't codegen if there are any sema errors, so for users this meant that they wouldn't see deferred errors if there were any non-deferred errors. For devs, this meant that we had to carefully split up our tests so that when we tested deferred errors, we didn't emit any non-deferred errors. This change moves checking for deferred errors into Sema. See the big comment in SemaCUDA.cpp for an overview of the idea. This checking adds overhead to compilation, because we have to maintain a partial call graph. As a result, this change makes deferred errors a CUDA-only concept (whereas before they were a general concept). If anyone else wants to use this framework for something other than CUDA, we can generalize at that time. This patch makes the minimal set of test changes -- after this lands, I'll go back through and do a cleanup of the tests that we no longer have to split up. Reviewers: rnk Subscribers: cfe-commits, rsmith, tra Differential Revision: https://reviews.llvm.org/D25541 llvm-svn: 284158
* [CUDA] Allow static variables in __host__ __device__ functions, so long as ↵Justin Lebar2016-10-132-3/+23
| | | | | | | | | | | | they're never codegen'ed for device. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25150 llvm-svn: 284145
* [CUDA] Disallow __shared__ variables in host functions.Justin Lebar2016-10-131-0/+1
| | | | | | | | | | Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25143 llvm-svn: 284144
* [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIf{Device,Host}Code().Justin Lebar2016-10-132-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Together these let you easily create diagnostics that - are never emitted for host code - are always emitted for __device__ and __global__ functions, and - are emitted for __host__ __device__ functions iff these functions are codegen'ed. At the moment there are only three diagnostics that need this treatment, but I have more to add, and it's not sustainable to write code for emitting every such diagnostic twice, and from a special wrapper in SemaCUDA.cpp. While we're at it, don't emit the function name in err_cuda_device_exceptions: It's not necessary to print it, and making this work in the new framework in the face of a null value for dyn_cast<FunctionDecl>(CurContext) isn't worth the effort. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25139 llvm-svn: 284143
* Added REQUIRED triples to the test that fails on some ARM buildbots.Artem Belevich2016-10-121-0/+3
| | | | llvm-svn: 283964
* [CUDA] Make touching a kernel from a __host__ __device__ function a deferred ↵Justin Lebar2016-10-123-19/+37
| | | | | | | | | | | | | | | | error. Previously, this was an immediate, don't pass go, don't collect $200 error. But this precludes us from writing code like __host__ __device__ void launch_kernel() { kernel<<<...>>>(); } Such code isn't wrong, following our notions of right and wrong in CUDA, unless it's codegen'ed. llvm-svn: 283963
* Aligned allocation versus CUDA: make deallocation function preference orderRichard Smith2016-10-111-0/+13
| | | | | | | | | | | | | match other CUDA preference orders, per discussion with jlebar. We now model this in an attempt to match overload resolution as closely as possible: - First, we throw out all non-callable (due to CUDA host/device mismatch) operator delete functions. - Then we apply sizedness / alignedness preferences based on whether the type is overaligned and whether the deallocation function is a member. - Finally, we use the CUDA callability preference as a tiebreaker. llvm-svn: 283830
* [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.Justin Lebar2016-10-081-6/+6
| | | | | | | | | | | | Summary: This matches the idiom we use for our other CUDA wrapper headers. Reviewers: tra Subscribers: beanz, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D24978 llvm-svn: 283679
* [CUDA] Do a better job at detecting wrong-side calls.Justin Lebar2016-10-082-1/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Move CheckCUDACall from ActOnCallExpr and BuildDeclRefExpr to DiagnoseUseOfDecl. This lets us catch some edge cases we were missing, specifically around class operators. This necessitates a few other changes: - Avoid emitting duplicate deferred diags in CheckCUDACall. Previously we'd carefully placed our call to CheckCUDACall such that it would only ever run once for a particular callsite. But now this isn't the case. - Emit deferred diagnostics from a template specialization/instantiation's primary template, in addition to from the specialization/instantiation itself. DiagnoseUseOfDecl ends up putting the deferred diagnostics on the template, rather than the specialization, so we need to check both. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D24573 llvm-svn: 283637
* [CUDA] Disallow overloading destructors.Justin Lebar2016-10-033-50/+33
| | | | | | | | | | | | | | | | | | | | | | Summary: We'd attempted to allow this, but turns out we were doing a very bad job. :) Making this work properly would be a giant change in clang. For example, we'd need to make CXXRecordDecl::getDestructor() context-sensitive, because the destructor you end up with depends on where you're calling it from. For now (and hopefully for ever), just disallow overloading of destructors in CUDA. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D24571 llvm-svn: 283120
* [CUDA] Allow extern __shared__ on empty-length arrays.Justin Lebar2016-10-021-1/+10
| | | | | | "extern __shared__ int x[]" is OK. llvm-svn: 283068
* [CUDA] Disallow __constant__ local variables.Justin Lebar2016-09-301-0/+8
| | | | | | | | | | Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25129 llvm-svn: 282986
* [CUDA] Disallow 'extern __shared__' variables.Justin Lebar2016-09-302-0/+16
| | | | | | | | | | | | | | | | | | Also add a test that we disallow __constant__ __shared__ int x; because it's possible to break this without breaking __shared__ __constant__ int x; Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25125 llvm-svn: 282985
* [CUDA] Fix implicit-device-lambda.cu after r282911.Justin Lebar2016-09-301-2/+2
| | | | | | | This commit added a warning that we're (correctly) hitting in this test. Just ignore it. llvm-svn: 282927
* [CUDA] Make lambdas inherit __host__ and __device__ attributes from the ↵Justin Lebar2016-09-302-0/+113
| | | | | | | | | | | | | | scope in which they're created. Summary: NVCC compat. Fixes bug 30567. Reviewers: tra Subscribers: cfe-commits, rnk Differential Revision: https://reviews.llvm.org/D25105 llvm-svn: 282880
* [CUDA] Disallow variable-length arrays in CUDA device code.Justin Lebar2016-09-282-0/+33
| | | | | | | | | | Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D25050 llvm-svn: 282647
* [CUDA] Disallow exceptions in device code.Justin Lebar2016-09-282-0/+59
| | | | | | | | | | Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D25036 llvm-svn: 282646
* [AMDGPU] Expose flat work group size, register and wave control attributesKonstantin Zhuravlyov2016-09-262-14/+110
| | | | | | | | | __attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size __attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit Differential Revision: https://reviews.llvm.org/D24513 llvm-svn: 282371
* [CUDA] Add test checking our ability to take a function pointer to a ↵Justin Lebar2016-09-141-0/+31
| | | | | | | | | | | | | | __global__ function on the host side. Summary: This functionality is used by Thrust. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24581 llvm-svn: 281543
* [CUDA] Do not merge CUDA target attributes.Artem Belevich2016-09-132-0/+40
| | | | | | | | | | | | CUDA target attributes are used for function overloading and must not be merged. This fixes a bug where attributes were inherited during function template specialization in CUDA and made it impossible for specialized function to provide its own target attributes. Differential Revision: https://reviews.llvm.org/D24522 llvm-svn: 281406
* [CUDA] Fix "declared here" note on deferred wrong-side errors.Justin Lebar2016-08-162-0/+19
| | | | | | | Previously we weren't deferring these "declared here" notes, which is obviously wrong. llvm-svn: 278767
* [CUDA] Raise an error if a wrong-side call is codegen'ed.Justin Lebar2016-08-153-0/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Some function calls in CUDA are allowed to appear in semantically-correct programs but are an error if they're ever codegen'ed. Specifically, a host+device function may call a host function, but it's an error if such a function is ever codegen'ed in device mode (and vice versa). Previously, clang made no attempt to catch these errors. For the most part, they would be caught by ptxas, and reported as "call to unknown function 'foo'". Now we catch these errors and report them the same as we report other illegal calls (e.g. a call from a host function to a device function). This has a small change in error-message behavior for calls that were previously disallowed (e.g. calls from a host to a device function). Previously, we'd catch disallowed calls fairly early, before doing additional semantic checking e.g. of the call's arguments. Now we catch these illegal calls at the very end of our semantic checks, so we'll only emit a "illegal CUDA call" error if the call is otherwise well-formed. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23242 llvm-svn: 278759
* [CUDA] Reject calls to __device__ functions from host variable global ↵Justin Lebar2016-08-101-0/+32
| | | | | | | | | | | | initializers. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23335 llvm-svn: 278196
* [CUDA] Print a "previous-decl" note when calling an illegal member fn.Justin Lebar2016-08-101-3/+3
| | | | | | | | | | | | | | | Summary: When we emit err_ref_bad_target, we should emit a "'method' declared here" note. We already do so in most places, just not in BuildCallToMemberFunction. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23240 llvm-svn: 278195
OpenPOWER on IntegriCloud