| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
https://reviews.llvm.org/D38371
This patch implements codegen for the combined 'teams distribute' OpenMP pragma and adds regression tests for all its clauses.
llvm-svn: 314905
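For orientation, a minimal sketch of source code that would go through this codegen path. The helper name `scale_with_teams` and its parameters are hypothetical and not taken from the patch; when OpenMP is disabled the pragmas are ignored and the loop runs serially.

```cpp
#include <cassert>

// Hypothetical helper, not part of the patch: multiplies data[0..n) by factor.
void scale_with_teams(int *data, int n, int factor) {
  assert(data != nullptr && n >= 0);
  // 'teams distribute' has to be nested directly inside a 'target' region.
  #pragma omp target map(tofrom: data[0:n])
  #pragma omp teams distribute
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}
```

The loop iterations are divided among the initial threads of the teams; no additional threads are started inside a team.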
|
| |
|
|
|
|
|
|
|
|
| |
declaration.
Patch allows the `#pragma omp declare target` and `#pragma omp end
declare target` directives to be used inside structures when only some
static members need to be marked as declare target.
llvm-svn: 314833
|
| |
|
|
|
|
|
|
|
|
| |
directives.
The argument of the `device` clause in target-based executable
directives must be captured to support codegen for the `target`
directives with the `depend` clauses.
llvm-svn: 314686
|
| |
|
|
| |
llvm-svn: 314220
|
| |
|
|
|
|
|
|
|
|
| |
directives.
If a variable is used in a target-based region but is not found in
any private or mapping clause, then generate implicit firstprivate or map
clauses for these implicitly mapped variables.
llvm-svn: 314205
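A small sketch of the situation described, assuming the OpenMP 4.5 defaults (scalars become firstprivate, arrays become map(tofrom)). The function `scaled_sum` is a hypothetical illustration, not code from the patch.

```cpp
#include <cassert>

// Hypothetical example: neither variable appears in a clause on 'target'.
int scaled_sum(int factor) {
  int data[4] = {1, 2, 3, 4};
  // Assumed implicit clauses: firstprivate(factor) and map(tofrom: data).
  #pragma omp target
  for (int i = 0; i < 4; ++i)
    data[i] *= factor;
  // The host sees the update because the array is mapped back.
  assert(data[0] == factor);
  int sum = 0;
  for (int i = 0; i < 4; ++i)
    sum += data[i];
  return sum;
}
```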
|
| |
|
|
|
|
|
|
|
| |
If the captured variable has some redeclarations we may run into the
situation where the redeclaration is used instead of the canonical
declaration and we may consider this variable as one not captured
before.
llvm-svn: 313880
|
| |
|
|
|
|
|
|
|
|
|
| |
When the value specified for n in ordered(n) is larger than the number of loops, a segmentation fault can occur in one of two ways when attempting to print a diagnostic for an associated depend(sink : vec):
1) The iteration vector vec contains fewer than n items
2) The iteration vector vec contains a variable that is not a loop control variable
This patch addresses both of these issues.
Differential Revision: https://reviews.llvm.org/D38049
llvm-svn: 313675
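For contrast with the two failing cases, a sketch of a well-formed doacross loop in OpenMP 4.5 syntax: n equals the loop depth, and each sink vector has exactly n entries built from the loop control variables. The function `lattice_paths` is a hypothetical example, not the patch's test case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: counts monotone lattice paths across an n x n grid.
long lattice_paths(int n) {
  assert(n >= 1);
  std::vector<std::vector<long>> grid(n, std::vector<long>(n, 1));
  // ordered(2) matches the two perfectly nested loops below.
  #pragma omp parallel for ordered(2)
  for (int i = 1; i < n; ++i)
    for (int j = 1; j < n; ++j) {
      // Wait for the cell above and the cell to the left.
      #pragma omp ordered depend(sink: i - 1, j) depend(sink: i, j - 1)
      grid[i][j] = grid[i - 1][j] + grid[i][j - 1];
      // Signal that this iteration's result is available.
      #pragma omp ordered depend(source)
    }
  return grid[n - 1][n - 1];
}
```

Writing `ordered(3)` here, or a sink vector such as `depend(sink: i - 1)`, would be the kind of input for which the diagnostic path used to crash.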
|
| |
|
|
|
|
|
| |
According to the upcoming OpenMP 5.0, all classes/structs are now considered
mappable, even polymorphic ones and ones with static members.
llvm-svn: 313141
|
| |
|
|
|
|
|
|
|
|
| |
move constructor.
Previously, a user-defined reduction initializer was treated as an
assignment expression rather than as an initializer. Fixed by treating the
initializer expression as an initializer.
llvm-svn: 312638
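A sketch of a user-defined reduction whose `initializer` clause has to act as an initialization, since the type has no default constructor to assign into. The `Range` type, the `widen` reduction identifier and `find_range` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical type with a user-provided constructor and no default one.
struct Range {
  int lo, hi;
  Range(int l, int h) : lo(l), hi(h) {}
};

// The initializer clause initializes omp_priv; it is not a plain assignment.
#pragma omp declare reduction(widen : Range :                 \
        omp_out = Range(std::min(omp_out.lo, omp_in.lo),       \
                        std::max(omp_out.hi, omp_in.hi)))      \
    initializer(omp_priv = Range(INT_MAX, INT_MIN))

Range find_range(const std::vector<int> &v) {
  assert(!v.empty());
  Range r(INT_MAX, INT_MIN);
  #pragma omp parallel for reduction(widen : r)
  for (std::size_t i = 0; i < v.size(); ++i) {
    r.lo = std::min(r.lo, v[i]);
    r.hi = std::max(r.hi, v[i]);
  }
  return r;
}
```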
|
| |
|
|
|
|
|
|
|
|
|
|
| |
step>1.
If the loop is a loop with random access iterators and the increment
is written as it += n, then the compiler crashed because the same
MaterializedTemporaryExpr around n was reused. Patch fixes it
by using the expression as written, without any special kind of
wrapping.
llvm-svn: 312292
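A sketch of the loop shape in question: a worksharing loop over random access iterators with a step greater than 1. The function name `sum_every_second` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: sums the elements at positions 0, 2, 4, ...
long sum_every_second(const std::vector<int> &v) {
  assert(v.size() % 2 == 0);  // keeps 'it += 2' from stepping past end()
  long sum = 0;
  #pragma omp parallel for reduction(+ : sum)
  for (std::vector<int>::const_iterator it = v.begin(); it < v.end(); it += 2)
    sum += *it;
  return sum;
}
```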
|
| |
|
|
|
|
|
|
| |
Previously, global variables were captured only in target regions. Patch
fixes this and allows capturing of globals in all target executable
directives.
llvm-svn: 312024
|
| |
|
|
| |
llvm-svn: 311908
|
| |
|
|
|
|
|
|
|
| |
SEGFAULT at compile time
Compiler crashed when it tried to rebuild a non-template expression in
a dependent context.
llvm-svn: 311777
|
| |
|
|
|
|
|
|
|
|
|
| |
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310387
|
| |
|
|
|
|
| |
This reverts commit r310377.
llvm-svn: 310379
|
| |
|
|
|
|
|
|
|
|
|
| |
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310377
|
| |
|
|
|
|
| |
This reverts commit r310360.
llvm-svn: 310364
|
| |
|
|
|
|
|
|
|
|
|
| |
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310360
|
| |
|
|
|
|
| |
This reverts commit r310104.
llvm-svn: 310135
|
| |
|
|
|
|
|
|
|
|
|
| |
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310104
|
| |
|
|
|
|
|
| |
According to the upcoming OpenMP 5.0, all addressable lvalue expressions are
allowed in the depend clause.
llvm-svn: 309309
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added codegen for task-based directive with in_reduction clause.
```
<body>
```
The next code is emitted:
```
void *td;
...
td = call i8* @__kmpc_task_reduction_init();
...
<type> *priv = (<type> *)call i8* @__kmpc_task_reduction_get_th_data(i32 GTID, i8* td, i8* <orig>)
```
llvm-svn: 309270
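A sketch of user code that pairs `in_reduction` on tasks with `task_reduction` on the enclosing taskgroup, which is the combination the emitted runtime calls serve. The function `chunked_sum` and its chunking scheme are hypothetical; the constructs require a compiler with OpenMP 5.0 task reduction support.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: one task per chunk, all reducing into 'total'.
long chunked_sum(const std::vector<int> &v, std::size_t chunk) {
  assert(chunk > 0);
  long total = 0;
  #pragma omp parallel
  #pragma omp single
  {
    // The taskgroup owns the reduction; participating tasks join it.
    #pragma omp taskgroup task_reduction(+ : total)
    {
      for (std::size_t begin = 0; begin < v.size(); begin += chunk) {
        std::size_t end = std::min(begin + chunk, v.size());
        #pragma omp task in_reduction(+ : total) firstprivate(begin, end)
        for (std::size_t i = begin; i < end; ++i)
          total += v[i];
      }
    }
  }
  return total;
}
```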
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added codegen for taskgroup directive with task_reduction clause.
```
<body>
```
The next code is emitted:
```
%struct.kmp_task_red_input_t red_init[n];
void *td;
call void @__kmpc_taskgroup(%ident_t id, i32 gtid)
...
red_init[i].shar = &<item>;
red_init[i].size = sizeof(<item>);
red_init[i].init = (void*)initializer_function;
red_init[i].fini = (void*)destructor_function;
red_init[i].comb = (void*)combiner_function;
red_init[i].flags = flags;
...
td = call i8* @__kmpc_task_reduction_init(i32 gtid, i32 n, i8* (void*)red_init);
call void @__kmpc_end_taskgroup(%ident_t id, i32 gtid)
void initializer_function(i8* priv) {
*(<type>*)priv = <red_init>;
ret void;
}
void destructor_function(i8* priv) {
(<type>*)priv->~();
ret void;
}
void combiner_function(i8* inout, i8* in) {
*(<type>*)inout = *(<type>*)inout <red_id> *(<type>*)in;
ret void;
}
```
llvm-svn: 308979
|
| |
|
|
|
|
|
| |
This patch allows the in_reduction clause to be used even if the innermost
directive is not taskgroup.
llvm-svn: 308883
|
| |
|
|
| |
llvm-svn: 308783
|
| |
|
|
| |
llvm-svn: 308773
|
| |
|
|
|
|
|
| |
Parsing/sema analysis for 'in_reduction' clause for task-based
directives.
llvm-svn: 308768
|
| |
|
|
| |
llvm-svn: 308759
|
| |
|
|
|
|
|
|
| |
If the member declaration is captured in the OMPCapturedExprDecl, we may
lose data-sharing attribute info for this declaration. Patch fixes this
bug.
llvm-svn: 308629
|
| |
|
|
|
|
| |
Parsing/sema analysis of the 'task_reduction' clause.
llvm-svn: 308352
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 308317
|
| |
|
|
|
|
| |
Adds codegen for taskloop-based directives.
llvm-svn: 308174
|
| |
|
|
|
|
|
|
| |
Added checks for the reduction clauses in the taskloop directives:
1. Only addressable items may be used in reduction clauses.
2. Reduction clauses cannot be used with nogroup clauses.
llvm-svn: 307693
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Combined directives like 'target parallel' have two captured statements.
Sema has to check the right one from the right direction.
Previously, Sema::IsOpenMPCapturedByRef would return false for mapped
scalars on combined directives. This results in a wrong signature of
the outlined function which triggers an assertion:
void llvm::CallInst::init(llvm::FunctionType *, llvm::Value *, ArrayRef<llvm::Value *>, ArrayRef<OperandBundleDef>, const llvm::Twine &): Assertion `(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && "Calling a function with a bad signature!"' failed.
Fixes PR30975 (and PR31985). New function was taken from clang-ykt.
Differential Revision: https://reviews.llvm.org/D34888
llvm-svn: 306956
|
| |
|
|
|
|
|
|
| |
According to OpenMP 5.0 at least one 'map' or 'use_device_ptr' clause
must be specified for 'target data' construct. Patch adds support for
this feature.
llvm-svn: 304216
|
| |
|
|
|
|
|
| |
Add an extra check for the iterator during checks of the data-sharing
attributes.
llvm-svn: 301549
|
| |
|
|
|
|
|
| |
Remove some unnecessary code from the function after the fix for ASAN
buildbots.
llvm-svn: 301547
|
| |
|
|
| |
llvm-svn: 301536
|
| |
|
|
|
|
|
|
|
| |
If a function template is instantiated while OpenMP code is being
handled, the compiler may currently crash because it tries to capture
variables in non-capturing function scopes. Patch fixes this
bug.
llvm-svn: 301416
|
| |
|
|
|
|
|
| |
Threadprivate variables do not need to be handled in the Stack of all
directives; moving them out for better performance and memory usage.
llvm-svn: 301410
|
| |
|
|
|
|
|
|
|
|
| |
omp for
https://reviews.llvm.org/D32237
This patch prepares sema with additional fields to support all those composite and combined constructs of OpenMP that include pragma 'distribute' and 'for', such as 'distribute parallel for'. It also extends the regression tests for 'distribute parallel for' and adds a new one.
llvm-svn: 300802
|
| |
|
|
|
|
|
|
| |
- also replace direct equality checks against the ConstantEvaluated enumerator with isConstantEvaluated(), in anticipation of adding finer granularity to the various ConstantEvaluated contexts and reinstating certain restrictions on where lambda expressions can occur in C++17.
- update the clang tablegen backend that uses these Enumerators, and add the relevant scope where needed.
llvm-svn: 299316
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
checkNestingOfRegions uses CancelRegion to determine whether cancel and
cancellation point are valid in the given nesting. This leads to unhelpful
diagnostics if CancelRegion is invalid. The given test case has produced:
region cannot be closely nested inside 'parallel' region
As a solution, introduce checkCancelRegion and call it first to get the
expected error:
one of 'for', 'parallel', 'sections' or 'taskgroup' is expected
Differential Revision: https://reviews.llvm.org/D30135
llvm-svn: 295808
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
parallel for'
https://reviews.llvm.org/D29922
This patch adds two fields for use in the implementation of 'distribute parallel for':
The increment expression for the distribute loop. As the chunk assigned to a team is executed by multiple threads within the 'parallel for' region, the increment expression has to correspond to the value returned by the related runtime call (for_static_init).
The upper bound of the innermost loop ('for' in 'distribute parallel for') is not the globalUB expression normally used for pragma 'for' when found in isolation. It is instead the upper bound of the chunk assigned to the team ('distribute' loop). In this way, we prevent teams from executing chunks assigned to other teams.
The use of these two fields can be seen in a related explanatory patch:
https://reviews.llvm.org/D29508
llvm-svn: 295497
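To make the two levels concrete, a sketch of a loop using the composite construct: 'distribute' hands each team a chunk of the iteration space, and 'parallel for' splits that chunk among the team's threads, so the inner loop's upper bound is the end of the team's chunk rather than `n`. The function `saxpy` is a hypothetical example, not code from the patch.

```cpp
#include <cassert>

// Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n).
void saxpy(float a, const float *x, float *y, int n) {
  assert(n >= 0);
  #pragma omp target teams distribute parallel for \
      map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```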
|
| |
|
|
|
|
|
|
|
|
|
| |
The thread_limit-clause on the combined directive applies to the
'teams' region of this construct. We modify the ThreadLimitClause
class to capture the clause expression within the 'target' region.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29087
llvm-svn: 293049
|
| |
|
|
|
|
|
|
|
|
|
| |
The num_teams-clause on the combined directive applies to the
'teams' region of this construct. We modify the NumTeamsClause
class to capture the clause expression within the 'target' region.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29085
llvm-svn: 293048
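A sketch of how the num_teams and thread_limit clauses appear on a 'target teams' directive. Both clause expressions are evaluated on the host before the region is offloaded, which is why they need to be captured at the 'target' level. The function `fill_on_teams` and its parameters are hypothetical.

```cpp
#include <cassert>

// Hypothetical example: writes 'value' into out[0..n) from a teams region.
void fill_on_teams(int *out, int n, int value, int teams, int limit) {
  assert(n >= 0 && teams > 0 && limit > 0);
  // 'teams' and 'limit' are host values used to configure the teams region.
  #pragma omp target teams num_teams(teams) thread_limit(limit) \
      map(from: out[0:n])
  #pragma omp distribute
  for (int i = 0; i < n; ++i)
    out[i] = value;
}
```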
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293005
|
| |
|
|
|
|
| |
patches.
llvm-svn: 293003
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293001
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The num_threads-clause on the combined directive applies to the
'parallel' region of this construct. We modify the NumThreadsClause
class to capture the clause expression within the 'target' region.
The offload runtime call for 'target parallel' is changed to
__tgt_target_teams() with 1 team and the number of threads set by
this clause or a default if none.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29082
llvm-svn: 292997
|