| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Add parsing, sema analysis and codegen for 'if' clause in 'cancel' directive.
llvm-svn: 247976
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if 'cancel' is found.
Patch improves codegen for OpenMP constructs. If the OpenMP region does not have internal 'cancel' construct, a call to 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has inner 'cancel' directive, then
```
if (__kmpc_cancel_barrier())
exit from outer construct;
```
code is generated.
Also, the code for 'canellation point' directive is not generated if parent directive does not have 'cancel' directive.
llvm-svn: 247681
|
|
|
|
|
|
|
|
| |
references.
Patch makes codegen to preserve alignment of the shared variables captured and used in OpenMP regions.
llvm-svn: 247401
|
|
|
|
|
|
|
|
|
| |
captured variables.
Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. It may result in some poor performance because of too complex analysis later in optimization passes. Patch makes to emit outlined functions for parallel-based regions with a list of captured variables. It reduces code for 2*n GEPs, stores and loads at least.
Codegen for task-based regions remains unchanged because runtime requires that all captured variables are passed in captured record.
llvm-svn: 247251
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
|
|
|
|
|
|
| |
Fix processing of shared variables with reference types in OpenMP constructs. Previously, if the variable was not marked in one of the private clauses, the reference to this variable was emitted incorrectly and caused an assertion later.
llvm-svn: 246846
|
|
|
|
|
|
| |
Fixed codegen for extended format of 'if' clauses with special 'directive-name-modifier' + ast-print tests for extended format of 'if' clause.
llvm-svn: 246748
|
|
|
|
|
|
|
|
|
| |
This replaces the filtered generic iterator with a type-specfic one based
on dyn_cast instead of comparing the kind enum. This allows us to use
range-based for loops and eliminates casts. No functionality change
intended.
llvm-svn: 246384
|
|
|
|
|
|
|
|
|
| |
Add emission of metadata for simd loops in presence of 'simdlen' clause.
If 'simdlen' clause is provided without 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are marked with '!llvm.mem.parallel_loop_access' metadata.
If 'simdlen' clause is provided along with 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
If 'safelen' clause is provided without 'simdlen' clause, the vectorizer width for the loop is set to value of 'safelen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
llvm-svn: 245697
|
|
|
|
|
|
| |
Add parsing/sema analysis for 'simdlen' clause in simd directives. Also add check that if both 'safelen' and 'simdlen' clauses are specified, the value of 'simdlen' parameter is less than the value of 'safelen' parameter.
llvm-svn: 245692
|
|
|
|
|
|
| |
OpenMP 4.1 allows to use variables with reference types in all private clauses (private, firstprivate, lastprivate, linear etc.). Patch allows to use such variables and fixes codegen for linear variables with reference types.
llvm-svn: 245268
|
|
|
|
|
|
|
| |
blender uses statements expression in condition of the loop under control of the '#pragma omp parallel for'. This condition is used several times in different expressions required for codegen of the loop directive. If there are some variables defined in statement expression, it fires an assert during codegen because of redefinition of the same variables.
We have to rebuild several expression to be sure that all variables are unique.
llvm-svn: 245041
|
|
|
|
|
|
|
|
|
| |
construct.
This is on behalf of Kelvin Li.
http://reviews.llvm.org/D11475
llvm-svn: 244569
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
float_cast_overflow is the only UBSan check without a source location attached.
This patch propagates SourceLocations where necessary to get them to the
EmitCheck() call.
Reviewers: rsmith, ABataev, rjmccall
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11757
llvm-svn: 244568
|
|
|
|
|
|
|
| |
This is committed on behalf of Kelvin Li
http://reviews.llvm.org/D11469?id=31227
llvm-svn: 244325
|
|
|
|
|
|
| |
OpenMP 4.1 allows to use variables with reference types in private clauses and, therefore, in init expressions of the cannonical loop forms.
llvm-svn: 244209
|
|
|
|
|
|
| |
if TLS is enabled in OpenMP code generation.
llvm-svn: 243277
|
|
|
|
|
|
|
| |
for OpenMP 4 target data directive parsing and sema.
This commit is on behalf of Kelvin Li.
llvm-svn: 242785
|
|
|
|
|
|
| |
Rename Vectorizer to Vectorize and VectorizeUnroll to InterleaveCount.
llvm-svn: 242241
|
|
|
|
|
|
|
|
|
|
| |
Add the next codegen for 'omp cancel' directive:
if (__kmpc_cancel()) {
__kmpc_cancel_barrier();
<exit construct>;
}
llvm-svn: 241429
|
|
|
|
|
|
|
|
|
|
| |
Generate the next code for 'cancellation point':
if (__kmpc_cancellationpoint()) {
__kmpc_cancel_barrier();
<exit construct>;
}
llvm-svn: 241336
|
|
|
|
|
|
|
|
|
| |
++range)‘ pattern to range for loops.
The pattern was born out of the lack of range-based for loops in C++98
and is somewhat obscure. No functionality change intended.
llvm-svn: 241300
|
|
|
|
|
|
| |
Implemented parsing/sema analysis + (de)serialization.
llvm-svn: 241253
|
|
|
|
|
|
|
|
|
|
| |
The next code is generated for this construct:
```
if (__kmpc_cancellationpoint(ident_t *loc, kmp_int32 global_tid, kmp_int32 cncl_kind) != 0)
<exit from outer innermost construct>;
```
llvm-svn: 241239
|
|
|
|
|
|
|
|
|
| |
default simd alignment.
Adds type trait "__builtin_omp_required_simd_align" after discussions here http://reviews.llvm.org/D9894
Differential Revision: http://reviews.llvm.org/D10597
llvm-svn: 241237
|
|
|
|
|
|
| |
Add parsing and sema analysis for 'omp cancellation point' directive.
llvm-svn: 241145
|
|
|
|
|
|
|
|
| |
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
|
|
|
|
|
|
| |
Parsing and sema analysis (without support for array sections in arguments) for 'depend' clause (used in 'task' directive, OpenMP 4.0).
llvm-svn: 240409
|
|
|
|
|
|
|
| |
Adds emission of the code for 'proc_bind(master|close|spread)' clause:
call void @__kmpc_push_proc_bind(<loc>, i32 thread_id, i32 4|3|2)
llvm-svn: 240018
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added parsing, sema analysis and codegen for '#pragma omp taskgroup' directive (OpenMP 4.0).
The code for directive is generated the following way:
#pragma omp taskgroup
<body>
void __kmpc_taskgroup(<loc>, thread_id);
<body>
void __kmpc_end_taskgroup(<loc>, thread_id);
llvm-svn: 240011
|
|
|
|
|
|
| |
Codegen for this directive is a combined codegen for 'omp parallel' region with 'omp for simd' region inside. Clauses are supported.
llvm-svn: 240006
|
|
|
|
|
|
| |
Added codegen for combined 'omp for simd' directives, that is a combination of 'omp for' directive followed by 'omp simd' directive. Includes support for all clauses.
llvm-svn: 239990
|
|
|
|
| |
llvm-svn: 239889
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following code is generated for reduction clause within 'omp simd' loop construct:
#pragma omp simd reduction(op:var)
for (...)
<body>
alloca priv_var
priv_var = <initial reduction value>;
<loop_start>:
<body> // references to original 'var' are replaced by 'priv_var'
<loop_end>:
var op= priv_var;
llvm-svn: 239881
|
|
|
|
|
|
| |
Added codegen for lastprivate clauses within simd loop-based directives.
llvm-svn: 239813
|
|
|
|
|
|
| |
Previously the last iteration for simd loop-based OpenMP constructs were generated as a separate code. This feature is not required and codegen is simplified.
llvm-svn: 239810
|
|
|
|
|
|
| |
Destroy RuntimeCleanupScope before generation of termination instruction in parallel loop precondition.
llvm-svn: 239524
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in CapturedStmt.
Reworked codegen for privates in tasks:
call @kmpc_omp_task_alloc();
...
call @kmpc_omp_task(task_proxy);
void map_privates(.privates_rec. *privs, type1 ** priv1_ref, ..., typen **privn_ref) {
*priv1_ref = &privs->private1;
...
*privn_ref = &privs->privaten;
ret void
}
i32 task_entry(i32 ThreadId, i32 PartId, void* privs, void (void*, ...) map_privates, shareds* captures) {
type1 **priv1;
...
typen **privn;
call map_privates(privs, priv1, ..., privn);
<Task body with priv1, .., privn instead of the captured variables>.
ret i32
}
i32 task_proxy(i32 ThreadId, kmp_task_t_with_privates *tt) {
call task_entry(ThreadId, tt->task_data.PartId, &tt->privates, map_privates, tt->task_data.shareds);
}
llvm-svn: 238010
|
|
|
|
|
|
| |
For parameters we shall take a derived type of parameters, not the original one.
llvm-svn: 237882
|
|
|
|
|
|
| |
If loop control variable in a worksharing construct is marked as lastprivate, we should copy last calculated value of private counter back to original variable.
llvm-svn: 237879
|
|
|
|
|
|
| |
loops with ordered clause must be generated the same way as dynamic loops, but with static scheduleing.
llvm-svn: 237788
|
|
|
|
|
|
| |
This modification generates proper copyin/initialization sequences for array variables/parameters. Before they were considered as pointers, not arrays.
llvm-svn: 237691
|
|
|
|
|
|
| |
In some rare cases shared copies of lastprivate/firstprivate variables were not updated after the loop directive.
llvm-svn: 237243
|
|
|
|
|
|
| |
'schedule' clause for combined directives requires additional processing. Special helper variable is generated, that is captured in the outlined parallel region for 'parallel for' region. This captured variable is used to store chunk expression from the 'schedule' clause in this 'parallel for' region.
llvm-svn: 237100
|
|
|
|
|
|
| |
Do not emit 'atomicrmw' instruction for simple atomic constructs with non-integer expressions.
llvm-svn: 236828
|
|
|
|
| |
llvm-svn: 236821
|
|
|
|
|
|
|
|
|
|
| |
dynamic/guided scheduling.
Inner bodies of OpenMP worksharing loop-based constructs with dynamic or guided scheduling are allowed to be marked with !llvm.mem.parallel_loop_access metadata for better optimization. Worksharing constructs with static scheduling cannot be marked this way (according to OpenMP standard "A data dependence between the same logical iterations in two such loops is guaranteed").
Constructs with auto and runtime scheduling are also not marked because automatically chosen scheduling may be static also.
Differential Revision: http://reviews.llvm.org/D9518
llvm-svn: 236693
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For tasks codegen for private/firstprivate variables are different rather than for other directives.
1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
Ty1 var1;
...
Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
.kmp_privates_t. privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
~Destructor(&tt->privates.var1);
...
~Destructor(&tt->privates.varn);
return 0;
}
4. Perform initialization of all firstprivate fields (by simple copying for POD data, copy constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);
CopyConstructor(new_task->privates.var1, *new_task->shareds.var1_ref);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
CopyConstructor(new_task->privates.varn, *new_task->shareds.varn_ref);
new_task->shareds.varn_ref = &new_task->privates.varn;
new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9370
llvm-svn: 236479
|
|
|
|
| |
llvm-svn: 236315
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For tasks codegen for private/firstprivate variables are different rather than for other directives.
1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
Ty1 var1;
...
Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
.kmp_privates_t. privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
~Destructor(&tt->privates.var1);
...
~Destructor(&tt->privates.varn);
return 0;
}
4. Perform default initialization of all private fields (no initialization for POD data, default constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);
DefaultConstructor(new_task->privates.var1);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
DefaultConstructor(new_task->privates.varn);
new_task->shareds.varn_ref = &new_task->privates.varn;
new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9322
llvm-svn: 236207
|