| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Most of the generated functions for the OpenMP were generated with
disabled debug info. Patch fixes this for better user experience.
llvm-svn: 321816
|
|
|
|
|
|
|
|
|
|
| |
distribute parallel for' on host
https://reviews.llvm.org/D41709
This patch includes code generation and testing for offloading when target device is host.
llvm-svn: 321759
|
|
|
|
|
|
|
|
|
| |
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
|
|
|
|
|
|
| |
Added codegen for `depend` clauses on `target data update` directives.
llvm-svn: 321493
|
|
|
|
|
|
| |
Added codegen for the `nowait` clause in target data constructs.
llvm-svn: 320717
|
|
|
|
|
|
| |
Added basic codegen for `nowait` clauses in target-based directives.
llvm-svn: 320613
|
|
|
|
|
|
|
| |
Host + generic device codegen for `target teams distribute simd`
directive.
llvm-svn: 320608
|
|
|
|
|
|
|
| |
OpenMP 5.0 added support for `reduction` clause in target-based
directives. Patch adds this support to clang.
llvm-svn: 320596
|
|
|
|
|
|
| |
Host + default devices codegen for `target teams distribute` directive.
llvm-svn: 320149
|
|
|
|
|
|
|
|
|
|
|
|
| |
The adjustment is calculated with CreatePtrDiff() which returns
the difference in (base) elements. This is passed to CreateGEP()
so make sure that the GEP base has the correct pointer type:
It needs to be a pointer to the base type, not a pointer to a
constant sized array.
Differential Revision: https://reviews.llvm.org/D40911
llvm-svn: 319931
|
|
|
|
|
|
|
|
| |
Previously we emitted `__tgt_target_teams` only for standalone teams
directives. This patch allows emit this function for all teams-based
directives.
llvm-svn: 319585
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes the following failures uncovered by D39245:
Clang :: OpenMP/task_firstprivate_codegen.cpp
Clang :: OpenMP/task_private_codegen.cpp
Clang :: OpenMP/taskloop_firstprivate_codegen.cpp
Clang :: OpenMP/taskloop_lastprivate_codegen.cpp
Clang :: OpenMP/taskloop_private_codegen.cpp
Clang :: OpenMP/taskloop_simd_firstprivate_codegen.cpp
Clang :: OpenMP/taskloop_simd_lastprivate_codegen.cpp
Clang :: OpenMP/taskloop_simd_private_codegen.cpp
Reviewers: rjmccall, ABataev, AndreyChurbanov
Reviewed By: rjmccall, ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D39947
llvm-svn: 319222
|
|
|
|
|
|
|
|
|
| |
constructs, NFC.
Improved handling of cancel|cancellation point directives inside
target-based for directives.
llvm-svn: 319046
|
|
|
|
|
|
|
|
|
|
|
| |
signatures, clang-side
This clang patch changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits
Differential revision: https://reviews.llvm.org/D40281
llvm-svn: 318789
|
|
|
|
| |
llvm-svn: 318578
|
|
|
|
|
|
| |
Added codegen support for `target simd` directive.
llvm-svn: 318536
|
|
|
|
|
|
|
|
| |
directive.
Added missed support for cancelling of target parallel for construct.
llvm-svn: 318434
|
|
|
|
|
|
| |
Added codegen for `#pragma omp target parallel for simd` and clauses.
llvm-svn: 317813
|
|
|
|
| |
llvm-svn: 317719
|
|
|
|
|
|
|
|
|
|
| |
This patch renames some of the flag names of the clang/libomptarget map interface. The old names are slightly misleading, whereas the new ones describe in a better way what each flag is about.
Only the macros within the enumeration are renamed, there is no change in functionality therefore there are no updated regression tests.
Differential Revision: https://reviews.llvm.org/D39745
llvm-svn: 317598
|
|
|
|
|
|
|
| |
The compiler may crash under some conditions if the getInvokeDest() is
used, but later it is not used. Fixed this problem in OpenMP.
llvm-svn: 317227
|
|
|
|
|
|
|
| |
If the thread id is requested in windows mode within funclets, we may
generate incorrect function call that could lead to broken codegen.
llvm-svn: 317208
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes various places in clang to propagate may-alias
TBAA access descriptors during construction of lvalues, thus
eliminating the need for the LValueBaseInfo::MayAlias flag.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D39008
llvm-svn: 316988
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D39177
llvm-svn: 316896
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
This relands commit r316229 that I reverted in r316235 because it
failed on some bots. During investigation I found that this was because
Clang and GCC evaluate the two arguments to emplace_back() in
ReductionCodeGen::emitSharedLValue() in a different order, hence
leading to a different order of generated instructions in the final
LLVM IR. Fix this by passing in the arguments from temporary variables
that are evaluated in a defined order.
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316362
|
|
|
|
|
|
|
|
|
|
| |
This breaks at least two buildbots:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/1175
http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/10478
This reverts commit r316229 during local investigation.
llvm-svn: 316235
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316229
|
|
|
|
|
|
|
|
|
| |
reduction.
If the reduction is an array or an array section and reduction operation
is declare reduction without initializer, it may lead to crash.
llvm-svn: 315611
|
|
|
|
|
|
|
|
|
|
|
| |
in C.
If we try to get the lvalue for thread_id variables in inlined regions,
we did not use the correct version of function. Fixed this bug by adding
overrided version of the function getThreadIDVariableLValue for inlined
regions.
llvm-svn: 315578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables explicit generation of TBAA information in all
cases where LValue base info is propagated or constructed in
non-trivial ways. Eventually, we will consider each of these
cases to make sure the TBAA information is correct and not too
conservative. For now, we just fall back to generating TBAA info
from the access type.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38733
llvm-svn: 315575
|
|
|
|
| |
llvm-svn: 315467
|
|
|
|
|
|
|
|
| |
If both taskloop and task directives are used at the same time in one
program, we may ran into the situation when the particular type for task
directive is reused for taskloop directives. Patch fixes this problem.
llvm-svn: 315464
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Besides obvious code simplification, avoiding explicit creation
of LValueBaseInfo objects makes it easier to make TBAA
information to be part of such objects.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38695
llvm-svn: 315289
|
|
|
|
| |
llvm-svn: 315196
|
|
|
|
| |
llvm-svn: 315185
|
|
|
|
|
|
|
| |
Simplified and generalized codegen for non-offloading part that works if
offloading is failed or condition of the `if` clause is `false`.
llvm-svn: 314670
|
|
|
|
|
|
|
|
|
|
| |
directives.
If the variable is used in the target-based region but is not found in
any private|mapping clause, then generate implicit firstprivate|map
clauses for these implicitly mapped variables.
llvm-svn: 314205
|
|
|
|
|
|
|
|
|
|
|
|
| |
__kmpc_for_static_fini().
Added special flags for calls of __kmpc_for_static_fini(), like previous
ly for __kmpc_for_static_init(). Added flag OMP_IDENT_WORK_DISTRIBUTE
for distribute cnstruct, OMP_IDENT_WORK_SECTIONS for sections-based
constructs and OMP_IDENT_WORK_LOOP for loop-based constructs in
location flags.
llvm-svn: 312642
|
|
|
|
|
|
|
|
|
|
| |
move constructor.
Previously user-defined reduction initializer was considered as an
assignment expression, not as initializer. Fixed this by treating the
initializer expression as an initializer.
llvm-svn: 312638
|
|
|
|
|
|
|
|
|
|
|
| |
If exceptions are enabled, there may be a problem with the codegen of
the finalization functions from OpenMP runtime. It happens because of
the problem with the getting of thread identifier value. Patch tries to
fix it by using the result of the call of function
__kmpc_global_thread_num() rather than loading of value of outlined
function parameter.
llvm-svn: 311007
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__kmpc_for_static_init().
OpenMP 5.0 will include OpenMP Tools interface that requires distinguishing different worksharing constructs.
Since the same entry point (__kmp_for_static_init(ident_t *loc,
kmp_int32 global_tid,........)) is called in case static
loop/sections/distribute it is suggested using 'flags' field of the
ident_t structure to pass the type of the construct.
llvm-svn: 310865
|
|
|
|
|
|
|
| |
After some changes in clang/LLVM debug info for task-based regions was
not generated at all. Patch fixes this problem.
llvm-svn: 310850
|
|
|
|
|
|
| |
General improvement of the outlined functions calls.
llvm-svn: 310840
|
|
|
|
|
|
|
|
|
|
|
| |
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310387
|
|
|
|
| |
llvm-svn: 310098
|
|
|
|
|
|
| |
Adds codegen for taskloop-based directives.
llvm-svn: 308174
|
|
|
|
|
|
|
| |
Reworked codegen for reduction clauses for future support of reductions
in task-based directives.
llvm-svn: 307910
|
|
|
|
|
|
|
|
|
| |
If taskloop directive has no associated nogroup clause, it must emitted
inside implicit taskgroup block. Runtime supports it, but we need to
generate implicit taskgroup block explicitly to support future
reductions codegen.
llvm-svn: 307822
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, if the some of the parameters are captured by value, this
argument is converted to uintptr_t type and thus we loosing the debug
info about real type of the argument (captured variable):
```
void @.outlined_function.(uintptr %par);
...
%a = alloca i32
%a.casted = alloca uintptr
%cast = bitcast uintptr* %a.casted to i32*
%a.val = load i32, i32 *%a
store i32 %a.val, i32 *%cast
%a.casted.val = load uintptr, uintptr* %a.casted
call void @.outlined_function.(uintptr %a.casted.val)
...
```
To resolve this problem, in debug mode a speciall external wrapper
function is generated, that calls the outlined function with the correct
parameters types:
```
void @.wrapper.(uintptr %par) {
%a = alloca i32
%cast = bitcast i32* %a to uintptr*
store uintptr %par, uintptr *%cast
%a.val = load i32, i32* %a
call void @.outlined_function.(i32 %a)
ret void
}
void @.outlined_function.(i32 %par);
...
%a = alloca i32
%a.casted = alloca uintptr
%cast = bitcast uintptr* %a.casted to i32*
%a.val = load i32, i32 *%a
store i32 %a.val, i32 *%cast
%a.casted.val = load uintptr, uintptr* %a.casted
call void @.wrapper.(uintptr %a.casted.val)
...
```
llvm-svn: 306697
|
|
|
|
| |
llvm-svn: 306419
|