summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CGOpenMPRuntime.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [OPENMP] parsing and sema support for 'close' map-type-modifierKelvin Li2018-12-181-32/+33
| | | | | | | | | | | | A map clause with the close map-type-modifier is a hint to prefer that the variables are mapped using a copy into faster memory. Patch by Ahsan Saghir (saghir) Differential Revision: https://reviews.llvm.org/D55719 llvm-svn: 349551
* [OPENMP][NVPTX]Mark __kmpc_barrier functions as convergent.Alexey Bataev2018-12-041-7/+12
| | | | | | | | __kmpc_barrier runtime functions must be marked as convergent to prevent some dangerous optimizations. Also, for NVPTX target all barriers must be emitted as simple barriers. llvm-svn: 348271
* Revert "Revert r347417 "Re-Reinstate 347294 with a fix for the failures.""Fangrui Song2018-11-301-4/+7
| | | | | | | | | It seems the two failing tests can be simply fixed after r348037 Fix 3 cases in Analysis/builtin-functions.cpp Delete the bad CodeGen/builtin-constant-p.c for now llvm-svn: 348053
* Revert r347417 "Re-Reinstate 347294 with a fix for the failures."Fangrui Song2018-11-301-7/+4
| | | | | | | | | | Kept the "indirect_builtin_constant_p" test case in test/SemaCXX/constant-expression-cxx1y.cpp while we are investigating why the following snippet fails: extern char extern_var; struct { int a; } a = {__builtin_constant_p(extern_var)}; llvm-svn: 348039
* Re-commit r347417 "Re-Reinstate 347294 with a fix for the failures."Hans Wennborg2018-11-281-4/+7
| | | | | | | This was reverted in r347656 due to me thinking it caused a miscompile of Chromium. Turns out it was the Chromium code that was broken. llvm-svn: 347756
* Revert r347417 "Re-Reinstate 347294 with a fix for the failures."Hans Wennborg2018-11-271-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This caused a miscompile in Chrome (see crbug.com/908372) that's illustrated by this small reduction: static bool f(int *a, int *b) { return !__builtin_constant_p(b - a) || (!(b - a)); } int arr[] = {1,2,3}; bool g() { return f(arr, arr + 3); } $ clang -O2 -S -emit-llvm a.cc -o - g() should return true, but after r347417 it became false for some reason. This also reverts the follow-up commits. r347417: > Re-Reinstate 347294 with a fix for the failures. > > Don't try to emit a scalar expression for a non-scalar argument to > __builtin_constant_p(). > > Third time's a charm! r347446: > The result of is.constant() is unsigned. r347480: > A __builtin_constant_p() returns 0 with a function type. r347512: > isEvaluatable() implies a constant context. > > Assume that we're in a constant context if we're asking if the expression can > be compiled into a constant initializer. This fixes the issue where a > __builtin_constant_p() in a compound literal was diagnosed as not being > constant, even though it's always possible to convert the builtin into a > constant. r347531: > A "constexpr" is evaluated in a constant context. Make sure this is reflected > if a __builtin_constant_p() is a part of a constexpr. llvm-svn: 347656
* [OPENMP][NVPTX]Emit default locations as constant with undefined mode.Alexey Bataev2018-11-211-8/+10
| | | | | | | | | | | | For the NVPTX target default locations should be emitted as constants + additional info must be emitted in the reserved_2 field of the ident_t structure. The 1st bit controls the execution mode and the 2nd bit controls use of the lightweight runtime. The combination of the bits for Non-SPMD mode + lightweight runtime represents special undefined mode, used outside of the target regions for orphaned directives or functions. Should allow and additional optimization inside of the target regions. llvm-svn: 347425
* Re-Reinstate 347294 with a fix for the failures.Bill Wendling2018-11-211-4/+7
| | | | | | | | | Don't try to emit a scalar expression for a non-scalar argument to __builtin_constant_p(). Third time's a charm! llvm-svn: 347417
* Revert r347364 again, the fix was incomplete.Nico Weber2018-11-211-7/+4
| | | | llvm-svn: 347389
* Reinstate 347294 with a fix for the failures.Bill Wendling2018-11-201-4/+7
| | | | | | | EvaluateAsInt() is sometimes called in a constant context. When that's the case, we need to specify it as so. llvm-svn: 347364
* [OPENMP]Make lambda mapping follow reqs for PTR_AND_OBJ mapping.Alexey Bataev2018-11-081-17/+25
| | | | | | | | The base pointer for the lambda mapping must point to the lambda capture placement and pointer must point to the captured variable itself. Patch fixes this problem. llvm-svn: 346408
* [OPENMP]Fix handling of the globals during compilation for the device.Alexey Bataev2018-11-071-57/+82
| | | | | | | | Fixed lookup for the target regions in unused virtual functions + fixed processing of the global variables not marked as declare target but emitted during debug info emission. llvm-svn: 346343
* [OPENMP]Change the mapping type for lambda captures.Alexey Bataev2018-11-021-3/+3
| | | | | | The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`. llvm-svn: 345982
* [OPENMP] Support for mapping of the lambdas in target regions.Alexey Bataev2018-10-301-0/+84
| | | | | | | | | | Added support for mapping of lambdas in the target regions. It scans all the captures by reference in the lambda, implicitly maps those variables in the target region and then later reinstate the addresses of references in lambda to the correct addresses of the captured|privatized variables. llvm-svn: 345609
* [OpenMP][NVPTX] Use single loops when generating code for distribute ↵Gheorghe-Teodor Bercea2018-10-291-0/+12
| | | | | | | | | | | | | | | | parallel for Summary: This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53448 llvm-svn: 345509
* Do not always request an implicit taskgroup region inside the kmpc_taskloop ↵Alexey Bataev2018-10-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | function Summary: For the following code: ``` int i; #pragma omp taskloop for (i = 0; i < 100; ++i) {} #pragma omp taskloop nogroup for (i = 0; i < 100; ++i) {} ``` Clang emits the following LLVM IR: ``` ... call void @__kmpc_taskgroup(%struct.ident_t* @0, i32 %0) %2 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*)) ... call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %2, i32 1, i64* %8, i64* %9, i64 %13, i32 0, i32 0, i64 0, i8* null) call void @__kmpc_end_taskgroup(%struct.ident_t* @0, i32 %0) ... %15 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*)) ... call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %15, i32 1, i64* %21, i64* %22, i64 %26, i32 0, i32 0, i64 0, i8* null) ``` The first set of instructions corresponds to the first taskloop construct. It is important to note that the implicit taskgroup region associated with the taskloop construct has been materialized in our IR: the `__kmpc_taskloop` occurs inside a taskgroup region. Note also that this taskgroup region does not exist in our second taskloop because we are using the `nogroup` clause. The issue here is the 4th argument of the kmpc_taskloop call, starting from the end, is always a zero. Checking the LLVM OpenMP RT implementation, we see that this argument corresponds to the nogroup parameter: ``` void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int sched, kmp_uint64 grainsize, void *task_dup); ``` So basically we always tell to the RT to do another taskgroup region. For the first taskloop, this means that we create two taskgroup regions. For the second example, it means that despite the fact we had a nogroup clause we are going to have a taskgroup region, so we unnecessary wait until all descendant tasks have been executed. Reviewers: ABataev Reviewed By: ABataev Subscribers: rogfer01, cfe-commits Differential Revision: https://reviews.llvm.org/D53636 llvm-svn: 345180
* [OPENMP] Fix emission of the __kmpc_global_thread_num.Alexey Bataev2018-10-051-4/+35
| | | | | | | | | Fixed emission of the __kmpc_global_thread_num() so that it is not messed up with alloca instructions anymore. Plus, fixes emission of the __kmpc_global_thread_num() functions in the target outlined regions so that they are not called before runtime is initialized. llvm-svn: 343856
* [OPENMP] Add support for OMP5 requires directive + unified_address clauseKelvin Li2018-09-261-0/+2
| | | | | | | | | Add support for OMP5.0 requires directive and unified_address clause. Patches to follow will include support for additional clauses. Differential Revision: https://reviews.llvm.org/D52359 llvm-svn: 343063
* Make compare function in r342648 have strict weak ordering.Richard Trieu2018-09-211-2/+9
| | | | | | | Comparison functions used in sorting algorithms need to have strict weak ordering. Remove the assert and allow comparisons on all lists. llvm-svn: 342774
* [NFC] remove unused variableJF Bastien2018-09-211-1/+0
| | | | | | It was causing a warning. llvm-svn: 342750
* [OPENMP] Add support for mapping memory pointed by member pointer.Alexey Bataev2018-09-201-11/+259
| | | | | | Added support for map(s, s.ptr[0:1]) kind of mapping. llvm-svn: 342648
* [OPENMP] Fix PR38903: Crash on instantiation of the non-dependentAlexey Bataev2018-09-131-19/+12
| | | | | | | | | | | declare reduction. If the declare reduction construct with the non-dependent type is defined in the template construct, the compiler might crash on the template instantition. Reworked the whole instantiation scheme for the declare reduction constructs to fix this problem correctly. llvm-svn: 342151
* [OPENMP] Do not create offloading entry for declare target variablesAlexey Bataev2018-08-291-1/+9
| | | | | | | | | declarations. We should not create offloading entries for declare target var declarations as it causes compiler crash. llvm-svn: 340968
* [OPENMP] Create non-const ident_t objects.Mike Rice2018-08-291-12/+13
| | | | | | | | | | | Currently ident_t objects are created const when debug info is not enabled, but the libittnotify libray in the OpenMP runtime writes to the reserved_2 field (See __kmp_itt_region_forking in openmp/runtime/src/kmp_itt.inl). Now create ident_t objects non-const. Differential Revision: https://reviews.llvm.org/D51331 llvm-svn: 340934
* [OPENMP] FIx processing of declare target variables.Alexey Bataev2018-08-151-6/+9
| | | | | | | | The compiler may produce unexpected error messages/crashes when declare target variables were used. Patch fixes problems with the declarations marked as declare target to or link. llvm-svn: 339805
* [OPENMP] Fix processing of declare target construct.Alexey Bataev2018-08-141-28/+10
| | | | | | | The attribute marked as inheritable since OpenMP 5.0 supports it + additional fixes to support new functionality. llvm-svn: 339704
* [OPENMP] Fix emission of the loop doacross constructs.Alexey Bataev2018-08-131-32/+60
| | | | | | | The number of loops associated with the OpenMP loop constructs should not be considered as the number loops to collapse. llvm-svn: 339603
* Revert "[OPENMP] Fix emission of the loop doacross constructs."Alexey Bataev2018-08-131-60/+32
| | | | | | This reverts commit r339568 because of the problems with the buildbots. llvm-svn: 339574
* [OPENMP] Fix emission of the loop doacross constructs.Alexey Bataev2018-08-131-32/+60
| | | | | | | The number of loops associated with the OpenMP loop constructs should not be considered as the number loops to collapse. llvm-svn: 339568
* Port getLocEnd -> getEndLocStephen Kelly2018-08-091-1/+1
| | | | | | | | | | Reviewers: teemperor! Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D50351 llvm-svn: 339386
* Port getLocStart -> getBeginLocStephen Kelly2018-08-091-12/+12
| | | | | | | | | | Reviewers: teemperor! Subscribers: jholewinski, whisperity, jfb, cfe-commits Differential Revision: https://reviews.llvm.org/D50350 llvm-svn: 339385
* [OPENMP] Mark variables captured in declare target region as implicitlyAlexey Bataev2018-08-071-1/+18
| | | | | | | | | declare target. According to OpenMP 5.0, variables captured in lambdas in declare target regions must be considered as implicitly declare target. llvm-svn: 339152
* [OpenMP] Encode offload target triples into comdat key for offload ↵Sergey Dmitriev2018-08-031-1/+13
| | | | | | | | | | | | initialization code Encoding offload target triples onto comdat group key for offload initialization code guarantees that it will be executed once per each unique combination of offload targets. Differential Revision: https://reviews.llvm.org/D50218 llvm-svn: 338916
* [OPENMP] Change linkage of offloading symbols to support droppingAlexey Bataev2018-07-311-2/+4
| | | | | | | | offload targets. Changed the linkage of omp_offloading.img_start.<triple> and omp_offloading.img_end.<triple> symbols from external to external weak to allow dropping of some targets during linking. llvm-svn: 338413
* [OPENMP] Prevent problems with linking of the static variables.Alexey Bataev2018-07-311-0/+13
| | | | | | No need to change the linkage, we can avoid the problem using special variable. That points to the original variable and, thus, prevent some of the optimizations that might break the compilation. llvm-svn: 338399
* [OPENMP] ThreadId in serialized parallel regions is 0.Alexey Bataev2018-07-251-2/+2
| | | | | | | | The first argument for the parallel outlined functions, called as serialized parallel regions, should be a pointer to the global thread id that always is 0. llvm-svn: 337957
* Fix unused variable warning.Erich Keane2018-07-191-1/+1
| | | | llvm-svn: 337473
* The patch adds support for the new map interface between clang and ↵Alexey Bataev2018-07-191-293/+492
| | | | | | | | | | | libomptarget. The changes in the interface are the following: device IDs are now 64-bit integers (as opposed to 32-bit) map flags are 64-bit long (used to be 32-bit) mappings for partially mapped structs are now calculated at compile time and members of partially mapped structs are flagged using the MEMBER_OF field Support for is_device_ptr on struct members was dropped - this functionality is not supported by the OpenMP standard and its implementation is technically infeasible (however, use_device_ptr on struct members works as a non-standard extension of the compiler) llvm-svn: 337468
* [OPENMP] Fix checks for declare target link entries.Alexey Bataev2018-07-161-13/+37
| | | | | | | | If the declare target link entries are created but not used, the compiler will produce an error message. Patch improves handling of such situations + improves checks for possibly lost declare target variables. llvm-svn: 337207
* [OPENMP] Fix syntactic errors in error messages.Alexey Bataev2018-07-161-2/+2
| | | | | | Fixed spelling of the offloading error messages. llvm-svn: 337196
* Diagnose missing 'template' keywords in more cases.Richard Smith2018-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | We track when we see a name-shaped expression followed by a '<' token and parse the '<' as a comparison. Then: * if we see a token sequence that cannot possibly be an expression but can be a template argument (in particular, a type-id) that follows either a ',' or the '<', diagnose that the '<' was supposed to start a template argument list, and * if we see '>()', diagnose that the '<' was supposed to start a template argument list. This only changes the diagnostic for error cases, and in practice appears to catch the most common cases where a missing 'template' keyword leads to parse errors within a template. Differential Revision: https://reviews.llvm.org/D48571 llvm-svn: 335687
* Move helper classes into anonymous namespaces. NFCI.Benjamin Kramer2018-05-151-3/+4
| | | | llvm-svn: 332400
* [OPENMP] Generate unique names for offloading regions id.Alexey Bataev2018-05-091-1/+1
| | | | | | | It is required to emit unique names for offloading regions ids. Required to support compilation and linking of several compilation units. llvm-svn: 331899
* [OPENMP] Mark global tors/dtors as used.Alexey Bataev2018-05-091-0/+2
| | | | | | | | | | | If the global variables are marked as declare target and they need ctors/dtors, these ctors/dtors are emitted and then invoked by the offloading runtime library. They are not explicitly used in the emitted code and thus can be optimized out. Patch marks these functions as used, so the optimizer cannot remove these function during the optimization phase. llvm-svn: 331879
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-091-94/+94
| | | | | | | | | | | | | | | | | | | This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834
* [OPENMP, NVPTX] Fix linkage of the global entries.Alexey Bataev2018-05-081-5/+5
| | | | | | | The linkage of the global entries must be weak to enable support of redefinition of the same target regions in multiple compilation units. llvm-svn: 331768
* [OPENMP, NVPTX] Added support for L2 parallelism.Alexey Bataev2018-05-071-7/+0
| | | | | | | Added initial codegen for level 2, 3 etc. parallelism. Currently, all the second, the third etc. parallel regions will run sequentially. llvm-svn: 331642
* [OPENMP] Support C++ member functions in the device constructs.Alexey Bataev2018-05-021-10/+3
| | | | | | | Added correct emission of the C++ member functions for the device function when they are used in the device constructs. llvm-svn: 331365
* [OPENMP] Emit names of the globals depending on target.Alexey Bataev2018-05-021-72/+118
| | | | | | | | Some symbols are not allowed to be used as names on some targets. Patch ries to unify the emission of the names of LLVM globals so they could be used on different targets. llvm-svn: 331358
* [OPENMP] Emit template instatiation|specialization functions forAlexey Bataev2018-05-011-1/+10
| | | | | | | | | | devices. If the function is an instantiation|specialization of the template and is used in the device code, the definitions of such functions should be emitted for the device. llvm-svn: 331261
OpenPOWER on IntegriCloud