summaryrefslogtreecommitdiffstats
path: root/clang/test/OpenMP/nvptx_target_teams_codegen.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [OPENMP][NVPTX]Mark more functions as always_inline for betterAlexey Bataev2019-05-211-14/+23
| | | | | | | | | | | performance. Internally generated functions must be marked as always_inlines in most cases. Patch marks some extra reduction function + outlined parallel functions as always_inline for better performance, but only if the optimization is requested. llvm-svn: 361269
* [OPENMP][NVPTX]Use __kmpc_barrier_simple_spmd(nullptr, 0) instead ofAlexey Bataev2019-01-031-6/+6
| | | | | | | | | | nvvm_barrier0. Use runtime functions instead of the direct call to the nvvm intrinsics. It allows to prevent some dangerous LLVM optimizations, that breaks the code for the NVPTX target. llvm-svn: 350328
* [OpenMP] Add a new version of the SPMD deinit kernel functionGheorghe-Teodor Bercea2018-11-291-1/+1
| | | | | | | | | | | | | | Summary: This patch adds a new runtime for the SPMD deinit kernel function which replaces the previous function. The new function takes as argument the flag which signals whether the runtime is required or not. This enables the compiler to optimize out the part of the deinit function which are not needed. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D54970 llvm-svn: 347915
* [OPENMP][NVPTX]Extend number of constructs executed in SPMD mode.Alexey Bataev2018-11-091-1/+1
| | | | | | | | | If the statements between target|teams|distribute directives does not require execution in master thread, like constant expressions, null statements, simple declarations, etc., such construct can be xecuted in SPMD mode. llvm-svn: 346551
* [OPENMP][NVPTX] Add support for lightweight runtime.Alexey Bataev2018-08-291-3/+3
| | | | | | | | If the target construct can be executed in SPMD mode + it is a loop based directive with static scheduling, we can use lightweight runtime support. llvm-svn: 340953
* [OpenMP] Initialize data sharing stack for SPMD caseGheorghe-Teodor Bercea2018-07-131-0/+1
| | | | | | | | | | | | | | Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D49188 llvm-svn: 337015
* [OPENMP, NVPTX] Initial support for L2 parallelism in SPMD mode.Alexey Bataev2018-05-101-7/+34
| | | | | | | | Added initial support for L2 parallelism in SPMD mode. Note, though, that the orphaned parallel directives are not currently supported in SPMD mode. llvm-svn: 332016
* [OPENMP, NVPTX] Fix codegen for the teams reduction.Alexey Bataev2018-04-061-4/+4
| | | | | | | Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv as we operate on unsigned values only (addresses, converted to integers) llvm-svn: 329411
* [OpenMP] Remove implicit data sharing code gen that aims to use device ↵Gheorghe-Teodor Bercea2018-03-071-2/+2
| | | | | | | | | | | | | | | | shared memory Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43625 llvm-svn: 326948
* [OPENMP] Replace calls of getAssociatedStmt().Alexey Bataev2018-01-121-6/+2
| | | | | | | | | | | | | getAssociatedStmt() returns the outermost captured statement for the OpenMP directive. It may return incorrect region in case of combined constructs. Reworked the code to reduce the number of calls of getAssociatedStmt() and used getInnermostCapturedStmt() and getCapturedStmt() functions instead. In case of firstprivate variables it may lead to an extra allocas generation for private copies even if the variable is passed by value into outlined function and could be used directly as private copy. llvm-svn: 322393
* [OpenMP] Further adjustments of nvptx runtime functionsJonas Hahnfeld2017-12-271-2/+2
| | | | | | | | Pass in default value of 1, similar to previous commit r318836. Differential Revision: https://reviews.llvm.org/D41012 llvm-svn: 321486
* [OpenMP] Adjust arguments of nvptx runtime functionsJonas Hahnfeld2017-11-221-2/+2
| | | | | | | | | | | | | In the future the compiler will analyze whether the OpenMP runtime needs to be (fully) initialized and avoid that overhead if possible. The functions already take an argument to transfer that information to the runtime, so pass in the default value 1. (This is needed for binary compatibility with libomptarget-nvptx currently being upstreamed.) Differential Revision: https://reviews.llvm.org/D40354 llvm-svn: 318836
* [OPENMP] Codegen for `target teams` directive.Alexey Bataev2017-11-221-2/+6
| | | | | | Added codegen of the clauses for `target teams` directive. llvm-svn: 318834
* [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs ↵Gheorghe-Teodor Bercea2017-11-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using OpenMP device offloading Summary: This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team. This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are: - Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references; - Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained. A simple code snippet which illustrates an implicit data sharing situation is as follows: ``` #pragma omp target { // master thread only int v; #pragma omp parallel { // worker threads // use v } } ``` Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master. The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct. Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D38976 llvm-svn: 318773
* [OpenMP] Codegen support for 'target teams' on the NVPTX device.Arpith Chacko Jacob2017-01-261-0/+222
This is a simple patch to teach OpenMP codegen to emit the construct in Generic mode. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29143 llvm-svn: 293183
OpenPOWER on IntegriCloud