summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenMP] Initialize data sharing stack for SPMD caseGheorghe-Teodor Bercea2018-07-131-5/+15
| | | | | | | | | | | | | | Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D49188 llvm-svn: 337015
* [OPENMP, NVPTX] Do not globalize local variables in parallel regions.Alexey Bataev2018-07-091-10/+3
| | | | | | | | | | In generic data-sharing mode we are allowed to not globalize local variables that escape their declaration context iff they are declared inside of the parallel region. We can do this because L2 parallel regions are executed sequentially and, thus, we do not need to put shared local variables in the global memory. llvm-svn: 336567
* [OPENMP, NVPTX] Reduce the number of the globalized variables.Alexey Bataev2018-06-261-9/+43
| | | | | | | | | Patch tries to make better analysis of the variables that should be globalized. From now, instead of all parallel directives it will check only distribute parallel .. directives and check only for firstprivte/lastprivate variables if they must be globalized. llvm-svn: 335632
* [OPENMP, NVPTX] Fix reduction of the big data types/structures.Alexey Bataev2018-06-221-21/+115
| | | | | | | | If the shuffle is required for the reduced structures/big data type, current code may cause compiler crash because of the loading of the aggregate values. Patch fixes this problem. llvm-svn: 335377
* [OPENMP, NVPTX] Fix globalization of the variables passed to orphanedAlexey Bataev2018-06-211-43/+55
| | | | | | | | | | parallel region. If the current construct requires sharing of the local variable in the inner parallel region, this variable must be globalized to avoid runtime crash. llvm-svn: 335285
* [OPENMP, NVPTX] Emit simple reduction if requested.Alexey Bataev2018-06-181-0/+6
| | | | | | | If simple reduction is requested, use the simple reduction instead of the runtime functions calls. llvm-svn: 334962
* [OPENMP, NVPTX] Fixed codegen for orphaned parallel region.Alexey Bataev2018-05-251-25/+19
| | | | | | | | | | | | | | If orphaned parallel region is found, the next code must be emitted: ``` if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid)) Serialized execution. else if (IsMasterThread()) Prepare and signal worker. else Outined function call. ``` llvm-svn: 333301
* [OPENMP, NVPTX] Add check for SPMD mode in orphaned parallel directives.Alexey Bataev2018-05-161-6/+34
| | | | | | | | If the orphaned directive is executed in SPMD mode, we need to emit the check for the SPMD mode and run the orphaned parallel directive in sequential mode. llvm-svn: 332467
* [OPENMP, NVPTX] Do not globalize variables with reference/pointer types.Alexey Bataev2018-05-151-20/+19
| | | | | | | | In generic data-sharing mode we do not need to globalize variables/parameters of reference/pointer types. They already are placed in the global memory. llvm-svn: 332380
* [OPENMP, NVPTX] Do not use SPMD mode for target simd and target teamsAlexey Bataev2018-05-111-19/+13
| | | | | | | | | distribute simd directives. Directives `target simd` and `target teams distribute simd` must be executed in non-SPMD mode. llvm-svn: 332129
* [OPENMP, NVPTX] Initial support for L2 parallelism in SPMD mode.Alexey Bataev2018-05-101-59/+164
| | | | | | | | Added initial support for L2 parallelism in SPMD mode. Note, though, that the orphaned parallel directives are not currently supported in SPMD mode. llvm-svn: 332016
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-091-20/+20
| | | | | | | | | | | | | | | | | | | This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834
* [OPENMP, NVPTX] Codegen for critical construct.Alexey Bataev2018-05-071-0/+60
| | | | | | Added correct codegen for the critical construct on NVPTX devices. llvm-svn: 331652
* [OPENMP, NVPTX] Added support for L2 parallelism.Alexey Bataev2018-05-071-97/+346
| | | | | | | Added initial codegen for level 2, 3 etc. parallelism. Currently, all the second, the third etc. parallel regions will run sequentially. llvm-svn: 331642
* [OPENMP] Add support for reductions on simd directives in targetAlexey Bataev2018-05-021-11/+47
| | | | | | | | regions. Added codegen for `simd reduction()` constructs in target directives. llvm-svn: 331393
* [OPENMP] Emit names of the globals depending on target.Alexey Bataev2018-05-021-1/+2
| | | | | | | | Some symbols are not allowed to be used as names on some targets. Patch ries to unify the emission of the names of LLVM globals so they could be used on different targets. llvm-svn: 331358
* [OPENMP] Do not cast captured by value variables with pointer types inAlexey Bataev2018-04-231-1/+2
| | | | | | | | | | | NVPTX target. When generating the wrapper function for the offloading region, we need to call the outlined function and cast the arguments correctly to follow the ABI. Usually, variables captured by value are casted to `uintptr_t` type. But this should not performed for the variables with pointer type. llvm-svn: 330620
* [OPENMP] General code improvements.Alexey Bataev2018-04-161-132/+136
| | | | llvm-svn: 330154
* [OPENMP] Additional attributes for the pointer parameters.Alexey Bataev2018-04-101-0/+6
| | | | | | Added attributes for better optimization of the OpenMP code. llvm-svn: 329751
* [OPENMP, NVPTX] Fix codegen for the teams reduction.Alexey Bataev2018-04-061-25/+19
| | | | | | | Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv as we operate on unsigned values only (addresses, converted to integers) llvm-svn: 329411
* [OPENMP] Added emission of offloading data sections for declare targetAlexey Bataev2018-03-301-1/+16
| | | | | | | | | | variables. Added emission of the offloading data sections for the variables within declare target regions + fixes emission of the declare target variables marked as declare target not within the declare target region. llvm-svn: 328888
* [OpenMP][Clang] Add call to global data sharing stack initialization on the ↵Gheorghe-Teodor Bercea2018-03-221-0/+5
| | | | | | | | | | | | | | | | workers side Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D44749 llvm-svn: 328219
* [OPENMP, NVPTX] Codegen for target distribute parallel combinedAlexey Bataev2018-03-201-8/+46
| | | | | | | | | | constructs in generic mode. Fixed codegen for distribute parallel combined constructs. We have to pass and read the shared lower and upper bound from the distribute region in the inner parallel region. Patch is for generic mode. llvm-svn: 327990
* [OPENMP, NVPTX] Globalization of the private redeclarations.Alexey Bataev2018-03-201-77/+141
| | | | | | | | If the generic codegen is enabled and private copy of the original variable escapes the declaration context, this private copy should be globalized just like it was the original variable. llvm-svn: 327985
* [OPENMP, NVPTX] Reworked castToType() function, NFC.Alexey Bataev2018-03-191-27/+34
| | | | | | | Reworked function castToType to use more frontend functionality rather than the backend. llvm-svn: 327873
* [OPENMP] Fix build with MSVC, NFC.Alexey Bataev2018-03-191-2/+2
| | | | llvm-svn: 327868
* [OPENMP, NVPTX] Emit correct thread id.Alexey Bataev2018-03-191-24/+27
| | | | | | | We emitted fake thread id for the outined function in NVPTX codegen. Patch adds emission of the real thread id. llvm-svn: 327867
* Fix compilation warning introduced in r327654Mikael Holmen2018-03-161-1/+1
| | | | | | | | | | | | | | The compiler complained about ../tools/clang/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:184:15: error: unused variable 'CSI' [-Werror,-Wunused-variable] if (auto *CSI = CGF.CapturedStmtInfo) { ^ 1 error generated. I don't know this code but it seems like an easy fix so I push it anyway to get rid of the warning. llvm-svn: 327694
* [OPENMP, NVPTX] Improve globalization of the variables captured by value.Alexey Bataev2018-03-151-64/+155
| | | | | | | | | | | | | If the variable is captured by value and the corresponding parameter in the outlined function escapes its declaration context, this parameter must be globalized. To globalize it we need to get the address of the original parameter, load the value, store it to the global address and use this global address instead of the original. Patch improves globalization for parallel|teams regions + functions in declare target regions. llvm-svn: 327654
* [OpenMP] Add OpenMP data sharing infrastructure using global memoryGheorghe-Teodor Bercea2018-03-141-22/+582
| | | | | | | | | | | | | | | | | Summary: This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure. TODO: add a more detailed description. Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43660 llvm-svn: 327513
* [OpenMP] Remove implicit data sharing code gen that aims to use device ↵Gheorghe-Teodor Bercea2018-03-071-186/+26
| | | | | | | | | | | | | | | | shared memory Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43625 llvm-svn: 326948
* Pass a GlobalDecl to SetInternalFunctionAttributes. NFC.Rafael Espindola2018-02-281-6/+6
| | | | | | | | This just reduces the noise in a followup patch. Part of D43900. llvm-svn: 326385
* [OpenMP] Extend NVPTX SPMD implementation of combined constructsCarlo Bertolli2018-02-281-21/+12
| | | | | | | | Differential Revision: https://reviews.llvm.org/D43852 This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag. llvm-svn: 326368
* Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types.Sander de Smalen2018-02-031-1/+1
| | | | | | Fixed build issue when building with g++-4.8 (specialization after instantiation). llvm-svn: 324173
* Reverting patch rL323952 due to build errors that ISander de Smalen2018-02-011-1/+1
| | | | | | haven't encountered in local builds. llvm-svn: 323956
* [DebugInfo] Enable debug information for C99 VLA typesSander de Smalen2018-02-011-1/+1
| | | | | | | | | | | | | | | | | | | | | Summary: This patch enables debugging of C99 VLA types by generating more precise LLVM Debug metadata, using the extended DISubrange 'count' field that takes a DIVariable. This should implement: Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's) https://bugs.llvm.org/show_bug.cgi?id=30553 Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits Differential Revision: https://reviews.llvm.org/D41698 llvm-svn: 323952
* [OPENMP] Remove more empty SourceLocations() from the code.Alexey Bataev2018-01-231-18/+15
| | | | | | | Removed more empty SourceLocations() from the OpenMP code and replaced with the correct locations for better debug info emission. llvm-svn: 323232
* [OPENMP] Replace calls of getAssociatedStmt().Alexey Bataev2018-01-121-1/+1
| | | | | | | | | | | | | getAssociatedStmt() returns the outermost captured statement for the OpenMP directive. It may return incorrect region in case of combined constructs. Reworked the code to reduce the number of calls of getAssociatedStmt() and used getInnermostCapturedStmt() and getCapturedStmt() functions instead. In case of firstprivate variables it may lead to an extra allocas generation for private copies even if the variable is passed by value into outlined function and could be used directly as private copy. llvm-svn: 322393
* [OPENMP] Fix debug info for outlined functions in NVPTX + add more tests.Alexey Bataev2018-01-081-4/+4
| | | | | | | Fixed name of emitted outlined functions in NVPTX target + extra tests for the debug info. llvm-svn: 322022
* [OPENMP] Fix casting in NVPTX support library.Alexey Bataev2018-01-041-37/+49
| | | | | | | | If the reduction required shuffle in the NVPTX codegen, we may need to cast the reduced value to the integer type. This casting was implemented incorrectly and may cause compiler crash. Patch fixes this problem. llvm-svn: 321818
* [OPENMP] Add debug info for generated functions.Alexey Bataev2018-01-041-71/+81
| | | | | | | Most of the generated functions for the OpenMP were generated with disabled debug info. Patch fixes this for better user experience. llvm-svn: 321816
* [OpenMP] Further adjustments of nvptx runtime functionsJonas Hahnfeld2017-12-271-11/+20
| | | | | | | | Pass in default value of 1, similar to previous commit r318836. Differential Revision: https://reviews.llvm.org/D41012 llvm-svn: 321486
* [OpenMP] Add function attribute for triggering data sharing.Gheorghe-Teodor Bercea2017-12-121-0/+2
| | | | | | | | | | | | | | | | Summary: The backend should only emit data sharing code for the cases where it is needed. A new function attribute is used by Clang to enable data sharing only for the cases where OpenMP semantics require it and there are variables that need to be shared. Reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D41123 llvm-svn: 320527
* [OPENMP] Do not mark captured variables as artificial in debug info.Alexey Bataev2017-11-221-3/+11
| | | | | | | Captured variables should not be marked as artificial parameters in outlined functions in debug info. llvm-svn: 318843
* [OpenMP] Adjust arguments of nvptx runtime functionsJonas Hahnfeld2017-11-221-12/+20
| | | | | | | | | | | | | In the future the compiler will analyze whether the OpenMP runtime needs to be (fully) initialized and avoid that overhead if possible. The functions already take an argument to transfer that information to the runtime, so pass in the default value 1. (This is needed for binary compatibility with libomptarget-nvptx currently being upstreamed.) Differential Revision: https://reviews.llvm.org/D40354 llvm-svn: 318836
* [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs ↵Gheorghe-Teodor Bercea2017-11-211-23/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using OpenMP device offloading Summary: This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team. This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are: - Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references; - Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained. A simple code snippet which illustrates an implicit data sharing situation is as follows: ``` #pragma omp target { // master thread only int v; #pragma omp parallel { // worker threads // use v } } ``` Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master. The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct. Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D38976 llvm-svn: 318773
* [clang] Remove redundant return [NFC]Mandeep Singh Grang2017-11-131-1/+0
| | | | | | | | | | | | | | Reviewers: rsmith, sfantao, mcrosier Reviewed By: mcrosier Subscribers: jholewinski, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D39915 llvm-svn: 318074
* [OPENMP] Codegen for `#pragma omp target parallel for simd`.Alexey Bataev2017-11-091-0/+1
| | | | | | Added codegen for `#pragma omp target parallel for simd` and clauses. llvm-svn: 317813
* [OPENMP] Codegen for `#pragma omp target parallel for`.Alexey Bataev2017-11-081-0/+1
| | | | llvm-svn: 317719
* Convert clang::LangAS to a strongly typed enumAlexander Richardson2017-10-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Convert clang::LangAS to a strongly typed enum Currently both clang AST address spaces and target specific address spaces are represented as unsigned which can lead to subtle errors if the wrong type is passed. It is especially confusing in the CodeGen files as it is not possible to see what kind of address space should be passed to a function without looking at the implementation. I originally made this change for our LLVM fork for the CHERI architecture where we make extensive use of address spaces to differentiate between capabilities and pointers. When merging the upstream changes I usually run into some test failures or runtime crashes because the wrong kind of address space is passed to a function. By converting the LangAS enum to a C++11 we can catch these errors at compile time. Additionally, it is now obvious from the function signature which kind of address space it expects. I found the following errors while writing this patch: - ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address space to TargetInfo::getPointer{Width,Align}() - TypePrinter::printAttributedAfter() prints the numeric value of the clang AST address space instead of the target address space. However, this code is not used so I kept the current behaviour - initializeForBlockHeader() in CGBlocks.cpp was passing LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}() - CodeGenFunction::EmitBlockLiteral() was passing a AST address space to TargetInfo::getPointerWidth() - CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space to Qualifiers::addAddressSpace() - CGOpenMPRuntimeNVPTX::getParameterAddress() was using llvm::Type::getPointerTo() with a AST address space - clang_getAddressSpace() returns either a LangAS or a target address space. As this is exposed to C I have kept the current behaviour and added a comment stating that it is probably not correct. Other than this the patch should not cause any functional changes. Reviewers: yaxunl, pcc, bader Reviewed By: yaxunl, bader Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits Differential Revision: https://reviews.llvm.org/D38816 llvm-svn: 315871
OpenPOWER on IntegriCloud