summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenMP] Update target codegen for NVPTX device.Arpith Chacko Jacob2017-01-051-139/+129
| | | | | | | | | | | | | | | | | This patch includes updates for codegen of the target region for the NVPTX device. It moves initializers from the compiler to the runtime and updates the worker loop to assume parallel work is retrieved from the runtime. A subsequent patch will update the codegen to retrieve the parallel work using calls to the runtime. It includes the removal of the inline attribute for the worker loop and disabling debug info in it. This allows codegen for a target directive and serial execution on the NVPTX device. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28125 llvm-svn: 291121
* Reverting commit r290983 while debugging test failure on windows.Arpith Chacko Jacob2017-01-041-129/+139
| | | | llvm-svn: 290989
* [OpenMP] Update target codegen for NVPTX device.Arpith Chacko Jacob2017-01-041-139/+129
| | | | | | | | | | | | | | | | | This patch includes updates for codegen of the target region for the NVPTX device. It moves initializers from the compiler to the runtime and updates the worker loop to assume parallel work is retrieved from the runtime. A subsequent patch will update the codegen to retrieve the parallel work using calls to the runtime. It includes the removal of the inline attribute for the worker loop and disabling debug info in it. This allows codegen for a target directive and serial execution on the NVPTX device. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28125 llvm-svn: 290983
* [OpenMP] Code cleanup for NVPTX OpenMP codegenArpith Chacko Jacob2017-01-031-33/+31
| | | | | | | | | | | This patch cleans up private methods for NVPTX OpenMP codegen. It converts private members to static functions to follow the coding style of CGOpenMPRuntime.cpp and declutter the header file. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28124 llvm-svn: 290904
* Cleanup the handling of noinline function attributes, -fno-inline,Chandler Carruth2016-12-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A *lot* of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-effecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can *just* use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up tho optnone and related handling a bit. One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398
* [OPENMP] Fix for PR31417: assert failure when compiling trivial openmpAlexey Bataev2016-12-221-0/+4
| | | | | | | | | | program Offload related code is not quite ready yet, but some simple examples must not crash the compiler. Patch fixes the problem in offloading code with exceptions. llvm-svn: 290364
* [OPENMP] Codegen for teams directive for NVPTXCarlo Bertolli2016-04-041-0/+44
| | | | | | | | | | | | This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it: Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region. Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime. Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region. http://reviews.llvm.org/D17963 llvm-svn: 265304
* [OPENMP] Allow runtime insert its own code inside OpenMP regions.Alexey Bataev2016-03-291-13/+17
| | | | | | | | | | Solution unifies interface of RegionCodeGenTy type to allow insert runtime-specific code before/after main codegen action defined in CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target specific codegen. llvm-svn: 264700
* Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."Alexey Bataev2016-03-281-17/+13
| | | | | | Reverting because of failed tests. llvm-svn: 264577
* [OPENMP] Allow runtime insert its own code inside OpenMP regions.Alexey Bataev2016-03-281-13/+17
| | | | | | | | | | Solution unifies interface of RegionCodeGenTy type to allow insert runtime-specific code before/after main codegen action defined in CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target specific codegen. llvm-svn: 264576
* Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."Alexey Bataev2016-03-281-17/+13
| | | | | | This reverts commit 3ee791165100607178073f14531a0dc90c622b36. llvm-svn: 264570
* [OPENMP] Allow runtime insert its own code inside OpenMP regions.Alexey Bataev2016-03-281-13/+17
| | | | | | | | | | Solution unifies interface of RegionCodeGenTy type to allow insert runtime-specific code before/after main codegen action defined in CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target specific codegen. llvm-svn: 264569
* Fix warning about extra semicolon. NFC.Vasileios Kalintiris2016-03-221-1/+1
| | | | llvm-svn: 264035
* [OpenMP] Base support for target directive codegen on NVPTX device.Arpith Chacko Jacob2016-03-221-1/+327
| | | | | | | | | | | | | | Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 Reworked test case after buildbot failure on windows. Updated patch to integrate r263837 and test case nvptx_target_firstprivate_codegen.cpp. llvm-svn: 264018
* Revert r263783 as buildbot failure is being investigated.Arpith Chacko Jacob2016-03-181-322/+1
| | | | llvm-svn: 263784
* [OpenMP] Base support for target directive codegen on NVPTX device.Arpith Chacko Jacob2016-03-181-1/+322
| | | | | | | | | | | | | Summary: Reworked test case after buildbot failure on windows. This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263783
* Revert commit http://reviews.llvm.org/D17877 to fix tests on x86.Arpith Chacko Jacob2016-03-151-322/+1
| | | | llvm-svn: 263589
* [OpenMP] Base support for target directive codegen on NVPTX device.Arpith Chacko Jacob2016-03-151-1/+322
| | | | | | | | | | | Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263587
* Reverted http://reviews.llvm.org/D17877 to fix tests.Arpith Chacko Jacob2016-03-151-322/+1
| | | | llvm-svn: 263555
* [OpenMP] Base support for target directive codegen on NVPTX device.Arpith Chacko Jacob2016-03-151-1/+322
| | | | | | | | | | | Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263552
* [OPENMP 4.0] Codegen for 'declare reduction' construct.Alexey Bataev2016-03-041-0/+1
| | | | | | | Emit function for 'combiner' part of 'declare reduction' construct and 'initialilzer' part, if any. llvm-svn: 262699
* Re-apply for the 2nd-time r259977 - [OpenMP] Reorganize code to allow ↵Samuel Antao2016-02-081-0/+21
| | | | | | | | specialized code generation for different devices. This was reverted by r260036, but was not the cause of the problem in the buildbot. llvm-svn: 260106
* Revert "Re-apply r259977 - [OpenMP] Reorganize code to allow specialized ↵Renato Golin2016-02-071-21/+0
| | | | | | | | code generation for different devices." This reverts commit r259985, as it still fails one buildbot. llvm-svn: 260036
* Re-apply r259977 - [OpenMP] Reorganize code to allow specialized code ↵Samuel Antao2016-02-061-0/+21
| | | | | | | | generation for different devices. This was reverted due to a failure in a buildbot, but it turned out the failure was unrelated. llvm-svn: 259985
* Revert r259977 - [OpenMP] Reorganize code to allow specialized code ↵Samuel Antao2016-02-061-21/+0
| | | | | | | | generation for different devices. It triggered some problem in the configuration related with zlib and exposed in the driver. llvm-svn: 259984
* [OpenMP] Reorganize code to allow specialized code generation for different ↵Samuel Antao2016-02-061-0/+21
devices. Summary: Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation. This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it. In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done. In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not. Let me know comments suggestions you may have. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski Differential Revision: http://reviews.llvm.org/D16784 llvm-svn: 259977
OpenPOWER on IntegriCloud