diff options
Diffstat (limited to 'libstdc++-v3/doc/html/ext/parallel_mode.html')
-rw-r--r-- | libstdc++-v3/doc/html/ext/parallel_mode.html | 593 |
1 files changed, 593 insertions, 0 deletions
diff --git a/libstdc++-v3/doc/html/ext/parallel_mode.html b/libstdc++-v3/doc/html/ext/parallel_mode.html new file mode 100644 index 00000000000..7ca8dbe937c --- /dev/null +++ b/libstdc++-v3/doc/html/ext/parallel_mode.html @@ -0,0 +1,593 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!DOCTYPE html + PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> + +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> +<head> + <meta name="AUTHOR" content="bkoz@gcc.gnu.org (Benjamin Kosnik)" /> + <meta name="KEYWORDS" content="c++, libstdc++, gdb, g++, debug" /> + <meta name="DESCRIPTION" content="The libstdc++ parallel mode." /> + <meta name="GENERATOR" content="emacs and ten fingers" /> + <title>The libstdc++ parallel mode</title> +<link rel="StyleSheet" href="lib3styles.css" type="text/css" /> +<link rel="Copyright" href="17_intro/license.html" type="text/html" /> +</head> +<body> + +<h1 class="centered"><a name="top">The libstdc++ parallel mode</a></h1> + +<p class="fineprint"><em> + The latest version of this document is always available at + <a href="http://gcc.gnu.org/onlinedocs/libstdc++/parallel_mode.html"> + http://gcc.gnu.org/onlinedocs/libstdc++/parallel_mode.html</a>. +</em></p> + +<p><em> + To the <a href="http://gcc.gnu.org/libstdc++/">libstdc++ homepage</a>. +</em></p> + +<!-- ####################################################### --> +<hr /> +<p> The libstdc++ parallel mode is an experimental parallel +implementation of many algorithms the C++ Standard Library. +</p> + +<p> +Several of the standard algorithms, for instance +<code>std::sort</code>, are made parallel using OpenMP +annotations. These parallel mode constructs and can be invoked by +explicit source declaration or by compiling existing sources with a +specific compiler flag. +</p> + +<h3 class="left"><a name="parallel">The libstdc++ parallel mode</a></h3> + +<p>The libstdc++ parallel mode performs parallelization of algorithms, +function objects, classes, and functions in the C++ Standard.</p> + +<h4 class="left">Using the libstdc++ parallel mode</h4> + +<p>To use the libstdc++ parallel mode, compile your application with + the compiler flag <code>-D_GLIBCXX_PARALLEL -fopenmp</code>. This + will link in <code>libgomp</code>, the GNU OpenMP <a + href="http://gcc.gnu.org/onlinedocs/libgomp">implementation</a>, + whose presence is mandatory. In addition, hardware capable of atomic + operations is mandatory. Actually activating these atomic + operations may require explicit compiler flags on some targets + (like sparc and x86), such as <code>-march=i686</code>, + <code>-march=native</code> or <code>-mcpu=v9</code>. +</p> + +<p>Note that the <code>_GLIBCXX_PARALLEL</code> define may change the + sizes and behavior of standard class templates such as + <code>std::search</code>, and therefore one can only link code + compiled with parallel mode and code compiled without parallel mode + if no instantiation of a container is passed between the two + translation units. Parallel mode functionality has distinct linkage, + and cannot be confused with normal mode symbols.</p> + + +<p>The following library components in the include +<code><numeric></code> are included in the parallel mode:</p> +<ul> + <li><code>std::accumulate</code></li> + <li><code>std::adjacent_difference</code></li> + <li><code>std::inner_product</code></li> + <li><code>std::partial_sum</code></li> +</ul> + +<p>The following library components in the include +<code><algorithm></code> are included in the parallel mode:</p> +<ul> + <li><code>std::adjacent_find</code></li> + <li><code>std::count</code></li> + <li><code>std::count_if</code></li> + <li><code>std::equal</code></li> + <li><code>std::find</code></li> + <li><code>std::find_if</code></li> + <li><code>std::find_first_of</code></li> + <li><code>std::for_each</code></li> + <li><code>std::generate</code></li> + <li><code>std::generate_n</code></li> + <li><code>std::lexicographical_compare</code></li> + <li><code>std::mismatch</code></li> + <li><code>std::search</code></li> + <li><code>std::search_n</code></li> + <li><code>std::transform</code></li> + <li><code>std::replace</code></li> + <li><code>std::replace_if</code></li> + <li><code>std::max_element</code></li> + <li><code>std::merge</code></li> + <li><code>std::min_element</code></li> + <li><code>std::nth_element</code></li> + <li><code>std::partial_sort</code></li> + <li><code>std::partition</code></li> + <li><code>std::random_shuffle</code></li> + <li><code>std::set_union</code></li> + <li><code>std::set_intersection</code></li> + <li><code>std::set_symmetric_difference</code></li> + <li><code>std::set_difference</code></li> + <li><code>std::sort</code></li> + <li><code>std::stable_sort</code></li> + <li><code>std::unique_copy</code></li> +</ul> + +<p>The following library components in the includes +<code><set></code> and <code><map></code> are included in the parallel mode:</p> +<ul> + <li><code>std::(multi_)map/set<T>::(multi_)map/set(Iterator begin, Iterator end)</code> (bulk construction)</li> + <li><code>std::(multi_)map/set<T>::insert(Iterator begin, Iterator end)</code> (bulk insertion)</li> +</ul> + + +<h4 class="left">Using the parallel algorithms without parallel mode</h4> + +<p>When it is not feasible to recompile your entire application, or + only specific algorithms need to be parallel-aware, individual + parallel algorithms can be made available explicitly. These + parallel algorithms are functionally equivalent to the standard + drop-in algorithms used in parallel mode, but they are available in + a separate namespace as GNU extensions and may be used in programs + compiled with either release mode or with parallel mode. The + following table provides the names and headers of the parallel + algorithms: +</p> + + +<table title="Parallel algorithms" border="1"> + <tr> + <th>Algorithm</th> + <th>Header</th> + <th>Parallel algorithm</th> + <th>Parallel header</th> + </tr> + <tr> + <td>std::accumulate</td> + <td><numeric></td> + <td>__gnu_parallel::accumulate</td> + <td><parallel/numeric></td> + </tr> + <tr> + <td>std::adjacent_difference</td> + <td><numeric></td> + <td>__gnu_parallel::adjacent_difference</td> + <td><parallel/numeric></td> + </tr> + <tr> + <td>std::inner_product</td> + <td><numeric></td> + <td>__gnu_parallel::inner_product</td> + <td><parallel/numeric></td> + </tr> + <tr> + <td>std::partial_sum</td> + <td><numeric></td> + <td>__gnu_parallel::partial_sum</td> + <td><parallel/numeric></td> + </tr> + <tr> + <td>std::adjacent_find</td> + <td><algorithm></td> + <td>__gnu_parallel::adjacent_find</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::count</td> + <td><algorithm></td> + <td>__gnu_parallel::count</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::count_if</td> + <td><algorithm></td> + <td>__gnu_parallel::count_if</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::equal</td> + <td><algorithm></td> + <td>__gnu_parallel::equal</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::find</td> + <td><algorithm></td> + <td>__gnu_parallel::find</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::find_if</td> + <td><algorithm></td> + <td>__gnu_parallel::find_if</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::find_first_of</td> + <td><algorithm></td> + <td>__gnu_parallel::find_first_of</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::for_each</td> + <td><algorithm></td> + <td>__gnu_parallel::for_each</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::generate</td> + <td><algorithm></td> + <td>__gnu_parallel::generate</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::generate_n</td> + <td><algorithm></td> + <td>__gnu_parallel::generate_n</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::lexicographical_compare</td> + <td><algorithm></td> + <td>__gnu_parallel::lexicographical_compare</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::mismatch</td> + <td><algorithm></td> + <td>__gnu_parallel::mismatch</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::search</td> + <td><algorithm></td> + <td>__gnu_parallel::search</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::search_n</td> + <td><algorithm></td> + <td>__gnu_parallel::search_n</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::transform</td> + <td><algorithm></td> + <td>__gnu_parallel::transform</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::replace</td> + <td><algorithm></td> + <td>__gnu_parallel::replace</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::replace_if</td> + <td><algorithm></td> + <td>__gnu_parallel::replace_if</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::max_element</td> + <td><algorithm></td> + <td>__gnu_parallel::max_element</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::merge</td> + <td><algorithm></td> + <td>__gnu_parallel::merge</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::min_element</td> + <td><algorithm></td> + <td>__gnu_parallel::min_element</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::nth_element</td> + <td><algorithm></td> + <td>__gnu_parallel::nth_element</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::partial_sort</td> + <td><algorithm></td> + <td>__gnu_parallel::partial_sort</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::partition</td> + <td><algorithm></td> + <td>__gnu_parallel::partition</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::random_shuffle</td> + <td><algorithm></td> + <td>__gnu_parallel::random_shuffle</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::set_union</td> + <td><algorithm></td> + <td>__gnu_parallel::set_union</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::set_intersection</td> + <td><algorithm></td> + <td>__gnu_parallel::set_intersection</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::set_symmetric_difference</td> + <td><algorithm></td> + <td>__gnu_parallel::set_symmetric_difference</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::set_difference</td> + <td><algorithm></td> + <td>__gnu_parallel::set_difference</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::sort</td> + <td><algorithm></td> + <td>__gnu_parallel::sort</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::stable_sort</td> + <td><algorithm></td> + <td>__gnu_parallel::stable_sort</td> + <td><parallel/algorithm></td> + </tr> + + <tr> + <td>std::unique_copy</td> + <td><algorithm></td> + <td>__gnu_parallel::unique_copy</td> + <td><parallel/algorithm></td> + </tr> + +</table> + + +<h4 class="left">Parallel mode semantics</h4> + +<p> The parallel mode STL algorithms are currently not exception-safe, +i. e. user-defined functors must not throw exceptions. +</p> + +<p> Since the current GCC OpenMP implementation does not support +OpenMP parallel regions in concurrent threads, +it is not possible to call parallel STL algorithm in +concurrent threads, either. +It might work with other compilers, though.</p> + + +<h4 class="left">Configuration and Tuning</h4> + +<p> Some algorithm variants can be enabled/disabled/selected at compile-time. +See <a href="latest-doxygen/compiletime__settings_8h.html"> +<code><compiletime_settings.h></code></a> and +See <a href="latest-doxygen/compiletime__settings_8h.html"> +<code><features.h></code></a> for details. +</p> + +<p> +To specify the number of threads to be used for an algorithm, +use <code>omp_set_num_threads</code>. +To force a function to execute sequentially, +even though parallelism is switched on in general, +add <code>__gnu_parallel::sequential_tag()</code> +to the end of the argument list. +</p> + +<p> +Parallelism always incurs some overhead. Thus, it is not +helpful to parallelize operations on very small sets of data. +There are measures to avoid parallelizing stuff that is not worth it. +For each algorithm, a minimum problem size can be stated, +usually using the variable +<code>__gnu_parallel::Settings::[algorithm]_minimal_n</code>. +Please see <a href="latest-doxygen/settings_8h.html"> +<code><settings.h></code></a> for details.</p> + + + +<h4 class="left">Interface basics and general design</h4> + +<p>All parallel algorithms are intended to have signatures that are +equivalent to the ISO C++ algorithms replaced. For instance, the +<code>std::adjacent_find</code> function is declared as: +</p> +<pre> +namespace std +{ + template<typename _FIter> + _FIter + adjacent_find(_FIter, _FIter); +} +</pre> + +<p> +Which means that there should be something equivalent for the parallel +version. Indeed, this is the case: +</p> + +<pre> +namespace std +{ + namespace __parallel + { + template<typename _FIter> + _FIter + adjacent_find(_FIter, _FIter); + + ... + } +} +</pre> + +<p>But.... why the elipses? +</p> + +<p> The elipses in the example above represent additional overloads +required for the parallel version of the function. These additional +overloads are used to dispatch calls from the ISO C++ function +signature to the appropriate parallel function (or sequential +function, if no parallel functions are deemed worthy), based on either +compile-time or run-time conditions. +</p> + +<p> Compile-time conditions are referred to as "embarrassingly +parallel," and are denoted with the appropriate dispatch object, ie +one of <code>__gnu_parallel::sequential_tag</code>, +<code>__gnu_parallel::parallel_tag</code>, +<code>__gnu_parallel::balanced_tag</code>, +<code>__gnu_parallel::unbalanced_tag</code>, +<code>__gnu_parallel::omp_loop_tag</code>, or +<code>__gnu_parallel::omp_loop_static_tag</code>. +</p> + +<p> Run-time conditions depend on the hardware being used, the number +of threads available, etc., and are denoted by the use of the enum +<code>__gnu_parallel::parallelism</code>. Values of this enum include +<code>__gnu_parallel::sequential</code>, +<code>__gnu_parallel::parallel_unbalanced</code>, +<code>__gnu_parallel::parallel_balanced</code>, +<code>__gnu_parallel::parallel_omp_loop</code>, +<code>__gnu_parallel::parallel_omp_loop_static</code>, or +<code>__gnu_parallel::parallel_taskqueue</code>. +</p> + +<p> Putting all this together, the general view of overloads for the +parallel algorithms look like this: +</p> +<ul> + <li>ISO C++ signature</li> + <li>ISO C++ signature + sequential_tag argument</li> + <li>ISO C++ signature + parallelism argument</li> +</ul> + +<p> Please note that the implementation may use additional functions +(designated with the <code>_switch</code> suffix) to dispatch from the +ISO C++ signature to the correct parallel version. Also, some of the +algorithms do not have support for run-time conditions, so the last +overload is therefore missing. +</p> + + +<h4 class="left">Relevant namespaces</h4> + +<p> One namespace contain versions of code that are explicitly sequential: +<code>__gnu_serial</code>. +</p> + +<p> Two namespaces contain the parallel mode: +<code>std::__parallel</code> and <code>__gnu_parallel</code>. +</p> + +<p> Parallel implementations of standard components, including +template helpers to select parallelism, are defined in <code>namespace +std::__parallel</code>. For instance, <code>std::transform</code> from +<algorithm> has a parallel counterpart in +<code>std::__parallel::transform</code> from +<parallel/algorithm>. In addition, these parallel +implementations are injected into <code>namespace +__gnu_parallel</code> with using declarations. +</p> + +<p> Support and general infrastructure is in <code>namespace +__gnu_parallel</code>. +</p> + +<p> More information, and an organized index of types and functions +related to the parallel mode on a per-namespace basis, can be found in +the generated source documentation. +</p> + +<h4 class="left">Testing</h4> + +<p> Both the normal conformance and regression tests and the +supplemental performance tests work.</p> + +<p> To run the conformance and regression tests with the parallel mode +active,</p> +<code>make check-parallel</code> + +<p>The log and summary files for conformance testing are in the +<code>testsuite/parallel</code> directory.</p> + +<p> To run the performance tests with the parallel mode active, </p> +<code>make check-performance-parallel</code> + +<p>The result file for performance testing are in the +<code>testsuite</code> directory, in the file +<code>libstdc++_performance.sum</code>. In addition, the policy-based +containers have their own visualizations, which have additional +software dependencies than the usual bare-boned text file, and can be +generated by using the <code>make doc-performance</code> rule in the +testsuite's Makefile.</p> + +<p>Return <a href="#top">to the top of the page</a> or + <a href="http://gcc.gnu.org/libstdc++/">to the libstdc++ homepage</a>. +</p> + + +<h4 class="left">References / Further Reading</h4> + +<p> +Johannes Singler, Peter Sanders, Felix Putze. The Multi-Core Standard Template Library. Euro-Par 2007: Parallel Processing. (LNCS 4641) +</p> + +<p> +Leonor Frias, Johannes Singler: Parallelization of Bulk Operations for STL Dictionaries. Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS) +</p> + +<!-- ####################################################### --> + +<hr /> +<p class="fineprint"><em> +See <a href="17_intro/license.html">license.html</a> for copying conditions. +Comments and suggestions are welcome, and may be sent to +<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>. +</em></p> + + +</body> +</html> |