diff options
Diffstat (limited to 'libstdc++-v3/doc/html/23_containers/howto.html')
-rw-r--r-- | libstdc++-v3/doc/html/23_containers/howto.html | 457 |
1 files changed, 457 insertions, 0 deletions
diff --git a/libstdc++-v3/doc/html/23_containers/howto.html b/libstdc++-v3/doc/html/23_containers/howto.html new file mode 100644 index 00000000000..c4b6eb856f5 --- /dev/null +++ b/libstdc++-v3/doc/html/23_containers/howto.html @@ -0,0 +1,457 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!DOCTYPE html + PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> + +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> +<head> + <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> + <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" /> + <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" /> + <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 23." /> + <meta name="GENERATOR" content="vi and eight fingers" /> + <title>libstdc++ HOWTO: Chapter 23: Containers</title> +<link rel="StyleSheet" href="../lib3styles.css" type="text/css" /> +<link rel="Start" href="../documentation.html" type="text/html" + title="GNU C++ Standard Library" /> +<link rel="Prev" href="../22_locale/howto.html" type="text/html" + title="Localization" /> +<link rel="Next" href="../24_iterators/howto.html" type="text/html" + title="Iterators" /> +<link rel="Copyright" href="../17_intro/license.html" type="text/html" /> +<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." /> +</head> +<body> + +<h1 class="centered"><a name="top">Chapter 23: Containers</a></h1> + +<p>Chapter 23 deals with container classes and what they offer. +</p> + + +<!-- ####################################################### --> +<hr /> +<h1>Contents</h1> +<ul> + <li><a href="#1">Making code unaware of the container/array difference</a></li> + <li><a href="#2">Variable-sized bitmasks</a></li> + <li><a href="#3">Containers and multithreading</a></li> + <li><a href="#4">"Hinting" during insertion</a></li> + <li><a href="#5">Bitmasks and string arguments</a></li> + <li><a href="#6"><code>std::list::size()</code> is O(n)!</a></li> + <li><a href="#7">Space overhead management for vectors</a></li> +</ul> + +<hr /> + +<!-- ####################################################### --> + +<h2><a name="1">Making code unaware of the container/array difference</a></h2> + <p>You're writing some code and can't decide whether to use builtin + arrays or some kind of container. There are compelling reasons + to use one of the container classes, but you're afraid that you'll + eventually run into difficulties, change everything back to arrays, + and then have to change all the code that uses those data types to + keep up with the change. + </p> + <p>If your code makes use of the standard algorithms, this isn't as + scary as it sounds. The algorithms don't know, nor care, about + the kind of "container" on which they work, since the + algorithms are only given endpoints to work with. For the container + classes, these are iterators (usually <code>begin()</code> and + <code>end()</code>, but not always). For builtin arrays, these are + the address of the first element and the + <a href="../24_iterators/howto.html#2">past-the-end</a> element. + </p> + <p>Some very simple wrapper functions can hide all of that from the + rest of the code. For example, a pair of functions called + <code>beginof</code> can be written, one that takes an array, another + that takes a vector. The first returns a pointer to the first + element, and the second returns the vector's <code>begin()</code> + iterator. + </p> + <p>The functions should be made template functions, and should also + be declared inline. As pointed out in the comments in the code + below, this can lead to <code>beginof</code> being optimized out of + existence, so you pay absolutely nothing in terms of increased + code size or execution time. + </p> + <p>The result is that if all your algorithm calls look like + </p> + <pre> + std::transform(beginof(foo), endof(foo), beginof(foo), SomeFunction);</pre> + <p>then the type of foo can change from an array of ints to a vector + of ints to a deque of ints and back again, without ever changing any + client code. + </p> + <p>This author has a collection of such functions, called "*of" + because they all extend the builtin "sizeof". It started + with some Usenet discussions on a transparent way to find the length + of an array. A simplified and much-reduced version for easier + reading is <a href="wrappers_h.txt">given here</a>. + </p> + <p>Astute readers will notice two things at once: first, that the + container class is still a <code>vector<T></code> instead of a + more general <code>Container<T></code>. This would mean that + three functions for <code>deque</code> would have to be added, another + three for <code>list</code>, and so on. This is due to problems with + getting template resolution correct; I find it easier just to + give the extra three lines and avoid confusion. + </p> + <p>Second, the line + </p> + <pre> + inline unsigned int lengthof (T (&)[sz]) { return sz; } </pre> + <p>looks just weird! Hint: unused parameters can be left nameless. + </p> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="2">Variable-sized bitmasks</a></h2> + <p>No, you cannot write code of the form + </p> + <!-- Careful, the leading spaces in PRE show up directly. --> + <pre> + #include <bitset> + + void foo (size_t n) + { + std::bitset<n> bits; + .... + } </pre> + <p>because <code>n</code> must be known at compile time. Your compiler is + correct; it is not a bug. That's the way templates work. (Yes, it + <em>is</em> a feature.) + </p> + <p>There are a couple of ways to handle this kind of thing. Please + consider all of them before passing judgement. They include, in + no particular order: + </p> + <ul> + <li>A very large N in <code>bitset<N></code>.</li> + <li>A container<bool>.</li> + <li>Extremely weird solutions.</li> + </ul> + <p><strong>A very large N in + <code>bitset<N></code>. </strong> It has + been pointed out a few times in newsgroups that N bits only takes up + (N/8) bytes on most systems, and division by a factor of eight is pretty + impressive when speaking of memory. Half a megabyte given over to a + bitset (recall that there is zero space overhead for housekeeping info; + it is known at compile time exactly how large the set is) will hold over + four million bits. If you're using those bits as status flags (e.g., + "changed"/"unchanged" flags), that's a <em>lot</em> + of state. + </p> + <p>You can then keep track of the "maximum bit used" during some + testing runs on representative data, make note of how many of those bits + really need to be there, and then reduce N to a smaller number. Leave + some extra space, of course. (If you plan to write code like the + incorrect example above, where the bitset is a local variable, then you + may have to talk your compiler into allowing that much stack space; + there may be zero space overhead, but it's all allocated inside the + object.) + </p> + <p><strong>A container<bool>. </strong> The Committee + made provision + for the space savings possible with that (N/8) usage previously mentioned, + so that you don't have to do wasteful things like + <code>Container<char></code> or + <code>Container<short int></code>. + Specifically, <code>vector<bool></code> is required to be + specialized for that space savings. + </p> + <p>The problem is that <code>vector<bool></code> doesn't behave like a + normal vector anymore. There have been recent journal articles which + discuss the problems (the ones by Herb Sutter in the May and + July/August 1999 issues of + <u>C++ Report</u> cover it well). Future revisions of the ISO C++ + Standard will change the requirement for <code>vector<bool></code> + specialization. In the meantime, <code>deque<bool></code> is + recommended (although its behavior is sane, you probably will not get + the space savings, but the allocation scheme is different than that + of vector). + </p> + <p><strong>Extremely weird solutions. </strong> If you have + access to + the compiler and linker at runtime, you can do something insane, like + figuring out just how many bits you need, then writing a temporary + source code file. That file contains an instantiation of + <code>bitset</code> + for the required number of bits, inside some wrapper functions with + unchanging signatures. Have your program then call the + compiler on that file using Position Independent Code, then open the + newly-created object file and load those wrapper functions. You'll have + an instantiation of <code>bitset<N></code> for the exact + <code>N</code> + that you need at the time. Don't forget to delete the temporary files. + (Yes, this <em>can</em> be, and <em>has been</em>, done.) + </p> + <!-- I wonder if this next paragraph will get me in trouble... --> + <p>This would be the approach of either a visionary genius or a raving + lunatic, depending on your programming and management style. Probably + the latter. + </p> + <p>Which of the above techniques you use, if any, are up to you and your + intended application. Some time/space profiling is indicated if it + really matters (don't just guess). And, if you manage to do anything + along the lines of the third category, the author would love to hear + from you... + </p> + <p>Also note that the implementation of bitset used in libstdc++ has + <a href="../ext/sgiexts.html#ch23">some extensions</a>. + </p> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="3">Containers and multithreading</a></h2> + <p>This section discusses issues surrounding the design of + multithreaded applications which use Standard C++ containers. + All information in this section is current as of the gcc 3.0 + release and all later point releases. Although earlier gcc + releases had a different approach to threading configuration and + proper compilation, the basic code design rules presented here + were similar. For information on all other aspects of + multithreading as it relates to libstdc++, including details on + the proper compilation of threaded code (and compatibility between + threaded and non-threaded code), see Chapter 17. + </p> + <p>Two excellent pages to read when working with the Standard C++ + containers and threads are + <a href="http://www.sgi.com/tech/stl/thread_safety.html">SGI's + http://www.sgi.com/tech/stl/thread_safety.html</a> and + <a href="http://www.sgi.com/tech/stl/Allocators.html">SGI's + http://www.sgi.com/tech/stl/Allocators.html</a>. + </p> + <p><em>However, please ignore all discussions about the user-level + configuration of the lock implementation inside the STL + container-memory allocator on those pages. For the sake of this + discussion, libstdc++ configures the SGI STL implementation, + not you. This is quite different from how gcc pre-3.0 worked. + In particular, past advice was for people using g++ to + explicitly define _PTHREADS or other macros or port-specific + compilation options on the command line to get a thread-safe + STL. This is no longer required for any port and should no + longer be done unless you really know what you are doing and + assume all responsibility.</em> + </p> + <p>Since the container implementation of libstdc++ uses the SGI + code, we use the same definition of thread safety as SGI when + discussing design. A key point that beginners may miss is the + fourth major paragraph of the first page mentioned above + ("For most clients,"...), which points out that + locking must nearly always be done outside the container, by + client code (that'd be you, not us). There is a notable + exceptions to this rule. Allocators called while a container or + element is constructed uses an internal lock obtained and + released solely within libstdc++ code (in fact, this is the + reason STL requires any knowledge of the thread configuration). + </p> + <p>For implementing a container which does its own locking, it is + trivial to provide a wrapper class which obtains the lock (as + SGI suggests), performs the container operation, and then + releases the lock. This could be templatized <em>to a certain + extent</em>, on the underlying container and/or a locking + mechanism. Trying to provide a catch-all general template + solution would probably be more trouble than it's worth. + </p> + <p>The STL implementation is currently configured to use the + high-speed caching memory allocator. Some people like to + test and/or normally run threaded programs with a different + default. For all details about how to globally override this + at application run-time see <a href="../ext/howto.html#3">here</a>. + </p> + <p>There is a better way (not standardized yet): It is possible to + force the malloc-based allocator on a per-case-basis for some + application code. The library team generally believes that this + is a better way to tune an application for high-speed using this + implementation of the STL. There is + <a href="../ext/howto.html#3">more information on allocators here</a>. + </p> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="4">"Hinting" during insertion</a></h2> + <p>Section [23.1.2], Table 69, of the C++ standard lists this function + for all of the associative containers (map, set, etc): + </p> + <pre> + a.insert(p,t);</pre> + <p>where 'p' is an iterator into the container 'a', and 't' is the item + to insert. The standard says that "<code>t</code> is inserted + as close as possible to the position just prior to + <code>p</code>." (Library DR #233 addresses this topic, referring to + <a href='http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1780.html'>N1780</a>. + Since version 4.2 GCC implements the resolution to DR 233, so that + insertions happen as close as possible to the hint. For earlier releases + the hint was only used as described below. + </p> + <p>Here we'll describe how the hinting works in the libstdc++ + implementation, and what you need to do in order to take advantage of + it. (Insertions can change from logarithmic complexity to amortized + constant time, if the hint is properly used.) Also, since the current + implementation is based on the SGI STL one, these points may hold true + for other library implementations also, since the HP/SGI code is used + in a lot of places. + </p> + <p>In the following text, the phrases <em>greater than</em> and <em>less + than</em> refer to the results of the strict weak ordering imposed on + the container by its comparison object, which defaults to (basically) + "<". Using those phrases is semantically sloppy, but I + didn't want to get bogged down in syntax. I assume that if you are + intelligent enough to use your own comparison objects, you are also + intelligent enough to assign "greater" and "lesser" + their new meanings in the next paragraph. *grin* + </p> + <p>If the <code>hint</code> parameter ('p' above) is equivalent to: + </p> + <ul> + <li><code>begin()</code>, then the item being inserted should have a key + less than all the other keys in the container. The item will + be inserted at the beginning of the container, becoming the new + entry at <code>begin()</code>. + </li> + <li><code>end()</code>, then the item being inserted should have a key + greater than all the other keys in the container. The item will + be inserted at the end of the container, becoming the new entry + at <code>end()</code>. + </li> + <li>neither <code>begin()</code> nor <code>end()</code>, then: Let <code>h</code> + be the entry in the container pointed to by <code>hint</code>, that + is, <code>h = *hint</code>. Then the item being inserted should have + a key less than that of <code>h</code>, and greater than that of the + item preceding <code>h</code>. The new item will be inserted + between <code>h</code> and <code>h</code>'s predecessor. + </li> + </ul> + <p>For <code>multimap</code> and <code>multiset</code>, the restrictions are + slightly looser: "greater than" should be replaced by + "not less than" and "less than" should be replaced + by "not greater than." (Why not replace greater with + greater-than-or-equal-to? You probably could in your head, but the + mathematicians will tell you that it isn't the same thing.) + </p> + <p>If the conditions are not met, then the hint is not used, and the + insertion proceeds as if you had called <code> a.insert(t) </code> + instead. (<strong>Note </strong> that GCC releases prior to 3.0.2 + had a bug in the case with <code>hint == begin()</code> for the + <code>map</code> and <code>set</code> classes. You should not use a hint + argument in those releases.) + </p> + <p>This behavior goes well with other containers' <code>insert()</code> + functions which take an iterator: if used, the new item will be + inserted before the iterator passed as an argument, same as the other + containers. + </p> + <p><strong>Note </strong> also that the hint in this implementation is a + one-shot. The older insertion-with-hint routines check the immediately + surrounding entries to ensure that the new item would in fact belong + there. If the hint does not point to the correct place, then no + further local searching is done; the search begins from scratch in + logarithmic time. + </p> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="5">Bitmasks and string arguments</a></h2> + <p>Bitmasks do not take char* nor const char* arguments in their + constructors. This is something of an accident, but you can read + about the problem: follow the library's "Links" from the + homepage, and from the C++ information "defect reflector" + link, select the library issues list. Issue number 116 describes the + problem. + </p> + <p>For now you can simply make a temporary string object using the + constructor expression: + </p> + <pre> + std::bitset<5> b ( std::string("10110") ); + </pre> + instead of + <pre> + std::bitset<5> b ( "10110" ); // invalid + </pre> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="6"><code>std::list::size()</code> is O(n)!</a></h2> + <p>Yes it is, and that's okay. This is a decision that we preserved when + we imported SGI's STL implementation. The following is quoted from + <a href="http://www.sgi.com/tech/stl/FAQ.html">their FAQ</a>: + </p> + <blockquote> + <p>The size() member function, for list and slist, takes time + proportional to the number of elements in the list. This was a + deliberate tradeoff. The only way to get a constant-time size() for + linked lists would be to maintain an extra member variable containing + the list's size. This would require taking extra time to update that + variable (it would make splice() a linear time operation, for example), + and it would also make the list larger. Many list algorithms don't + require that extra word (algorithms that do require it might do better + with vectors than with lists), and, when it is necessary to maintain + an explicit size count, it's something that users can do themselves. + </p> + <p>This choice is permitted by the C++ standard. The standard says that + size() "should" be constant time, and "should" + does not mean the same thing as "shall". This is the + officially recommended ISO wording for saying that an implementation + is supposed to do something unless there is a good reason not to. + </p> + <p>One implication of linear time size(): you should never write + </p> + <pre> + if (L.size() == 0) + ...</pre> + Instead, you should write + <pre> + if (L.empty()) + ...</pre> + </blockquote> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + +<hr /> +<h2><a name="7">Space overhead management for vectors</a></h2> + <p>In + <a href="http://gcc.gnu.org/ml/libstdc++/2002-04/msg00105.html">this + message to the list</a>, Daniel Kostecky announced work on an + alternate form of <code>std::vector</code> that would support hints + on the number of elements to be over-allocated. The design was also + described, along with possible implementation choices. + </p> + <p>The first two alpha releases were announced + <a href="http://gcc.gnu.org/ml/libstdc++/2002-07/msg00048.html">here</a> + and + <a href="http://gcc.gnu.org/ml/libstdc++/2002-07/msg00111.html">here</a>. + The releases themselves are available at + <a href="http://www.kotelna.sk/dk/sw/caphint/"> + http://www.kotelna.sk/dk/sw/caphint/</a>. + </p> + <p>Return <a href="#top">to top of page</a> or + <a href="../faq/index.html">to the FAQ</a>. + </p> + + +<!-- ####################################################### --> + +<hr /> +<p class="fineprint"><em> +See <a href="../17_intro/license.html">license.html</a> for copying conditions. +Comments and suggestions are welcome, and may be sent to +<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>. +</em></p> + + +</body> +</html> |