1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> 2<html> 3 <head> 4 <title>Data-Structure Genericity</title> 5 <meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1"> 6 <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5"> 7 </head> 8<body bgcolor = "white"> 9<h1>Data-Structure Genericity</h1> 10 11<p> 12 This section describes genericity over different underlying data-structures. It is organized as follows. 13</p> 14<ol> 15 <li><a href = "#problem">The Basic Problem</a></li> 16 <li><a href = "#ds_hierarchy">Container Hierarchy</a></li> 17 <li><a href = "#ds_traits">Data-Structure Tags and Traits</a></li> 18 <li><a href = "#find_range">Find-Type and Range-Type Methods and Iterators</a></li> 19</ol> 20 21<h2><a name = "problem">The Basic Problem</a></h2> 22 23<p> 24 The design attempts to address the following problem. When writing a function manipulating a generic container object, what is the behaviour of the object? <i>E.g.</i>, suppose one writes 25</p> 26<pre> 27<b>template</b>< 28 <b>class</b> Cntnr> 29<b>void</b> some_op_sequence 30 (Cntnr &r_cntnr) 31{ 32 ... 33} 34</pre> 35then one needs to address the following questions in the body 36of <tt>some_op_sequence</tt>: 37<ol> 38 <li> Which types and methods does <tt>Cntnr</tt> support? Containers based on hash tables can be queries for the hash-functor type and object; this is meaningless for tree-based containers. Containers based on trees can be split, joined, or can erase iterators and return the following iterator; this cannot be done by hash-based containers. </li> 39 <li> 40 What are the guarantees of <tt>Cntnr</tt>? A container based on a probing hash-table invalidates all iterators when it is modified; this is not the case for containers based on node-based trees. Containers based on a node-based tree can be split or joined without exceptions; this is not the case for containers based on vector-based trees. 41 </li> 42 <li> How does the container maintain its elements? containers based on splay trees or lists with update policies "cache" "frequently accessed" elements; containers based on most other underlying data-structures do not.</li> 43</ol> 44 45<h2><a name = "ds_hierarchy">Container Hierarchy</a></h2> 46 47<p> 48 Figure 49<a href = "#cd">Class hierarchy</a> 50 shows the container hierarchy. 51</p> 52<ol> 53 <li> 54<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a> 55contains types and methods shared by all associative containers, <i>e.g.</i>, the type <tt>allocator</tt> and the method <tt>find</tt>. 56 </li> 57 <li><a href = "basic_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a> subclasses 58<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>, and contains 59types and methods shared by all hash-based containers, <i>e.g.</i>, the type <tt>hash_fn</tt>. 60 </li> 61 <ol> 62 <li> 63<a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a> 64and 65<a href = "gp_hash_assoc_cntnr.html"><tt>gp_hash_assoc_cntnr</tt></a> 66each subclass 67<a href = "basic_hash_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a>, and encapsulate collision-chaining and (general) probing hash tables, respectively. These two types of hash tables have somewhat different policies and methods (<i>i.e.</i>, constructors and policy-access methods). 68 </li> 69 </ol> 70 <li> 71<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a> 72subclasses one of 73<a href = "basic_tree_assoc_cntnr.html"><tt>basic_tree_assoc_cntnr</tt></a> which 74subclasses 75<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>. 76<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a> 77 encapsulates a tree-based container, and is parameterized by which underlying data-structure to use (<i>e.g.</i>, a red-black tree); 78<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>. 79is specialized to the capabilities of the underlying structure. 80<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a> contains some additional methods over 81<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>, 82<i>e.g.</i>, split and join methods. 83 </li> 84 <li> 85<a href = "lu_assoc_cntnr.html"><tt>lu_assoc_cntnr</tt></a> 86subclasses 87<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>, 88and encapsulates a list with update policies. 89 </li> 90</ol> 91 92<p> 93 The hierarchy is composed naturally, such that each container inherits 94all types and methods from its base. <a href = "#ds_traits">Data-Structure Tags and Traits</a> discusses how to query which types and methods each container supports. 95</p> 96 97 98 99<h2><a name = "ds_traits">Data-Structure Tags and Traits</a></h2> 100 101<p> 102 <tt>pb_assoc</tt> contains a tag and traits mechanism similar to that of the STL's iterators. 103</p> 104 105<p> 106 <tt>pb_assoc</tt> contains a tag hierarchy corresponding to the hierarchy 107in Figure 108<a href = "#cd">Class hierarchy</a>. 109The tag hierarchy is shown in Figure 110<a href = "#ds_tag_cd">Data-structure tag class hierarchy</a>. 111</p> 112 113<h6 align = "center"> 114<a name = "cd"> 115<img src = "ds_tag_cd.jpg" width = "70%" alt = "no image"> 116</h6> 117</a> 118<h6 align = "center"> 119Data-structure tag class hierarchy. 120</h6> 121 122<p> 123 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a> publicly defines 124<tt>ds_category</tt> as one of the classes in Figure 125. 126Given any container <tt>Cntnr</tt>, the tag of the underlying data-structure can be found via <tt><b>typename</b> Cntnr::ds_category</tt>. 127</p> 128 129<p> 130 Additionally, a traits mechanism can be used to query a container type for its attributes. Given any container <tt>Cntnr</tt>, then 131<tt><a href = "ds_traits.html">ds_traits</a><Cntnr></tt> 132is a traits class identifying the properties of the container. 133</p> 134 135<p> 136 To find if a container can throw when a key is erased (which is true for vector-based trees, for example), one can use 137</p> 138<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::erase_can_throw</tt>, 139for example. 140 141<p> 142 Some of the definitions in 143<a href = "ds_traits.html"><tt>ds_traits</tt></a> 144are dependent on other definitions. <i>E.g.</i>, if 145<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join</tt> 146is <tt><b>true</b></tt> (which is the case for containers based on trees), 147then 148<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt> 149indicates whether splits or joins can throw exceptions (which is true for vector-based trees); otherwise 150<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt> 151will yield a compilation error. (This is somewhat similar to a compile-time 152version of the COM model 153[<a href = "references.html#mscom">mscom</a>]). 154 155 156<h2><a name = "find_range">Find-Type and Range-Type Methods and Iterators</a></h2> 157 158<p> 159 <tt>pb_assoc</tt> differentiates between two types of methods: find-type methods, and range-type methods. For example, <tt>find</tt> is a find-type method, since a container object searches for an element with a given key; <tt>insert</tt> is a find-type method, since, by STL convention, a container object returns an iterator corresponding to an element with a given key; <tt>begin</tt> and <tt>end</tt> are range-type methods, since they are not used to find a specific element, but rather to go over all elements in a container object. 160</p> 161 162<p> 163 Correspondingly, containers in <tt>pb_assoc</tt> define two families of iterators. <tt>const_find_iterator</tt> and <tt>find_iterator</tt> are the iterator types returned by find-type methods; <tt>const_iterator</tt> and <tt>iterator</tt> are the iterator types returned by range-type methods. 164</p> 165 166<p> 167 The relationship between these iterator types varies between container types. In a tree-based container, for example, <tt>const_find_iterator</tt> and <tt>const_iterator</tt> are synonymous, and <tt>find_iterator</tt> and <tt>iterator</tt> are synonymous; in a hash-based container, for example, this is not the case. Futhermore, find-type iterators in a hash-based container lack movement operators, such as 168 <tt><b>operator++</b></tt>. 169 All containers, however, maintain the invariants shown in Figure 170 171. 172</p> 173 174 175<p> 176 This distinction between find-type and range-type iterators and methods, while complicating the interface, has several advantages: 177</p> 178 179<h3>Iterators in unordered container types</h3> 180 181<p> 182 Given an unordered container type, <i>e.g.</i>, a hash-based container, it makes no sense to move an iterator returned by a find-type method. 183Let <tt>cntnr</tt> be an associative-container object, and 184consider: 185</p> 186 187<pre> 188std::for_each(m.find(1), m.find(5), foo); 189</pre> 190 191<p> 192which applies <tt>foo</tt> to all elements in <tt>m</tt> 193between <tt>1</tt> and <tt>5</tt>. 194</p> 195 196<p>If <tt>cntnr</tt> is a 197tree-based container object, then an in-order walk will apply <tt>foo</tt> 198to the relevant elements, <i>e.g.</i>, as in Figure 199<a href = "#range_it_in_hts">Range iteration in different data-structures</a> 200-A. If <tt>m</tt> is a 201hash-based container, then the order of elements between any two 202elements is undefined (and probably time-varying); there is no 203guarantee that the elements traversed will coincide with the 204<i>logical</i> elements between 1 and 5, <i>e.g.</i>, as in 205Figure <a href = "#range_it_in_hts">Range iteration in different data-structures</a>-B. 206</p> 207 208<p> 209The application of a 210range function <i>e.g.</i>, <tt>for_each</tt>, to a 211pair of hash-based container's iterators is possibly sensical only 212if the iterators are those returned by <tt>begin</tt> and <tt>end</tt>, 213respectively. Therefore, the iterator returned by 214<tt>m</tt>'s <tt>find</tt> method should be immovable. 215</p> 216 217<p> 218 Another point also indicates that hash-based containers' 219find-type iterators and range-type iterators should be distinct. 220Consider Figure 221<a href = "#find_its_in_hash_tables"> 222Find-type iterators in hash tables</a>-A. 223An 224(immovable) find-type iterator, designed only to access an 225element, requires at most a single pointer to the element's link. 226Conversely, an iterator designed for range operations 227requires some more information <i>e.g.</i>, the bucket number), 228since a cross-list traversal might be necessary. Alternatively, 229the lists might be linked, forming a monolithic total-element 230list, as in Figure 231<a href = "#find_its_in_hash_tables"> 232Find-type iterators in hash tables</a>-B (this seems 233similar to the Dinkumware design 234[<a href = "references.html#dinkumware_stl">dinkumware_stl</a>]). This, 235however, complicates the hash-table's operations. 236 237<h6 align = "center"> 238<a name = "range_it_in_hts"> 239<img src = "find_iterators_range_ops_1.jpg" width = "70%" alt = "no image"> 240</a> 241</h6> 242<h6 align = "center"> 243Range iteration in different data-structures. 244</h6> 245 246 247<h6 align = "center"> 248<a name = "find_its_in_hash_tables"> 249<img src = "find_iterators_range_ops_2.jpg" width = "70%" alt = "no image"> 250</a> 251</h6> 252<h6 align = "center"> 253Find-type iterators in hash tables. 254</h6> 255 256<p> 257 As a consequence of this design, 258</p> 259 260<pre> 261std::for_each(m.find(1), m.find(5), foo); 262</pre> 263 264<p> 265 will compile for tree-based containers, but will not compile 266for hash-tables or other types. The returned type of <tt>find</tt> 267is a find-type iterator. For tree-based containers, this is synonymous 268with a range-type iterator, and therefore supports <tt><b>operator</b>++</tt>; 269for other types of containers, a find-type iterator lacks <tt><b>operator</b>++</tt>. 270</p> 271 272<h3>Invalidation Guarantees</h3> 273 274<p> 275 Consider the following snippet: 276</p> 277 278<pre> 279it = c.find(3); 280 281c.erase(5); 282</pre> 283 284<p> 285 Following the call to <tt>erase</tt>, what is the validity 286of <tt>it</tt>: can it be dereferenced? can it be incremented? 287</p> 288 289<p> 290 The answer depends on the underlying data-structure of the container. 291Figure 292<a href = "#invalidation_guarantee_erase">Effect of erase in different underlying data-structures</a> 293shows three cases: A1 and A2 show a red-black tree; 294B1 and B2 show an ordered-vector tree; C1 and C2 295show a collision-chaining hash table. 296</p> 297 298<h6 align = "center"> 299<a name = "invalidation_guarantee_erase"> 300<img src = "invalidation_guarantee_erase.jpg" width = "70%" alt = "no image"> 301</h6> 302</a> 303<h6 align = "center"> 304Effect of erase in different underlying data-structures. 305</h6> 306 307 308<ol> 309 <li> 310 `Erasing 5 from A1 yields A2. Clearly, an iterator to 3 311 can be dereferenced and incremented. 312 </li> 313 <li> 314 Erasing 5 from B1 yields B2. Clearly, an iterator to 3 is 315 not valid at all. 316 </li> 317 <li> 318 Erasing 5 from C1 yields C2. Here the situation is more complicated. 319On the one hand, incrementing <tt>it</tt> can be undefined. On the other 320hand, there is no problem in dereferencing <tt>it</tt>. In 321classic STL, it is not possible to express whether <tt>it</tt> 322is valid or not. 323 </li> 324</ol> 325 326<p> 327 Thus again, the iterator concept seems overloaded. Distinguishing 328between find and range types allows fine-grained invalidation guarantees. 329<a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a> 330shows tags corresponding to different types of invalidation guarantees. 331</p> 332 333<h6 align = "center"> 334<a name = "invalidation_guarantee_cd"> 335<img src = "invalidation_guarantee_cd.jpg" width = "70%" alt = "no image"> 336</h6> 337</a> 338<h6 align = "center"> 339Invalidation guarantees class hierarchy. 340</h6> 341 342<ol> 343 <li> <a href = "basic_invalidation_guarantee.html"><tt>basic_invalidation_guarantee</tt></a> corresponds to a basic guarantee that a find-type iterator, a found pointer, or a found reference, remains valid as long as the container object is not modified. 344 </li> 345 <li> <a href = "find_invalidation_guarantee.html"><tt>find_invalidation_guarantee</tt></a> corresponds to a guarantee that a find-type iterator, a found pointer, or a found reference, remains valid even if the containter object is modified. 346 </li> 347 <li> <a href = "range_invalidation_guarantee.html"><tt>range_invalidation_guarantee</tt></a> corresponds to a guarantee that a range-type iterator remains valid even if the containter object is modified. 348 </li> 349</ol> 350 351 352<p> 353 To find the invalidation guarantee of a container, one can use 354</p> 355<pre> 356<b>typename</b> <a href = "ds_traits.html">ds_traits</a><Cntnr>::invalidation_guarantee 357</pre> 358 359<p> 360 which is one of the classes in Figure 361<a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a>. 362</p> 363 364 365 366</body> 367 368</html> 369