1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 3 4<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 5<head> 6 <meta name="generator" content= 7 "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" /> 8 9 <title>Trie-Based Containers</title> 10 <meta http-equiv="Content-Type" content= 11 "text/html; charset=us-ascii" /> 12 </head> 13 14<body> 15 <div id="page"> 16 <h1>Trie Design</h1> 17 18 <h2><a name="overview" id="overview">Overview</a></h2> 19 20 <p>The trie-based container has the following declaration:</p> 21 <pre> 22<b>template</b>< 23 <b>typename</b> Key, 24 <b>typename</b> Mapped, 25 <b>typename</b> Cmp_Fn = std::less<Key>, 26 <b>typename</b> Tag = <a href="pat_trie_tag.html">pat_trie_tag</a>, 27 <b>template</b>< 28 <b>typename</b> Const_Node_Iterator, 29 <b>typename</b> Node_Iterator, 30 <b>typename</b> E_Access_Traits_, 31 <b>typename</b> Allocator_> 32 <b>class</b> Node_Update = <a href= 33"null_trie_node_update.html">null_trie_node_update</a>, 34 <b>typename</b> Allocator = std::allocator<<b>char</b>> > 35<b>class</b> <a href= 36"trie.html">trie</a>; 37</pre> 38 39 <p>The parameters have the following meaning:</p> 40 41 <ol> 42 <li><tt>Key</tt> is the key type.</li> 43 44 <li><tt>Mapped</tt> is the mapped-policy, and is explained in 45 <a href="tutorial.html#assoc_ms">Tutorial::Associative 46 Containers::Associative Containers Others than Maps</a>.</li> 47 48 <li><tt>E_Access_Traits</tt> is described in <a href= 49 "#e_access_traits">Element-Access Traits</a>.</li> 50 51 <li><tt>Tag</tt> specifies which underlying data structure 52 to use, and is described shortly.</li> 53 54 <li><tt>Node_Update</tt> is a policy for updating node 55 invariants. This is described in <a href="#invariants">Node 56 Invariants</a>.</li> 57 58 <li><tt>Allocator</tt> is an allocator 59 type.</li> 60 </ol> 61 62 <p>The <tt>Tag</tt> parameter specifies which underlying 63 data structure to use. Instantiating it by <a href= 64 "pat_trie_tag.html">pat_trie_tag</a>, specifies an 65 underlying PATRICIA trie (explained shortly); any other tag is 66 currently illegal.</p> 67 <hr /> 68 69 <p>Following is a description of a (PATRICIA) trie 70 (<tt>pb_ds</tt> follows specifically [<a href= 71 "references.html#okasaki98mereable">okasaki98mereable</a>] and 72 [<a href= 73 "references.html#filliatre2000ptset">filliatre2000ptset</a>]).</p> 74 75 <p>A (PATRICIA) trie is similar to a tree, but with the 76 following differences:</p> 77 78 <ol> 79 <li>It explicitly views keys as a sequence of elements. 80 <i>E.g.</i>, a trie can view a string as a sequence of 81 characters; a trie can view a number as a sequence of 82 bits.</li> 83 84 <li>It is not (necessarily) binary. Each node has fan-out <i>n 85 + 1</i>, where <i>n</i> is the number of distinct 86 elements.</li> 87 88 <li>It stores values only at leaf nodes.</li> 89 90 <li>Internal nodes have the properties that A) each has at 91 least two children, and B) each shares the same prefix with 92 any of its descendant.</li> 93 </ol> 94 95 <p><a href="#e_access_traits">Element-Access Traits</a> shows 96 an example of such a trie.</p> 97 98 <p>A (PATRICIA) trie has some useful properties:</p> 99 100 <ol> 101 <li>It can be configured to use large node fan-out, giving it 102 very efficient find performance (albeit at insertion 103 complexity and size).</li> 104 105 <li>It works well for common-prefix keys.</li> 106 107 <li>It can support efficiently queries such as which keys 108 match a certain prefix. This is sometimes useful in 109 file systems and routers.</li> 110 </ol> 111 112 <p>(We would like to thank Matt Austern for the suggestion to 113 include tries.)</p> 114 115 <h2><a name="e_access_traits" id= 116 "e_access_traits">Element-Access Traits</a></h2> 117 118 <p>A trie inherently views its keys as sequences of elements. 119 For example, a trie can view a string as a sequence of 120 characters. A trie needs to map each of <i>n</i> elements to a 121 number in <i>{0, n - 1}</i>. For example, a trie can map a 122 character <tt>c</tt> to 123 <tt>static_cast<size_t>(c)</tt>.</p> 124 125 <p>Seemingly, then, a trie can assume that its keys support 126 (const) iterators, and that the <tt>value_type</tt> of this 127 iterator can be cast to a <tt>size_t</tt>. There are several 128 reasons, though, to decouple the mechanism by which the trie 129 accesses its keys' elements from the trie:</p> 130 131 <ol> 132 <li>In some cases, the numerical value of an element is 133 inappropriate. Consider a trie storing DNA strings. It is 134 logical to use a trie with a fan-out of <i>5 = 1 + |{'A', 'C', 135 'G', 'T'}|</i>. This requires mapping 'T' to 3, though.</li> 136 137 <li>In some cases the keys' iterators are different than what 138 is needed. For example, a trie can be used to search for 139 common <u>suffixes</u>, by using strings' 140 <tt>reverse_iterator</tt>. As another example, a trie mapping 141 UNICODE strings would have a huge fan-out if each node would 142 branch on a UNICODE character; instead, one can define an 143 iterator iterating over 8-bit (or less) groups.</li> 144 </ol> 145 146 <p><a href= 147 "trie.html">trie</a> is, 148 consequently, parametrized by <tt>E_Access_Traits</tt> - 149 traits which instruct how to access sequences' elements. 150 <a href= 151 "string_trie_e_access_traits.html"><tt>string_trie_e_access_traits</tt></a> 152 is a traits class for strings. Each such traits define some 153 types, <i>e.g.</i>,</p> 154 <pre> 155<b>typename</b> E_Access_Traits::const_iterator 156</pre> 157 158 <p>is a const iterator iterating over a key's elements. The 159 traits class must also define methods for obtaining an iterator 160 to the first and last element of a key.</p> 161 162 <p>Figure <a href="#pat_trie">A PATRICIA trie</a> shows a 163 (PATRICIA) trie resulting from inserting the words: "I wish 164 that I could ever see a poem lovely as a trie" (which, 165 unfortunately, does not rhyme).</p> 166 167 <p>The leaf nodes contain values; each internal node contains 168 two <tt><b>typename</b> E_Access_Traits::const_iterator</tt> 169 objects, indicating the maximal common prefix of all keys in 170 the sub-tree. For example, the shaded internal node roots a 171 sub-tree with leafs "a" and "as". The maximal common prefix is 172 "a". The internal node contains, consequently, to const 173 iterators, one pointing to <tt>'a'</tt>, and the other to 174 <tt>'s'</tt>.</p> 175 176 <h6 class="c1"><a name="pat_trie" id="pat_trie"><img src= 177 "pat_trie.png" alt="no image" /></a></h6> 178 179 <h6 class="c1">A PATRICIA trie.</h6> 180 181 <h2><a name="invariants" id="invariants">Node 182 Invariants</a></h2> 183 184 <p>Trie-based containers support node invariants, as do 185 tree-based containers (see <a href= 186 "tree_based_containers.html#invariants">Tree-Based 187 Containers::Node Invariants</a>). There are two minor 188 differences, though, which, unfortunately, thwart sharing them 189 sharing the same node-updating policies:</p> 190 191 <ol> 192 <li>A trie's <tt>Node_Update</tt> template-template 193 parameter is parametrized by <tt>E_Access_Traits</tt>, while 194 a tree's <tt>Node_Update</tt> template-template parameter is 195 parametrized by <tt>Cmp_Fn</tt>.</li> 196 197 <li>Tree-based containers store values in all nodes, while 198 trie-based containers (at least in this implementation) store 199 values in leafs.</li> 200 </ol> 201 202 <p>Figure <a href="#trie_node_update_cd">A trie and its update 203 policy</a> shows the scheme, as well as some predefined 204 policies (which are explained below).</p> 205 206 <h6 class="c1"><a name="trie_node_update_cd" id= 207 "trie_node_update_cd"><img src= 208 "trie_node_update_policy_cd.png" alt="no image" /></a></h6> 209 210 <h6 class="c1">A trie and its update policy.</h6> 211 212 <p><tt>pb_ds</tt> offers the following pre-defined trie node 213 updating policies:</p> 214 215 <ol> 216 <li><a href= 217 "trie_order_statistics_node_update.html"><tt>trie_order_statistics_node_update</tt></a> 218 supports order statistics.</li> 219 220 <li><a href= 221 "trie_prefix_search_node_update.html"><tt>trie_prefix_search_node_update</tt></a> 222 supports searching for ranges that match a given prefix. See 223 <a href= 224 "http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/trie_prefix_search.cc"><tt>trie_prefix_search.cc</tt></a>.</li> 225 226 <li><a href= 227 "null_trie_node_update.html"><tt>null_trie_node_update</tt></a> 228 is the null node updater.</li> 229 </ol> 230 231 <h2><a name="add_methods" id="add_methods">Additional 232 Methods</a></h2> 233 234 <p>Trie-based containers support split and join methods; the 235 rationale is equal to that of tree-based containers supporting 236 these methods (see <a href= 237 "tree_based_containers.html#add_methods">Tree-Based 238 Containers::Additional Methods</a>).</p> 239 </div> 240</body> 241</html> 242