1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
2    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3
4<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5<head>
6  <meta name="generator" content=
7  "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
8
9  <title>Trie-Based Containers</title>
10  <meta http-equiv="Content-Type" content=
11  "text/html; charset=us-ascii" />
12  </head>
13
14<body>
15  <div id="page">
16    <h1>Trie Design</h1>
17
18    <h2><a name="overview" id="overview">Overview</a></h2>
19
20    <p>The trie-based container has the following declaration:</p>
21    <pre>
22<b>template</b>&lt;
23    <b>typename</b> Key,
24    <b>typename</b> Mapped,
25    <b>typename</b> Cmp_Fn = std::less&lt;Key&gt;,
26    <b>typename</b> Tag =  <a href="pat_trie_tag.html">pat_trie_tag</a>,
27    <b>template</b>&lt;
28        <b>typename</b> Const_Node_Iterator,
29        <b>typename</b> Node_Iterator,
30        <b>typename</b> E_Access_Traits_,
31        <b>typename</b> Allocator_&gt;
32    <b>class</b> Node_Update = <a href=
33"null_trie_node_update.html">null_trie_node_update</a>,
34    <b>typename</b> Allocator = std::allocator&lt;<b>char</b>&gt; &gt;
35<b>class</b> <a href=
36"trie.html">trie</a>;
37</pre>
38
39    <p>The parameters have the following meaning:</p>
40
41    <ol>
42      <li><tt>Key</tt> is the key type.</li>
43
44      <li><tt>Mapped</tt> is the mapped-policy, and is explained in
45      <a href="tutorial.html#assoc_ms">Tutorial::Associative
46      Containers::Associative Containers Others than Maps</a>.</li>
47
48      <li><tt>E_Access_Traits</tt> is described in <a href=
49      "#e_access_traits">Element-Access Traits</a>.</li>
50
51      <li><tt>Tag</tt> specifies which underlying data structure
52      to use, and is described shortly.</li>
53
54      <li><tt>Node_Update</tt> is a policy for updating node
55      invariants. This is described in <a href="#invariants">Node
56      Invariants</a>.</li>
57
58      <li><tt>Allocator</tt> is an allocator
59      type.</li>
60    </ol>
61
62    <p>The <tt>Tag</tt> parameter specifies which underlying
63    data structure to use. Instantiating it by <a href=
64    "pat_trie_tag.html">pat_trie_tag</a>, specifies an
65    underlying PATRICIA trie (explained shortly); any other tag is
66    currently illegal.</p>
67    <hr />
68
69    <p>Following is a description of a (PATRICIA) trie
70    (<tt>pb_ds</tt> follows specifically [<a href=
71    "references.html#okasaki98mereable">okasaki98mereable</a>] and
72    [<a href=
73    "references.html#filliatre2000ptset">filliatre2000ptset</a>]).</p>
74
75    <p>A (PATRICIA) trie is similar to a tree, but with the
76    following differences:</p>
77
78    <ol>
79      <li>It explicitly views keys as a sequence of elements.
80      <i>E.g.</i>, a trie can view a string as a sequence of
81      characters; a trie can view a number as a sequence of
82      bits.</li>
83
84      <li>It is not (necessarily) binary. Each node has fan-out <i>n
85      + 1</i>, where <i>n</i> is the number of distinct
86      elements.</li>
87
88      <li>It stores values only at leaf nodes.</li>
89
90      <li>Internal nodes have the properties that A) each has at
91      least two children, and B) each shares the same prefix with
92      any of its descendant.</li>
93    </ol>
94
95    <p><a href="#e_access_traits">Element-Access Traits</a> shows
96    an example of such a trie.</p>
97
98    <p>A (PATRICIA) trie has some useful properties:</p>
99
100    <ol>
101      <li>It can be configured to use large node fan-out, giving it
102      very efficient find performance (albeit at insertion
103      complexity and size).</li>
104
105      <li>It works well for common-prefix keys.</li>
106
107      <li>It can support efficiently queries such as which keys
108      match a certain prefix. This is sometimes useful in
109      file systems and routers.</li>
110    </ol>
111
112    <p>(We would like to thank Matt Austern for the suggestion to
113    include tries.)</p>
114
115    <h2><a name="e_access_traits" id=
116    "e_access_traits">Element-Access Traits</a></h2>
117
118    <p>A trie inherently views its keys as sequences of elements.
119    For example, a trie can view a string as a sequence of
120    characters. A trie needs to map each of <i>n</i> elements to a
121    number in <i>{0, n - 1}</i>. For example, a trie can map a
122    character <tt>c</tt> to
123    <tt>static_cast&lt;size_t&gt;(c)</tt>.</p>
124
125    <p>Seemingly, then, a trie can assume that its keys support
126    (const) iterators, and that the <tt>value_type</tt> of this
127    iterator can be cast to a <tt>size_t</tt>. There are several
128    reasons, though, to decouple the mechanism by which the trie
129    accesses its keys' elements from the trie:</p>
130
131    <ol>
132      <li>In some cases, the numerical value of an element is
133      inappropriate. Consider a trie storing DNA strings. It is
134      logical to use a trie with a fan-out of <i>5 = 1 + |{'A', 'C',
135      'G', 'T'}|</i>. This requires mapping 'T' to 3, though.</li>
136
137      <li>In some cases the keys' iterators are different than what
138      is needed. For example, a trie can be used to search for
139      common <u>suffixes</u>, by using strings'
140      <tt>reverse_iterator</tt>. As another example, a trie mapping
141      UNICODE strings would have a huge fan-out if each node would
142      branch on a UNICODE character; instead, one can define an
143      iterator iterating over 8-bit (or less) groups.</li>
144    </ol>
145
146    <p><a href=
147    "trie.html">trie</a> is,
148    consequently, parametrized by <tt>E_Access_Traits</tt> -
149    traits which instruct how to access sequences' elements.
150    <a href=
151    "string_trie_e_access_traits.html"><tt>string_trie_e_access_traits</tt></a>
152    is a traits class for strings. Each such traits define some
153    types, <i>e.g.</i>,</p>
154    <pre>
155<b>typename</b> E_Access_Traits::const_iterator
156</pre>
157
158    <p>is a const iterator iterating over a key's elements. The
159    traits class must also define methods for obtaining an iterator
160    to the first and last element of a key.</p>
161
162    <p>Figure <a href="#pat_trie">A PATRICIA trie</a> shows a
163    (PATRICIA) trie resulting from inserting the words: "I wish
164    that I could ever see a poem lovely as a trie" (which,
165    unfortunately, does not rhyme).</p>
166
167    <p>The leaf nodes contain values; each internal node contains
168    two <tt><b>typename</b> E_Access_Traits::const_iterator</tt>
169    objects, indicating the maximal common prefix of all keys in
170    the sub-tree. For example, the shaded internal node roots a
171    sub-tree with leafs "a" and "as". The maximal common prefix is
172    "a". The internal node contains, consequently, to const
173    iterators, one pointing to <tt>'a'</tt>, and the other to
174    <tt>'s'</tt>.</p>
175
176    <h6 class="c1"><a name="pat_trie" id="pat_trie"><img src=
177    "pat_trie.png" alt="no image" /></a></h6>
178
179    <h6 class="c1">A PATRICIA trie.</h6>
180
181    <h2><a name="invariants" id="invariants">Node
182    Invariants</a></h2>
183
184    <p>Trie-based containers support node invariants, as do
185    tree-based containers (see <a href=
186    "tree_based_containers.html#invariants">Tree-Based
187    Containers::Node Invariants</a>). There are two minor
188    differences, though, which, unfortunately, thwart sharing them
189    sharing the same node-updating policies:</p>
190
191    <ol>
192      <li>A trie's <tt>Node_Update</tt> template-template
193      parameter is parametrized by <tt>E_Access_Traits</tt>, while
194      a tree's <tt>Node_Update</tt> template-template parameter is
195      parametrized by <tt>Cmp_Fn</tt>.</li>
196
197      <li>Tree-based containers store values in all nodes, while
198      trie-based containers (at least in this implementation) store
199      values in leafs.</li>
200    </ol>
201
202    <p>Figure <a href="#trie_node_update_cd">A trie and its update
203    policy</a> shows the scheme, as well as some predefined
204    policies (which are explained below).</p>
205
206    <h6 class="c1"><a name="trie_node_update_cd" id=
207    "trie_node_update_cd"><img src=
208    "trie_node_update_policy_cd.png" alt="no image" /></a></h6>
209
210    <h6 class="c1">A trie and its update policy.</h6>
211
212    <p><tt>pb_ds</tt> offers the following pre-defined trie node
213    updating policies:</p>
214
215    <ol>
216      <li><a href=
217      "trie_order_statistics_node_update.html"><tt>trie_order_statistics_node_update</tt></a>
218      supports order statistics.</li>
219
220      <li><a href=
221      "trie_prefix_search_node_update.html"><tt>trie_prefix_search_node_update</tt></a>
222      supports searching for ranges that match a given prefix. See
223      <a href=
224      "http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/trie_prefix_search.cc"><tt>trie_prefix_search.cc</tt></a>.</li>
225
226      <li><a href=
227      "null_trie_node_update.html"><tt>null_trie_node_update</tt></a>
228      is the null node updater.</li>
229    </ol>
230
231    <h2><a name="add_methods" id="add_methods">Additional
232    Methods</a></h2>
233
234    <p>Trie-based containers support split and join methods; the
235    rationale is equal to that of tree-based containers supporting
236    these methods (see <a href=
237    "tree_based_containers.html#add_methods">Tree-Based
238    Containers::Additional Methods</a>).</p>
239  </div>
240</body>
241</html>
242