1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
2    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3
4<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5<head>
6<meta name="generator" content="HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
7<title>Hash Skewed Distribution Memory Use Test</title>
8<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
9</head>
10<body>
11<div id="page">
12<h1>Hash-Based Skewed-Distribution Random-Integer <tt>find</tt>
13    Find Timing Test</h1>
14<h2><a name="description" id="description">Description</a></h2>
15<p>This test inserts a number of values with a markedly
16    non-uniform i.i.d. integer keys into a container, then performs
17    a series of finds using <tt>find</tt> . It measures the average
18    time for <tt>find</tt> as a function of the number of values in
19    the containers. The keys are generated as follows. First, a
20    uniform integer is created; it is then shifted left 8 bits.</p>
21<p>(The test was executed with <a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc"><tt>hash_zlob_random_int_find_timing_test</tt></a>
22    200 200 2100)</p>
23<h2><a name="purpose" id="purpose">Purpose</a></h2>
24<p>The test checks the effect of different range-hashing
25    functions and trigger policies (see <a href="hash_based_containers.html#hash_policies">Design::Associative
26    Containers::Hash-Based Containers::Hash Policies</a> and
27    <a href="hash_based_containers.html#resize_policies">Design::Associative
28    Containers::Hash-Based Containers::Resize Policies</a>).</p>
29<h2><a name="results" id="results">Results</a></h2>
30<p>Figures <a href="#NHG">NHG</a>, <a href="#NHM">NHM</a>, and
31    <a href="#NHL">NHL</a> show the results for various hash-based
32    associative-containers in <a href="assoc_performance_tests.html#gcc"><u>g++</u></a>, <a href="assoc_performance_tests.html#msvc"><u>MSVC++</u></a>, and
33    <a href="assoc_performance_tests.html#local"><u>local</u></a>,
34    respectively.</p>
35<div id="NHG_res_div">
36<div id="NHG_gcc">
37<div id="NHG_hash_zlob_random_int_find_timing_test">
38<div id="NHG_assoc">
39<div id="NHG_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHG" id="NHG"><img src="hash_zlob_random_int_find_timing_test_gcc.png" alt="no image" /></a></h6>NHG: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#gcc">g++</a><p>In the above figure, the names in the legends have the following meaning:</p>
40<ol>
41<li>
42cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
43<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
44with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
45, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
46 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
47, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
48 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
49<li>
50gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
51<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
52 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
53, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
54 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
55, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
56 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
57</li>
58<li>
59n_hash_map_ncah-
60<tt>std::tr1::unordered_map</tt> with <tt>cache_hash_code</tt> = <tt><b>false</b></tt></li>
61<li>
62cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
63<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
64with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
65, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
66 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
67, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
68 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
69</ol>
70</div><div style="width: 100%; height: 20px"></div></div>
71</div>
72</div>
73</div>
74</div>
75<div id="NHM_res_div">
76<div id="NHM_msvc">
77<div id="NHM_hash_zlob_random_int_find_timing_test">
78<div id="NHM_assoc">
79<div id="NHM_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHM" id="NHM"><img src="hash_zlob_random_int_find_timing_test_msvc.png" alt="no image" /></a></h6>NHM: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#msvc">msvc++</a><p>In the above figure, the names in the legends have the following meaning:</p>
80<ol>
81<li>
82n_hash_map_ncah-
83<tt>stdext::hash_map</tt></li>
84<li>
85cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
86<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
87with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
88, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
89 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
90, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
91 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
92<li>
93gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
94<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
95 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
96, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
97 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
98, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
99 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
100</li>
101<li>
102cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
103<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
104with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
105, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
106 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
107, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
108 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
109</ol>
110</div><div style="width: 100%; height: 20px"></div></div>
111</div>
112</div>
113</div>
114</div>
115<div id="NHL_res_div">
116<div id="NHL_local">
117<div id="NHL_hash_zlob_random_int_find_timing_test">
118<div id="NHL_assoc">
119<div id="NHL_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style = "border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHL" id= "NHL"><img src="hash_zlob_random_int_find_timing_test_local.png" alt="no image" /></a></h6>NHL: Skewed-distribution random int find timing test using <tt>find</tt> - <a href = "assoc_performance_tests.html#local">local</a></div><div style = "width: 100%; height: 20px"></div></div>
120</div>
121</div>
122</div>
123</div>
124<h2><a name="observations" id="observations">Observations</a></h2>
125<p>In this setting, the keys' distribution is so skewed that
126    the unerlying hash-table type affects performance marginally.
127    (This is in contrast with <a href="hash_text_find_find_timing_test.html">Hash-Based Text
128    <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_find_find_timing_test.html">Hash-Based
129    Random-Integer <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_subscript_find_timing_test.html">Hash-Based
130    Random-Integer Subscript Find Timing Test</a> and <a href="hash_random_int_subscript_insert_timing_test.html">Hash-Based
131    Random-Integer Subscript Insert Timing Test</a> .)</p>
132<p>The range-hashing scheme affects performance dramatically. A
133    mask-based range-hashing scheme effectively maps all values
134    into the same bucket. Access degenerates into a search within
135    an unordered linked-list. In Figures <a href="#NHG">NHG</a> and
136    <a href="#NHM">NHM</a> , it should be noted that
137    <tt>std::tr1::unordered_map</tt> and <tt>stdext::hash_map</tt>
138    are hard-wired currently to mod-based and mask-based schemes,
139    respectively.</p>
140<p>When observing the settings of this test, it is apparent
141    that the keys' distribution is far from natural. One might ask
142    if the test is not contrived to show that, in some cases,
143    mod-based range hashing does better than mask-based range
144    hashing. This is, in fact just the case. We did not encounter a
145    more natural case in which mod-based range hashing is better.
146    In our opnion, real-life key distributions are handled better
147    with an appropriate hash function and a mask-based
148    range-hashing function. (<a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/hash_shift_mask.cc"><tt>shift_mask.cc</tt></a>
149    shows an example of handling this a-priori known skewed
150    distribution with a mask-based range-hashing function). If hash
151    performance is bad, a <i>&Chi;<sup>2</sup></i> test can be used
152    to check how to transform it into a more uniform
153    distribution.</p>
154<p>For this reason, <tt>pb_ds</tt>'s default range-hashing
155    function is mask-based.</p>
156<p><a href="assoc_performance_tests.html#hash_based_types">Observations::Hash-Based
157    Container Types</a> summarizes some observations on hash-based
158    containers; <a href="assoc_performance_tests.html#hash_based_policies">Observations::Hash-Based
159    Containers' Policies</a> summarizes some observations on
160    hash-based containers' policies.</p>
161</div>
162</body>
163</html>
164