1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 3 4<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 5<head> 6<meta name="generator" content="HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" /> 7<title>Hash Skewed Distribution Memory Use Test</title> 8<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" /> 9</head> 10<body> 11<div id="page"> 12<h1>Hash-Based Skewed-Distribution Random-Integer <tt>find</tt> 13 Find Timing Test</h1> 14<h2><a name="description" id="description">Description</a></h2> 15<p>This test inserts a number of values with a markedly 16 non-uniform i.i.d. integer keys into a container, then performs 17 a series of finds using <tt>find</tt> . It measures the average 18 time for <tt>find</tt> as a function of the number of values in 19 the containers. The keys are generated as follows. First, a 20 uniform integer is created; it is then shifted left 8 bits.</p> 21<p>(The test was executed with <a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc"><tt>hash_zlob_random_int_find_timing_test</tt></a> 22 200 200 2100)</p> 23<h2><a name="purpose" id="purpose">Purpose</a></h2> 24<p>The test checks the effect of different range-hashing 25 functions and trigger policies (see <a href="hash_based_containers.html#hash_policies">Design::Associative 26 Containers::Hash-Based Containers::Hash Policies</a> and 27 <a href="hash_based_containers.html#resize_policies">Design::Associative 28 Containers::Hash-Based Containers::Resize Policies</a>).</p> 29<h2><a name="results" id="results">Results</a></h2> 30<p>Figures <a href="#NHG">NHG</a>, <a href="#NHM">NHM</a>, and 31 <a href="#NHL">NHL</a> show the results for various hash-based 32 associative-containers in <a href="assoc_performance_tests.html#gcc"><u>g++</u></a>, <a href="assoc_performance_tests.html#msvc"><u>MSVC++</u></a>, and 33 <a href="assoc_performance_tests.html#local"><u>local</u></a>, 34 respectively.</p> 35<div id="NHG_res_div"> 36<div id="NHG_gcc"> 37<div id="NHG_hash_zlob_random_int_find_timing_test"> 38<div id="NHG_assoc"> 39<div id="NHG_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHG" id="NHG"><img src="hash_zlob_random_int_find_timing_test_gcc.png" alt="no image" /></a></h6>NHG: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#gcc">g++</a><p>In the above figure, the names in the legends have the following meaning:</p> 40<ol> 41<li> 42cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map- 43<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> 44with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a> 45, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 46 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a> 47, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 48 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> 49<li> 50gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map- 51<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a> 52 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> 53, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 54 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> 55, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 56 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> 57</li> 58<li> 59n_hash_map_ncah- 60<tt>std::tr1::unordered_map</tt> with <tt>cache_hash_code</tt> = <tt><b>false</b></tt></li> 61<li> 62cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map- 63<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> 64with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> 65, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 66 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> 67, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 68 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> 69</ol> 70</div><div style="width: 100%; height: 20px"></div></div> 71</div> 72</div> 73</div> 74</div> 75<div id="NHM_res_div"> 76<div id="NHM_msvc"> 77<div id="NHM_hash_zlob_random_int_find_timing_test"> 78<div id="NHM_assoc"> 79<div id="NHM_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHM" id="NHM"><img src="hash_zlob_random_int_find_timing_test_msvc.png" alt="no image" /></a></h6>NHM: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#msvc">msvc++</a><p>In the above figure, the names in the legends have the following meaning:</p> 80<ol> 81<li> 82n_hash_map_ncah- 83<tt>stdext::hash_map</tt></li> 84<li> 85cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map- 86<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> 87with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a> 88, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 89 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a> 90, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 91 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> 92<li> 93gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map- 94<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a> 95 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> 96, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 97 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> 98, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 99 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> 100</li> 101<li> 102cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map- 103<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> 104with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> 105, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> 106 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> 107, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> 108 with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> 109</ol> 110</div><div style="width: 100%; height: 20px"></div></div> 111</div> 112</div> 113</div> 114</div> 115<div id="NHL_res_div"> 116<div id="NHL_local"> 117<div id="NHL_hash_zlob_random_int_find_timing_test"> 118<div id="NHL_assoc"> 119<div id="NHL_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style = "border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHL" id= "NHL"><img src="hash_zlob_random_int_find_timing_test_local.png" alt="no image" /></a></h6>NHL: Skewed-distribution random int find timing test using <tt>find</tt> - <a href = "assoc_performance_tests.html#local">local</a></div><div style = "width: 100%; height: 20px"></div></div> 120</div> 121</div> 122</div> 123</div> 124<h2><a name="observations" id="observations">Observations</a></h2> 125<p>In this setting, the keys' distribution is so skewed that 126 the unerlying hash-table type affects performance marginally. 127 (This is in contrast with <a href="hash_text_find_find_timing_test.html">Hash-Based Text 128 <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_find_find_timing_test.html">Hash-Based 129 Random-Integer <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_subscript_find_timing_test.html">Hash-Based 130 Random-Integer Subscript Find Timing Test</a> and <a href="hash_random_int_subscript_insert_timing_test.html">Hash-Based 131 Random-Integer Subscript Insert Timing Test</a> .)</p> 132<p>The range-hashing scheme affects performance dramatically. A 133 mask-based range-hashing scheme effectively maps all values 134 into the same bucket. Access degenerates into a search within 135 an unordered linked-list. In Figures <a href="#NHG">NHG</a> and 136 <a href="#NHM">NHM</a> , it should be noted that 137 <tt>std::tr1::unordered_map</tt> and <tt>stdext::hash_map</tt> 138 are hard-wired currently to mod-based and mask-based schemes, 139 respectively.</p> 140<p>When observing the settings of this test, it is apparent 141 that the keys' distribution is far from natural. One might ask 142 if the test is not contrived to show that, in some cases, 143 mod-based range hashing does better than mask-based range 144 hashing. This is, in fact just the case. We did not encounter a 145 more natural case in which mod-based range hashing is better. 146 In our opnion, real-life key distributions are handled better 147 with an appropriate hash function and a mask-based 148 range-hashing function. (<a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/hash_shift_mask.cc"><tt>shift_mask.cc</tt></a> 149 shows an example of handling this a-priori known skewed 150 distribution with a mask-based range-hashing function). If hash 151 performance is bad, a <i>Χ<sup>2</sup></i> test can be used 152 to check how to transform it into a more uniform 153 distribution.</p> 154<p>For this reason, <tt>pb_ds</tt>'s default range-hashing 155 function is mask-based.</p> 156<p><a href="assoc_performance_tests.html#hash_based_types">Observations::Hash-Based 157 Container Types</a> summarizes some observations on hash-based 158 containers; <a href="assoc_performance_tests.html#hash_based_policies">Observations::Hash-Based 159 Containers' Policies</a> summarizes some observations on 160 hash-based containers' policies.</p> 161</div> 162</body> 163</html> 164