1<?xml version="1.0" encoding="UTF-8" standalone="no"?> 2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 3<html xmlns="http://www.w3.org/1999/xhtml"> 4 <head> 5 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> 6 <title>Selecting an access method</title> 7 <link rel="stylesheet" href="gettingStarted.css" type="text/css" /> 8 <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" /> 9 <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" /> 10 <link rel="up" href="am_conf.html" title="Chapter��2.�� Access Method Configuration" /> 11 <link rel="prev" href="am_conf.html" title="Chapter��2.�� Access Method Configuration" /> 12 <link rel="next" href="am_conf_logrec.html" title="Logical record numbers" /> 13 </head> 14 <body> 15 <div class="navheader"> 16 <table width="100%" summary="Navigation header"> 17 <tr> 18 <th colspan="3" align="center">Selecting an access method</th> 19 </tr> 20 <tr> 21 <td width="20%" align="left"><a accesskey="p" href="am_conf.html">Prev</a>��</td> 22 <th width="60%" align="center">Chapter��2.�� 23 Access Method Configuration 24 </th> 25 <td width="20%" align="right">��<a accesskey="n" href="am_conf_logrec.html">Next</a></td> 26 </tr> 27 </table> 28 <hr /> 29 </div> 30 <div class="sect1" lang="en" xml:lang="en"> 31 <div class="titlepage"> 32 <div> 33 <div> 34 <h2 class="title" style="clear: both"><a id="am_conf_select"></a>Selecting an access method</h2> 35 </div> 36 </div> 37 </div> 38 <div class="toc"> 39 <dl> 40 <dt> 41 <span class="sect2"> 42 <a href="am_conf_select.html#id1589084">Hash or Btree?</a> 43 </span> 44 </dt> 45 <dt> 46 <span class="sect2"> 47 <a href="am_conf_select.html#id1589088">Queue or Recno?</a> 48 </span> 49 </dt> 50 </dl> 51 </div> 52 <p>The Berkeley DB access method implementation unavoidably interacts with each 53application's data set, locking requirements and data access patterns. 54For this reason, one access method may result in dramatically better 55performance for an application than another one. Applications whose data 56could be stored using more than one access method may want to benchmark 57their performance using the different candidates.</p> 58 <p>One of the strengths of Berkeley DB is that it provides multiple access methods 59with nearly identical interfaces to the different access methods. This 60means that it is simple to modify an application to use a different access 61method. Applications can easily benchmark the different Berkeley DB access 62methods against each other for their particular data set and access pattern.</p> 63 <p>Most applications choose between using the Btree or Hash access methods 64or between using the Queue and Recno access methods, because each of the 65two pairs offer similar functionality.</p> 66 <div class="sect2" lang="en" xml:lang="en"> 67 <div class="titlepage"> 68 <div> 69 <div> 70 <h3 class="title"><a id="id1589084"></a>Hash or Btree?</h3> 71 </div> 72 </div> 73 </div> 74 <p>The Hash and Btree access methods should be used when logical record 75numbers are not the primary key used for data access. (If logical record 76numbers are a secondary key used for data access, the Btree access method 77is a possible choice, as it supports simultaneous access by a key and a 78record number.)</p> 79 <p>Keys in Btrees are stored in sorted order and the relationship between 80them is defined by that sort order. For this reason, the Btree access 81method should be used when there is any locality of reference among keys. 82Locality of reference means that accessing one particular key in the 83Btree implies that the application is more likely to access keys near to 84the key being accessed, where "near" is defined by the sort order. For 85example, if keys are timestamps, and it is likely that a request for an 868AM timestamp will be followed by a request for a 9AM timestamp, the 87Btree access method is generally the right choice. Or, for example, if 88the keys are names, and the application will want to review all entries 89with the same last name, the Btree access method is again a good choice.</p> 90 <p>There is little difference in performance between the Hash and Btree 91access methods on small data sets, where all, or most of, the data set 92fits into the cache. However, when a data set is large enough that 93significant numbers of data pages no longer fit into the cache, then 94the Btree locality of reference described previously becomes important 95for performance reasons. For example, there is no locality of reference 96for the Hash access method, and so key "AAAAA" is as likely to be stored 97on the same database page with key "ZZZZZ" as with key "AAAAB". In the 98Btree access method, because items are sorted, key "AAAAA" is far more 99likely to be near key "AAAAB" than key "ZZZZZ". So, if the application 100exhibits locality of reference in its data requests, then the Btree page 101read into the cache to satisfy a request for key "AAAAA" is much more 102likely to be useful to satisfy subsequent requests from the application 103than the Hash page read into the cache to satisfy the same request. 104This means that for applications with locality of reference, the cache 105is generally much more effective for the Btree access method than the 106Hash access method, and the Btree access method will make many fewer 107I/O calls.</p> 108 <p>However, when a data set becomes even larger, the Hash access method can 109outperform the Btree access method. The reason for this is that Btrees 110contain more metadata pages than Hash databases. The data set can grow 111so large that metadata pages begin to dominate the cache for the Btree 112access method. If this happens, the Btree can be forced to do an I/O 113for each data request because the probability that any particular data 114page is already in the cache becomes quite small. Because the Hash access 115method has fewer metadata pages, its cache stays "hotter" longer in the 116presence of large data sets. In addition, once the data set is so large 117that both the Btree and Hash access methods are almost certainly doing 118an I/O for each random data request, the fact that Hash does not have to 119walk several internal pages as part of a key search becomes a performance 120advantage for the Hash access method as well.</p> 121 <p>Application data access patterns strongly affect all of these behaviors, 122for example, accessing the data by walking a cursor through the database 123will greatly mitigate the large data set behavior describe above because 124each I/O into the cache will satisfy a fairly large number of subsequent 125data requests.</p> 126 <p>In the absence of information on application data and data access 127patterns, for small data sets either the Btree or Hash access methods 128will suffice. For data sets larger than the cache, we normally recommend 129using the Btree access method. If you have truly large data, then the 130Hash access method may be a better choice. The <a href="../api_reference/C/db_stat.html" class="olink">db_stat utility</a> 131is a useful tool for monitoring how well your cache is performing.</p> 132 </div> 133 <div class="sect2" lang="en" xml:lang="en"> 134 <div class="titlepage"> 135 <div> 136 <div> 137 <h3 class="title"><a id="id1589088"></a>Queue or Recno?</h3> 138 </div> 139 </div> 140 </div> 141 <p>The Queue or Recno access methods should be used when logical record 142numbers are the primary key used for data access. The advantage of the 143Queue access method is that it performs record level locking and for this 144reason supports significantly higher levels of concurrency than the Recno 145access method. The advantage of the Recno access method is that it 146supports a number of additional features beyond those supported by the 147Queue access method, such as variable-length records and support for 148backing flat-text files.</p> 149 <p>Logical record numbers can be mutable or fixed: mutable, where logical 150record numbers can change as records are deleted or inserted, and fixed, 151where record numbers never change regardless of the database operation. 152It is possible to store and retrieve records based on logical record 153numbers in the Btree access method. However, those record numbers are 154always mutable, and as records are deleted or inserted, the logical record 155number for other records in the database will change. The Queue access 156method always runs in fixed mode, and logical record numbers never change 157regardless of the database operation. The Recno access method can be 158configured to run in either mutable or fixed mode.</p> 159 <p>In addition, the Recno access method provides support for databases whose 160permanent storage is a flat text file and the database is used as a fast, 161temporary storage area while the data is being read or modified.</p> 162 </div> 163 </div> 164 <div class="navfooter"> 165 <hr /> 166 <table width="100%" summary="Navigation footer"> 167 <tr> 168 <td width="40%" align="left"><a accesskey="p" href="am_conf.html">Prev</a>��</td> 169 <td width="20%" align="center"> 170 <a accesskey="u" href="am_conf.html">Up</a> 171 </td> 172 <td width="40%" align="right">��<a accesskey="n" href="am_conf_logrec.html">Next</a></td> 173 </tr> 174 <tr> 175 <td width="40%" align="left" valign="top">Chapter��2.�� 176 Access Method Configuration 177 ��</td> 178 <td width="20%" align="center"> 179 <a accesskey="h" href="index.html">Home</a> 180 </td> 181 <td width="40%" align="right" valign="top">��Logical record numbers</td> 182 </tr> 183 </table> 184 </div> 185 </body> 186</html> 187