<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Disk space requirements</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
    <link rel="up" href="am_misc.html" title="Chapter&#160;4.&#160; Access Method Wrapup" />
    <link rel="prev" href="am_misc_dbsizes.html" title="Database limits" />
    <link rel="next" href="am_misc_db_sql.html" title="Specifying a Berkeley DB schema using SQL DDL" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Disk space requirements</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="am_misc_dbsizes.html">Prev</a>&#160;</td>
          <th width="60%" align="center">Chapter&#160;4.&#160;
		Access Method Wrapup
        </th>
          <td width="20%" align="right">&#160;<a accesskey="n" href="am_misc_db_sql.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="am_misc_diskspace"></a>Disk space requirements</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="am_misc_diskspace.html#id1597294">Btree</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="am_misc_diskspace.html#id1597385">Hash</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>It is possible to estimate the total database size based on the size of
the data.  The following calculations estimate how many bytes you will
need to hold a set of data, and then how many pages it will take to
actually store it on disk.</p>
      <p>Space freed by deleting key/data pairs from a Btree or Hash database is
never returned to the filesystem, although it is reused where possible.
This means that Btree and Hash databases are grow-only.  If enough
keys are deleted from a database that shrinking the underlying file is
desirable, you should create a new database and copy the records from
the old one into it.</p>
      <p>These are rough estimates at best. For example, they do not take into
account overflow records, filesystem metadata information, large sets
of duplicate data items (where the key is only stored once), or
real-life situations where the sizes of key and data items are wildly
variable and the page-fill factor changes over time.</p>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="id1597294"></a>Btree</h3>
            </div>
          </div>
        </div>
        <p>The formulas for the Btree access method are as follows:</p>
        <pre class="programlisting">useful-bytes-per-page = (page-size - page-overhead) * page-fill-factor

bytes-of-data = n-records *
    (bytes-per-entry + page-overhead-for-two-entries)

n-pages-of-data = bytes-of-data / useful-bytes-per-page

total-bytes-on-disk = n-pages-of-data * page-size
</pre>
        <p>The <span class="bold"><strong>useful-bytes-per-page</strong></span> is a measure of the bytes on each page
that will actually hold the application data.  It is computed as the total
number of bytes on the page that are available to hold application data,
corrected by the percentage of the page that is likely to contain data.
The reason for this correction is that the percentage of a page that
contains application data can vary from close to 50% after a page split
to almost 100% if the entries in the database were inserted in sorted
order.  The <span class="bold"><strong>page-fill-factor</strong></span> can therefore drastically alter
the amount of disk space required to hold any particular data set.  The
page-fill factor of any existing database can be displayed using the
<a href="../api_reference/C/db_stat.html" class="olink">db_stat utility</a>.</p>
        <p>The page-overhead for Btree databases is 26 bytes.  As an example, using
an 8K page size, with an 85% page-fill factor, there are 6941 bytes of
useful space on each page (6941.1, rounded down):</p>
        <pre class="programlisting">6941 = (8192 - 26) * .85</pre>
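        <p>To see how strongly the fill factor affects this figure, the formula can
be evaluated for several values.  The following is a minimal Python sketch, not
part of Berkeley DB; the function name useful_bytes_per_page and the choice to
truncate the result to whole bytes are assumptions made for illustration.</p>

```python
PAGE_SIZE = 8192           # 8K page, as in the example above
BTREE_PAGE_OVERHEAD = 26   # bytes of Btree page overhead, as stated above


def useful_bytes_per_page(page_size, page_overhead, fill_factor):
    """Bytes per page expected to hold application data, truncated to whole bytes."""
    return int((page_size - page_overhead) * fill_factor)


# Just after a page split, the 85% example, and sorted-order inserts.
for fill_factor in (0.50, 0.85, 1.00):
    print(fill_factor, useful_bytes_per_page(PAGE_SIZE, BTREE_PAGE_OVERHEAD, fill_factor))
```

        <p>Because the page count is inversely proportional to this value, a
database whose pages are about half full needs roughly twice as many pages as
one whose pages are completely full.</p>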
        <p>The total <span class="bold"><strong>bytes-of-data</strong></span> is an easy calculation: it is the
total size of the key and data items plus the overhead required to store
each item on a page.  The overhead to store a key or data item on a Btree
page is 5 bytes.  So, it would take 1560000000 bytes, or roughly 1.45GB
(taking 1GB as 2^30 bytes), of total data to store 60,000,000 key/data
pairs, assuming each key or data item was 8 bytes long:</p>
        <pre class="programlisting">1560000000 = 60000000 * ((8 + 5) * 2)</pre>
        <p>The total pages of data, <span class="bold"><strong>n-pages-of-data</strong></span>, is the
<span class="bold"><strong>bytes-of-data</strong></span> divided by the <span class="bold"><strong>useful-bytes-per-page</strong></span>.  In
the example, there are 224751 pages of data.</p>
        <pre class="programlisting">224751 = 1560000000 / 6941</pre>
        <p>The total bytes of disk space for the database is <span class="bold"><strong>n-pages-of-data</strong></span>
multiplied by the <span class="bold"><strong>page-size</strong></span>.  In the example, the result is
1841160192 bytes, or roughly 1.71GB.</p>
        <pre class="programlisting">1841160192 = 224751 * 8192</pre>
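        <p>The four Btree steps can be chained into a single calculation.  The
following Python sketch is illustrative only: the function name
btree_disk_estimate and its parameter names are assumptions, and it truncates
the page count to whole pages because that is what the worked example does.  A
more conservative estimate would round the page count up.</p>

```python
def btree_disk_estimate(n_records, key_size, data_size,
                        page_size=8192, fill_factor=0.85,
                        page_overhead=26, item_overhead=5):
    """Return (useful_bytes_per_page, bytes_of_data, n_pages, total_bytes)."""
    # Bytes per page expected to hold application data.
    useful = int((page_size - page_overhead) * fill_factor)
    # Each record stores two items (key and data), each with its own overhead.
    bytes_of_data = n_records * ((key_size + item_overhead) +
                                 (data_size + item_overhead))
    # Truncating division, matching the worked example above.
    n_pages = bytes_of_data // useful
    return useful, bytes_of_data, n_pages, n_pages * page_size


useful, data_bytes, pages, total = btree_disk_estimate(60_000_000, 8, 8)
print(useful, data_bytes, pages, total)
print(round(total / 2**30, 2), "GB")
```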
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="id1597385"></a>Hash</h3>
            </div>
          </div>
        </div>
        <p>The formulas for the Hash access method are as follows:</p>
        <pre class="programlisting">useful-bytes-per-page = (page-size - page-overhead)

bytes-of-data = n-records *
    (bytes-per-entry + page-overhead-for-two-entries)

n-pages-of-data = bytes-of-data / useful-bytes-per-page

total-bytes-on-disk = n-pages-of-data * page-size
</pre>
        <p>The <span class="bold"><strong>useful-bytes-per-page</strong></span> is a measure of the bytes on each page
that will actually hold the application data.  It is computed as the total
number of bytes on the page that are available to hold application data.
If the application has explicitly set a page-fill factor, pages will
not necessarily be kept full.  For databases with a preset fill factor,
see the calculation below.  The page-overhead for Hash databases is 26
bytes and the page-overhead-for-two-entries is 6 bytes.</p>
        <p>As an example, using an 8K page size, there are 8166 bytes of useful space
on each page:</p>
        <pre class="programlisting">8166 = (8192 - 26)</pre>
        <p>The total <span class="bold"><strong>bytes-of-data</strong></span> is an easy calculation: it is the total
size of the key/data pairs plus the overhead required to store each pair on
a page.  In this case that's 6 bytes per pair.  So, assuming 60,000,000
key/data pairs, in which each key and each data item is 8 bytes long, there
are 1320000000 bytes, or roughly 1.23GB of total data:</p>
        <pre class="programlisting">1320000000 = 60000000 * (16 + 6)</pre>
        <p>The total pages of data, <span class="bold"><strong>n-pages-of-data</strong></span>, is the
<span class="bold"><strong>bytes-of-data</strong></span> divided by the <span class="bold"><strong>useful-bytes-per-page</strong></span>.  In
this example, there are 161646 pages of data.</p>
        <pre class="programlisting">161646 = 1320000000 / 8166</pre>
        <p>The total bytes of disk space for the database is <span class="bold"><strong>n-pages-of-data</strong></span>
multiplied by the <span class="bold"><strong>page-size</strong></span>.  In the example, the result is
1324204032 bytes, or roughly 1.23GB.</p>
        <pre class="programlisting">1324204032 = 161646 * 8192</pre>
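        <p>The same calculation for a Hash database with no explicit fill factor can
be sketched in Python.  The function name hash_disk_estimate and its parameter
names are assumptions made for illustration.  The page count is rounded up to a
whole page here, which reproduces the 161646 figure above.</p>

```python
def hash_disk_estimate(n_pairs, key_size, data_size,
                       page_size=8192, page_overhead=26, pair_overhead=6):
    """Return (useful_bytes_per_page, bytes_of_data, n_pages, total_bytes)."""
    # No fill-factor correction: pages are assumed to be kept full.
    useful = page_size - page_overhead
    # Overhead is charged once per key/data pair.
    bytes_of_data = n_pairs * (key_size + data_size + pair_overhead)
    # Ceiling division using integers only, so a partial page counts as a page.
    n_pages = -(-bytes_of_data // useful)
    return useful, bytes_of_data, n_pages, n_pages * page_size


print(hash_disk_estimate(60_000_000, 8, 8))
```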
        <p>Now, let's assume that the application specified a fill factor explicitly.
The fill factor indicates the target number of items to place on a single
page (a fill factor might reduce the utilization of each page, but it can
be useful in avoiding splits and preventing buckets from becoming too
large).  Using our estimates above, each item is 22 bytes (16 + 6), and
there are 8166 useful bytes on a page (8192 - 26).  That means that, on
average, you can fit 371 pairs per page.</p>
        <pre class="programlisting">371 = 8166 / 22</pre>
        <p>However, let's assume that the application designer knows that although
most items are 8 bytes, they can sometimes be as large as 10 bytes, so that
a pair can occupy up to 26 bytes (10 + 10 + 6), and that it's very
important to avoid overflowing buckets and splitting.  Then, the
application might specify a fill factor of 314.</p>
        <pre class="programlisting">314 = 8166 / 26</pre>
        <p>With a fill factor of 314, the formula for computing database size
is</p>
        <pre class="programlisting">n-pages-of-data = npairs / pairs-per-page</pre>
        <p>or 191082 pages.</p>
        <pre class="programlisting">191082 = 60000000 / 314</pre>
        <p>At 191082 pages, the total database size would be 1565343744 bytes, or
roughly 1.46GB.</p>
        <pre class="programlisting">1565343744 = 191082 * 8192</pre>
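        <p>The fill-factor reasoning can be expressed as two small steps: derive the
number of pairs per page from the largest pair you want to plan for, then divide
the number of pairs by it.  This Python sketch uses hypothetical helper names
(pairs_per_page, pages_for_fill_factor) and truncating division, which matches
the figures above.</p>

```python
PAGE_SIZE = 8192
HASH_PAGE_OVERHEAD = 26   # bytes of Hash page overhead, as stated above
PAIR_OVERHEAD = 6         # bytes of overhead per key/data pair, as stated above


def pairs_per_page(key_size, data_size):
    """Whole key/data pairs of the given sizes that fit on one page."""
    useful = PAGE_SIZE - HASH_PAGE_OVERHEAD
    return useful // (key_size + data_size + PAIR_OVERHEAD)


def pages_for_fill_factor(n_pairs, fill_factor):
    """Pages needed when each page targets fill_factor pairs (truncated)."""
    return n_pairs // fill_factor


typical = pairs_per_page(8, 8)      # sized for the common 8-byte items
padded = pairs_per_page(10, 10)     # sized for the occasional 10-byte items
pages = pages_for_fill_factor(60_000_000, padded)
print(typical, padded, pages, pages * PAGE_SIZE)
```

        <p>Planning for the larger items costs about 18% more pages than the
161646 needed when pages are kept full.</p>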
        <p>There are a few additional caveats with respect to Hash databases.  First,
this discussion assumes that the hash function does a good job of evenly
distributing keys among hash buckets.  If the function does not do this,
you may find your table growing significantly larger than you expected.
Second, in order to provide support for Hash databases coexisting with
other databases in a single file, pages within a Hash database are
allocated in power-of-two chunks.  That means that a Hash database with 65
buckets will take up as much space as a Hash database with 128 buckets;
each time the Hash database grows beyond its current power-of-two number
of buckets, it allocates space for the next power-of-two buckets.  This
space may be sparsely allocated in the file system, but the files will
appear to be their full size.  Finally, because of this need for
contiguous allocation, overflow pages and duplicate pages can be allocated
only at specific points in the file, and this too can lead to sparse hash
tables.</p>
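        <p>The effect of power-of-two allocation on the bucket count can be
illustrated with a short Python sketch.  This is a simplification for intuition
only: the function name allocated_buckets is hypothetical, and the sketch models
nothing beyond rounding the bucket count up to the next power of two.</p>

```python
def allocated_buckets(n_buckets):
    """Smallest power of two that is at least n_buckets (n_buckets must be positive)."""
    # (n - 1).bit_length() is the exponent of the next power of two at or above n.
    return 2 ** (n_buckets - 1).bit_length()


for n in (64, 65, 128, 129):
    print(n, allocated_buckets(n))
```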
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="am_misc_dbsizes.html">Prev</a>&#160;</td>
          <td width="20%" align="center">
            <a accesskey="u" href="am_misc.html">Up</a>
          </td>
          <td width="40%" align="right">&#160;<a accesskey="n" href="am_misc_db_sql.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Database limits&#160;</td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top">&#160;Specifying a Berkeley DB schema using SQL DDL</td>
        </tr>
      </table>
    </div>
  </body>
</html>