1<?xml version="1.0" encoding="UTF-8" standalone="no"?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml">
4  <head>
5    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Chapter&#160;11.&#160;Database Configuration</title>
7    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
8    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
9    <link rel="start" href="index.html" title="Getting Started with Berkeley DB" />
    <link rel="up" href="baseapi.html" title="Part&#160;II.&#160;Programming with the Base API" />
11    <link rel="prev" href="javaindexusage.html" title="Secondary Database Example" />
12    <link rel="next" href="cachesize.html" title="Selecting the Cache Size" />
13  </head>
14  <body>
15    <div class="navheader">
16      <table width="100%" summary="Navigation header">
17        <tr>
          <th colspan="3" align="center">Chapter&#160;11.&#160;Database Configuration</th>
19        </tr>
20        <tr>
          <td width="20%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a>&#160;</td>
          <th width="60%" align="center">Part&#160;II.&#160;Programming with the Base API</th>
          <td width="20%" align="right">&#160;<a accesskey="n" href="cachesize.html">Next</a></td>
24        </tr>
25      </table>
26      <hr />
27    </div>
28    <div class="chapter" lang="en" xml:lang="en">
29      <div class="titlepage">
30        <div>
31          <div>
            <h2 class="title"><a id="dbconfig"></a>Chapter&#160;11.&#160;Database Configuration</h2>
33          </div>
34        </div>
35      </div>
36      <div class="toc">
37        <p>
38          <b>Table of Contents</b>
39        </p>
40        <dl>
41          <dt>
42            <span class="sect1">
43              <a href="dbconfig.html#pagesize">Setting the Page Size</a>
44            </span>
45          </dt>
46          <dd>
47            <dl>
48              <dt>
49                <span class="sect2">
50                  <a href="dbconfig.html#overflowpages">Overflow Pages</a>
51                </span>
52              </dt>
53              <dt>
54                <span class="sect2">
55                  <a href="dbconfig.html#Locking">Locking</a>
56                </span>
57              </dt>
58              <dt>
59                <span class="sect2">
60                  <a href="dbconfig.html#IOEfficiency">IO Efficiency</a>
61                </span>
62              </dt>
63              <dt>
64                <span class="sect2">
65                  <a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
66                </span>
67              </dt>
68            </dl>
69          </dd>
70          <dt>
71            <span class="sect1">
72              <a href="cachesize.html">Selecting the Cache Size</a>
73            </span>
74          </dt>
75          <dt>
76            <span class="sect1">
77              <a href="btree.html">BTree Configuration</a>
78            </span>
79          </dt>
80          <dd>
81            <dl>
82              <dt>
83                <span class="sect2">
84                  <a href="btree.html#duplicateRecords">Allowing Duplicate Records</a>
85                </span>
86              </dt>
87              <dt>
88                <span class="sect2">
89                  <a href="btree.html#comparators">Setting Comparison Functions</a>
90                </span>
91              </dt>
92            </dl>
93          </dd>
94        </dl>
95      </div>
96      <p>
     This chapter describes some of the database and cache configuration issues
     that you need to consider when building your DB database.
     In most cases, there is very little that you need to do to manage
     your databases. However, there are some configuration issues that you
     need to be aware of, and these depend largely on the access
     method that you choose for your database.
103  </p>
104      <p>
105    The examples and descriptions throughout this document have mostly focused
106    on the BTree access method. This is because the majority of DB
107    applications use BTree. For this reason, where configuration issues are
108    dependent on the type of access method in use, this chapter will focus on
109    BTree only. For configuration descriptions surrounding the other access
110    methods, see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
111  </p>
112      <div class="sect1" lang="en" xml:lang="en">
113        <div class="titlepage">
114          <div>
115            <div>
116              <h2 class="title" style="clear: both"><a id="pagesize"></a>Setting the Page Size</h2>
117            </div>
118          </div>
119        </div>
120        <div class="toc">
121          <dl>
122            <dt>
123              <span class="sect2">
124                <a href="dbconfig.html#overflowpages">Overflow Pages</a>
125              </span>
126            </dt>
127            <dt>
128              <span class="sect2">
129                <a href="dbconfig.html#Locking">Locking</a>
130              </span>
131            </dt>
132            <dt>
133              <span class="sect2">
134                <a href="dbconfig.html#IOEfficiency">IO Efficiency</a>
135              </span>
136            </dt>
137            <dt>
138              <span class="sect2">
139                <a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
140              </span>
141            </dt>
142          </dl>
143        </div>
144        <p>
145        Internally, DB stores database entries on pages. Page sizes are
146        important because they can affect your application's performance.
147    </p>
148        <p>
        DB pages can be between 512 bytes and 64K bytes in size. The size
        that you select must be a power of 2. You set your database's
        page size using
        <span><code class="methodname">DatabaseConfig.setPageSize()</code>.</span>
155    </p>
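        <p>
        The size limits above can be checked before a value is handed to
        <code class="methodname">DatabaseConfig.setPageSize()</code>.
        The following is a minimal sketch only; the class name
        <code class="classname">PageSizeCheck</code> and the method
        <code class="methodname">isValidPageSize()</code> are hypothetical helpers
        and are not part of the DB API.
    </p>

```java
/** Hypothetical helper that checks a candidate page size against the documented limits. */
final class PageSizeCheck {
    static final int MIN_PAGE_SIZE = 512;        // 512 bytes
    static final int MAX_PAGE_SIZE = 64 * 1024;  // 64K bytes

    /** Returns true if size is a power of 2 between 512 bytes and 64K bytes. */
    static boolean isValidPageSize(int size) {
        if (MIN_PAGE_SIZE > size) return false;
        if (size > MAX_PAGE_SIZE) return false;
        // A power of 2 has exactly one bit set.
        return Integer.bitCount(size) == 1;
    }

    public static void main(String[] args) {
        int pageSize = 8192;
        if (isValidPageSize(pageSize)) {
            // A real application would now pass the value to
            // DatabaseConfig.setPageSize() before creating the database.
            System.out.println(pageSize + " is a usable page size");
        }
    }
}
```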
156        <p>
157        Note that a database's page size can only be selected at database
158        creation time.
159    </p>
160        <p>
161        When selecting a page size, you should consider the following issues:
162    </p>
163        <div class="itemizedlist">
164          <ul type="disc">
165            <li>
166              <p>
167                Overflow pages.
168            </p>
169            </li>
170            <li>
171              <p>
                Locking.
173            </p>
174            </li>
175            <li>
176              <p>
177                Disk I/O.
178            </p>
179            </li>
180          </ul>
181        </div>
182        <p>
183        These topics are discussed next.
184    </p>
185        <div class="sect2" lang="en" xml:lang="en">
186          <div class="titlepage">
187            <div>
188              <div>
189                <h3 class="title"><a id="overflowpages"></a>Overflow Pages</h3>
190              </div>
191            </div>
192          </div>
193          <p>
194            Overflow pages are used to hold a key or data item
195            that cannot fit on a single page. You do not have to do anything to
196            cause overflow pages to be created, other than to store data that is
197            too large for your database's page size. Also, the only way you can
198            prevent overflow pages from being created is to be sure to select a
199            page size that is large enough to hold your database entries.
200        </p>
201          <p>
202            Because overflow pages exist outside of the normal database
203            structure, their use is expensive from a performance
            perspective. If you select a page size that is too small, then your
            database will be forced to use an excessive number of overflow
            pages. This will significantly harm your application's performance.
207        </p>
208          <p>
209            For this reason, you want to select a page size that is at
210            least large enough to hold multiple entries given the expected
211            average size of your database entries. In BTree's case, for best
212            results select a page size that can hold at least 4 such entries.
213        </p>
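          <p>
            As a rough illustration of this rule of thumb, the sketch below
            finds the smallest allowed page size that holds a given number of
            average-sized entries. It assumes that entries pack perfectly and
            it ignores all per-page and per-entry overhead, so treat the result
            as a lower bound. The class and method names are hypothetical.
        </p>

```java
/** Hypothetical helper: smallest page size that holds a target number of entries. */
final class OverflowAvoidance {
    static final int MIN_PAGE_SIZE = 512;
    static final int MAX_PAGE_SIZE = 64 * 1024;

    /**
     * Returns the smallest power-of-2 page size that can hold entriesPerPage
     * entries of averageEntryBytes each (overhead ignored). If even the largest
     * page size is too small, MAX_PAGE_SIZE is returned.
     */
    static int smallestPageSizeFor(int averageEntryBytes, int entriesPerPage) {
        long needed = (long) averageEntryBytes * entriesPerPage;
        for (int size = MIN_PAGE_SIZE; ; size *= 2) {
            if (size >= needed || size >= MAX_PAGE_SIZE) {
                return size;
            }
        }
    }

    public static void main(String[] args) {
        // Average entry of 1500 bytes, at least 4 entries per BTree page.
        System.out.println(smallestPageSizeFor(1500, 4));
    }
}
```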
214          <p>
            You can see how many overflow pages your database is using by
            <span>
                obtaining a <code class="classname">DatabaseStats</code> object using
                the <code class="methodname">Database.getStats()</code> method,
            </span>
            or by examining your database using the
            <code class="literal">db_stat</code> command line utility.
224        </p>
225        </div>
226        <div class="sect2" lang="en" xml:lang="en">
227          <div class="titlepage">
228            <div>
229              <div>
230                <h3 class="title"><a id="Locking"></a>Locking</h3>
231              </div>
232            </div>
233          </div>
234          <p>
            Locking and multi-threaded access to DB databases are built into
            the product. However, in order to enable the locking subsystem and
            to provide efficient sharing of the cache between
            databases, you must use an <span class="emphasis"><em>environment</em></span>.
            Environments and multi-threaded access are not fully described
            in this manual (see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em> for
            information). However, in the interest of providing a complete
            discussion of page sizing, we provide some information on sizing
            your pages in a multi-threaded or multi-process environment.
244        </p>
245          <p>
            If your application is multi-threaded, or if your databases are
            accessed by more than one process at a time, then page size can
            influence your application's performance. The reason is that
            for most access methods (Queue is the exception), DB implements
            page-level locking. This means that the finest locking granularity
            is at the page, not at the record.
252        </p>
253          <p>
254            In most cases, database pages contain multiple database
255            records. Further, in order to provide safe access to multiple
256            threads or processes, DB performs locking on pages as entries on
257            those pages are read or written.
258        </p>
259          <p>
            As the size of your page increases relative to the size of your
            database entries, the number of entries that are held on any given
            page also increases. The result is that the chances of two or more
            readers and/or writers wanting to access entries on any given page
            also increase.
265        </p>
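          <p>
            To get a feel for this effect, the sketch below estimates the
            chance that two independent accesses land on the same page. It
            assumes that accesses are spread uniformly over all entries, which
            real workloads rarely are, so it is a simplified model and not a
            description of how DB behaves. The names used are hypothetical.
        </p>

```java
/** Hypothetical model: chance that two uniformly random accesses hit the same page. */
final class PageContentionEstimate {
    static double samePageProbability(long totalEntries, int entriesPerPage) {
        if (0 >= totalEntries || 0 >= entriesPerPage) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        // Number of pages needed, rounding up for a partially filled last page.
        long pages = (totalEntries + entriesPerPage - 1) / entriesPerPage;
        // With uniform access, a second access hits the same page as the
        // first with probability of roughly one in the number of pages.
        return 1.0 / pages;
    }

    public static void main(String[] args) {
        long totalEntries = 1_000_000;
        for (int entriesPerPage : new int[] {10, 100, 1000}) {
            System.out.println(entriesPerPage + " entries per page: "
                + samePageProbability(totalEntries, entriesPerPage));
        }
    }
}
```

          <p>
            Under this model, packing ten times as many entries onto each page
            makes a same-page collision about ten times as likely.
        </p>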
266          <p>
267            When two or more threads and/or processes want to manage data on a
268            page, lock contention occurs. Lock contention is resolved by one
269            thread (or process) waiting for another thread to give up its lock.
270            It is this waiting activity that is harmful to your application's
271            performance.
272        </p>
273          <p>
274            It is possible to select a page size that is so large that your
275            application will spend excessive, and noticeable, amounts of time
276            resolving lock contention. Note that this scenario is particularly
277            likely to occur as the amount of concurrency built into your
278            application increases. 
279        </p>
280          <p>
            On the other hand, if you select a page size that is too small, then
            that will only make your tree deeper, which can also cause
            performance penalties. The trick, therefore, is to select a
            reasonable page size (one that will hold a sizeable number of
            records) and then reduce the page size if you notice lock
            contention.
287        </p>
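          <p>
            The effect of page size on tree depth can be approximated with the
            sketch below. It assumes, purely for illustration, that every page
            at every level of the tree holds the same number of entries; a real
            BTree does not fill its pages this evenly. The class and method
            names are hypothetical.
        </p>

```java
/** Hypothetical model: levels needed to reach a given number of records. */
final class TreeDepthEstimate {
    static int estimatedLevels(long records, int entriesPerPage) {
        if (2 > entriesPerPage) {
            throw new IllegalArgumentException("entriesPerPage must be at least 2");
        }
        int levels = 1;
        long reachable = entriesPerPage; // records reachable with a single level
        while (records > reachable) {
            // Guard against overflow: one more level would cover any long value.
            if (reachable > Long.MAX_VALUE / entriesPerPage) {
                return levels + 1;
            }
            reachable *= entriesPerPage;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println("10 entries per page: " + estimatedLevels(1_000_000L, 10) + " levels");
        System.out.println("100 entries per page: " + estimatedLevels(1_000_000L, 100) + " levels");
    }
}
```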
288          <p>
            You can examine the number of lock conflicts and deadlocks occurring
            in your application by examining your database environment lock
            statistics. You can obtain these statistics either programmatically
            from your environment, or by using the
            <code class="literal">db_stat</code> command line utility.
            The number of unavailable locks that your application waited for is
            held in the lock statistic's <code class="literal">st_lock_wait</code> field.
        </p>
300        </div>
301        <div class="sect2" lang="en" xml:lang="en">
302          <div class="titlepage">
303            <div>
304              <div>
305                <h3 class="title"><a id="IOEfficiency"></a>IO Efficiency</h3>
306              </div>
307            </div>
308          </div>
309          <p>
            Page size can affect how efficient DB is at moving data to and
            from disk. For some applications, especially those for which the
            in-memory cache cannot be large enough to hold the entire working
            dataset, I/O efficiency can significantly impact application performance.
314        </p>
315          <p>
316           Most operating systems use an internal block size to determine how much
317           data to move to and from disk for a single I/O operation. This block
318           size is usually equal to the filesystem's block size. For optimal
319           disk I/O efficiency, you should select a database page size that is
320           equal to the operating system's I/O block size.
321        </p>
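          <p>
            One way to discover the block size from Java is sketched below. It
            relies on <code class="methodname">FileStore.getBlockSize()</code>,
            which is part of the standard library from Java 10 onward and may
            be unsupported on some file stores. The fallback value of 4096
            bytes and the class name are assumptions made for this example.
            The reported value should still be checked against the page size
            limits before you use it.
        </p>

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that asks the platform for a directory's block size. */
final class BlockSizeProbe {
    // Assumption: a common default, used only when the platform cannot answer.
    static final long FALLBACK_BLOCK_SIZE = 4096;

    /** Returns the block size of the file store holding dir, or the fallback. */
    static long blockSizeOf(Path dir) {
        try {
            FileStore store = Files.getFileStore(dir);
            return store.getBlockSize();
        } catch (IOException | UnsupportedOperationException e) {
            return FALLBACK_BLOCK_SIZE;
        }
    }

    public static void main(String[] args) {
        // Probe the directory that would hold the database files.
        System.out.println("Block size: " + blockSizeOf(Paths.get(".")));
    }
}
```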
322          <p>
323           Essentially, DB performs data transfers based on the database
324           page size. That is, it moves data to and from disk a page at a time.
325           For this reason, if the page size does not match the I/O block size,
326           then the operating system can introduce inefficiencies in how it
327           responds to DB's I/O requests.
328        </p>
329          <p>
            For example, suppose your page size is smaller than your operating
            system block size. In this case, when DB writes a page to disk
            it is writing just a portion of a logical filesystem page. Any time
            any application writes just a portion of a logical filesystem page, the
            operating system brings in the real filesystem page, overwrites
            the portion of the page written by the application, then writes
            the filesystem page back to disk. The net result is significantly
            more disk I/O than if the application had simply selected a page
            size that was equal to the underlying filesystem block size.
339         </p>
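          <p>
            The cost of such a partial write can be put into numbers with the
            simplified model below. It assumes that the filesystem block is not
            already cached by the operating system and that pages at least as
            large as a block are block-aligned multiples of the block size. The
            class and method names are hypothetical.
        </p>

```java
/** Hypothetical model of the disk traffic caused by writing one database page. */
final class PartialBlockWriteCost {
    /** Returns the bytes moved to and from disk to write a single page. */
    static long bytesMovedForWrite(int pageSize, int blockSize) {
        if (blockSize > pageSize) {
            // The whole block is read, patched in memory, and written back.
            return 2L * blockSize;
        }
        // The page covers whole blocks, so it is written directly.
        return pageSize;
    }

    public static void main(String[] args) {
        System.out.println("1K page, 4K block: " + bytesMovedForWrite(1024, 4096) + " bytes");
        System.out.println("4K page, 4K block: " + bytesMovedForWrite(4096, 4096) + " bytes");
    }
}
```

          <p>
            In this model, writing a 1K page on a filesystem with 4K blocks
            moves eight times as many bytes as the page itself contains.
        </p>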
340          <p>
341            Alternatively, if you select a page size that is larger than the
342            underlying filesystem block size, then the operating system may have
343            to read more data than is necessary to fulfill a read request.
344            Further, on some operating systems, requesting a single database
345            page may result in the operating system reading enough filesystem
346            blocks to satisfy the operating system's criteria for read-ahead. In
347            this case, the operating system will be reading significantly more
348            data from disk than is actually required to fulfill DB's read
349            request.
350         </p>
351          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
352            <h3 class="title">Note</h3>
353            <p>
                While transactions are not discussed in this manual, a page size
                other than your filesystem's block size can affect transactional
                guarantees. The reason is that page sizes larger than the
                filesystem's block size cause DB to write pages in block
                size increments. As a result, it is possible for a partial page
                to be written as the result of a transactional commit. For more
                information, see <a class="ulink" href="http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/reclimit.html" target="_top">http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/reclimit.html</a>.
361            </p>
362          </div>
363        </div>
364        <div class="sect2" lang="en" xml:lang="en">
365          <div class="titlepage">
366            <div>
367              <div>
368                <h3 class="title"><a id="pagesizeAdvice"></a>Page Sizing Advice</h3>
369              </div>
370            </div>
371          </div>
372          <p>
373            Page sizing can be confusing at first, so here are some general
374            guidelines that you can use to select your page size.
375        </p>
376          <p>
377            In general, and given no other considerations, a page size that is equal 
378            to your filesystem block size is the ideal situation.
379        </p>
380          <p>
381            If your data is designed such that 4 database entries cannot fit on a
382            single page (assuming BTree), then grow your page size to accommodate
383            your data. Once you've abandoned matching your filesystem's block
384            size, the general rule is that larger page sizes are better.
385        </p>
386          <p>
387            The exception to this rule is if you have a great deal of
388            concurrency occurring in your application. In this case, the closer
389            you can match your page size to the ideal size needed for your
390            application's data, the better. Doing so will allow you to avoid
391            unnecessary contention for page locks.
392        </p>
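          <p>
            The first two guidelines can be combined into a small sketch:
            start at the filesystem block size and double the page size until 4
            average-sized entries fit, stopping at the 64K limit. The sketch
            assumes that the block size passed in is itself a valid page size,
            it ignores page overhead, and it does not model the concurrency
            exception described above. The names used are hypothetical.
        </p>

```java
/** Hypothetical helper that applies the page sizing guidelines for BTree. */
final class PageSizeAdvice {
    static final int MAX_PAGE_SIZE = 64 * 1024;
    static final int MIN_ENTRIES_PER_PAGE = 4;

    /**
     * Returns the block size if 4 average entries fit in it; otherwise the
     * block size doubled until they fit or the 64K limit is reached.
     */
    static int recommend(int blockSize, int averageEntryBytes) {
        long needed = (long) averageEntryBytes * MIN_ENTRIES_PER_PAGE;
        for (int size = blockSize; ; size *= 2) {
            if (size >= needed || size >= MAX_PAGE_SIZE) {
                return size;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Small entries: " + recommend(4096, 200));
        System.out.println("Large entries: " + recommend(4096, 3000));
    }
}
```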
393        </div>
394      </div>
395    </div>
396    <div class="navfooter">
397      <hr />
398      <table width="100%" summary="Navigation footer">
399        <tr>
          <td width="40%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a>&#160;</td>
          <td width="20%" align="center">
            <a accesskey="u" href="baseapi.html">Up</a>
          </td>
          <td width="40%" align="right">&#160;<a accesskey="n" href="cachesize.html">Next</a></td>
405        </tr>
406        <tr>
          <td width="40%" align="left" valign="top">Secondary Database Example&#160;</td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top">&#160;Selecting the Cache Size</td>
412        </tr>
413      </table>
414    </div>
415  </body>
416</html>
417