<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>BTree Configuration</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.62.4" />
    <link rel="home" href="index.html" title="Getting Started with Berkeley DB" />
    <link rel="up" href="dbconfig.html" title="Chapter 11. Database Configuration" />
    <link rel="previous" href="cachesize.html" title="Selecting the Cache Size" />
  </head>
13  <body>
14    <div class="navheader">
15      <table width="100%" summary="Navigation header">
16        <tr>
17          <th colspan="3" align="center">BTree Configuration</th>
18        </tr>
19        <tr>
20          <td width="20%" align="left"><a accesskey="p" href="cachesize.html">Prev</a> </td>
21          <th width="60%" align="center">Chapter 11. Database Configuration</th>
22          <td width="20%" align="right"> </td>
23        </tr>
24      </table>
25      <hr />
26    </div>
27    <div class="sect1" lang="en" xml:lang="en">
28      <div class="titlepage">
29        <div>
30          <div>
31            <h2 class="title" style="clear: both"><a id="btree"></a>BTree Configuration</h2>
32          </div>
33        </div>
34        <div></div>
35      </div>
      <p>
        Previous chapters in this book touched on some topics that are
        specific to BTree without covering them in any real detail. This
        section discusses configuration issues that are unique to BTree.
    </p>
42      <p>
43        Specifically, in this section we describe:      
44    </p>
45      <div class="itemizedlist">
46        <ul type="disc">
47          <li>
48            <p>
49                Allowing duplicate records.
50            </p>
51          </li>
52          <li>
53            <p>
54                Setting comparator callbacks.
55            </p>
56          </li>
57        </ul>
58      </div>
59      <div class="sect2" lang="en" xml:lang="en">
60        <div class="titlepage">
61          <div>
62            <div>
63              <h3 class="title"><a id="duplicateRecords"></a>Allowing Duplicate Records</h3>
64            </div>
65          </div>
66          <div></div>
67        </div>
68        <p>
69            BTree databases can contain duplicate records. One record is
70            considered to be a duplicate of another when both records use keys
71            that compare as equal to one another.
72        </p>
        <p>
            By default, keys are compared using a lexicographical comparison,
            with shorter keys collating before longer keys.
            You can override this default using the
            <tt class="methodname">DatabaseConfig.setBtreeComparator()</tt>
            method. See <a href="#comparators">Setting Comparison Functions</a>
            for details.
        </p>
82        <p>
83            By default, DB databases do not allow duplicate records. As a
84            result, any attempt to write a record that uses a key equal to a
85            previously existing record results in the previously existing record
86            being overwritten by the new record.
87        </p>
88        <p>
89            Allowing duplicate records is useful if you have a database that
90            contains records keyed by a commonly occurring piece of information.
91            It is frequently necessary to allow duplicate records for secondary
92            databases.
93         </p>
94        <p>
95            For example, suppose your primary database contained records related
96            to automobiles. You might in this case want to be able to find all
97            the automobiles in the database that are of a particular color, so
98            you would index on the color of the automobile. However, for any
99            given color there will probably be multiple automobiles. Since the
100            index is the secondary key, this means that multiple secondary
101            database records will share the same key, and so the secondary
102            database must support duplicate records.
103        </p>
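        <p>
            As a rough illustration of why the secondary database needs
            duplicates, the following sketch models the color index in memory
            using plain Java collections rather than the DB API. The class
            name, the VIN-style identifiers, and the
            <tt class="methodname">indexByColor()</tt> helper are hypothetical
            and exist only for this example.
        </p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogue of a secondary index; this is NOT the DB API.
public class ColorIndexSketch {

    // Builds a color -> identifiers index from {id, color} pairs.
    // Several automobiles can share one color, so each key must be
    // able to hold more than one entry.
    public static Map<String, List<String>> indexByColor(String[][] autos) {
        Map<String, List<String>> index = new TreeMap<>();
        for (String[] auto : autos) {
            String id = auto[0];
            String color = auto[1];
            index.computeIfAbsent(color, c -> new ArrayList<>()).add(id);
        }
        return index;
    }

    public static void main(String[] args) {
        String[][] autos = {
            {"VIN-001", "red"},
            {"VIN-002", "blue"},
            {"VIN-003", "red"},
        };
        // prints {blue=[VIN-002], red=[VIN-001, VIN-003]}
        System.out.println(indexByColor(autos));
    }
}
```

        <p>
            The key <tt class="literal">red</tt> refers to two automobiles. A
            secondary database that did not allow duplicate records could
            keep only one of them.
        </p>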
104        <div class="sect3" lang="en" xml:lang="en">
105          <div class="titlepage">
106            <div>
107              <div>
108                <h4 class="title"><a id="sorteddups"></a>Sorted Duplicates</h4>
109              </div>
110            </div>
111            <div></div>
112          </div>
          <p>
                Duplicate records can be stored in sorted or unsorted order.
                You can cause DB to automatically sort your duplicate
                records by setting
                <tt class="methodname">DatabaseConfig.setSortedDuplicates()</tt>
                to <tt class="literal">true</tt>. Note that this property must
                be set before the database is created, and it cannot be
                changed afterwards.
            </p>
          <p>
                If sorted duplicates are supported, then the
                <tt class="classname">java.util.Comparator</tt> implementation
                identified to
                <tt class="methodname">DatabaseConfig.setDuplicateComparator()</tt>
                is used to determine the location of the duplicate record in its
                duplicate set. If no such comparator is provided, then the
                default lexicographical comparison is used.
            </p>
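          <p>
                To make the idea of a record's location in its duplicate set
                concrete, here is a minimal in-memory sketch, not DB's actual
                implementation. It assumes Java 9 or later and uses
                <tt class="methodname">Arrays.compareUnsigned()</tt> as a
                stand-in for the default lexicographical comparison. The class
                and the <tt class="methodname">insert()</tt> helper are
                hypothetical, and DB's handling of identical data items is not
                modeled.
            </p>

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// In-memory model of one sorted duplicate set; this is NOT the DB API.
public class SortedDupSetSketch {

    // Inserts data at the position the comparator dictates and returns
    // the index that was chosen. The list must already be sorted by cmp.
    public static int insert(List<byte[]> dupSet, byte[] data,
                             Comparator<byte[]> cmp) {
        int pos = Collections.binarySearch(dupSet, data, cmp);
        if (pos < 0) {
            // Not found: binarySearch returns -(insertionPoint) - 1.
            pos = -(pos + 1);
        }
        dupSet.add(pos, data);
        return pos;
    }

    public static void main(String[] args) {
        // Stand-in for the default comparison (an assumption of this sketch).
        Comparator<byte[]> cmp = Arrays::compareUnsigned;

        List<byte[]> dupSet = new ArrayList<>();
        for (String s : new String[] {"pear", "apple", "fig"}) {
            insert(dupSet, s.getBytes(StandardCharsets.UTF_8), cmp);
        }
        // Prints apple, fig, pear on separate lines.
        for (byte[] d : dupSet) {
            System.out.println(new String(d, StandardCharsets.UTF_8));
        }
    }
}
```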
137        </div>
138        <div class="sect3" lang="en" xml:lang="en">
139          <div class="titlepage">
140            <div>
141              <div>
142                <h4 class="title"><a id="nosorteddups"></a>Unsorted Duplicates</h4>
143              </div>
144            </div>
145            <div></div>
146          </div>
          <p>
                For performance reasons, BTrees should always contain sorted
                records. (A BTree containing unsorted entries must potentially
                spend a great deal more time locating an entry than does a BTree
                that contains sorted entries.) That said, DB supports
                suppressing the automatic sorting of duplicate records because
                your application may be inserting records that are already in
                sorted order.
            </p>
          <p>
                That is, if the database is configured to support unsorted
                duplicates, then the assumption is that your application
                will perform the sorting itself. Any time you place records
                into the database in a sort order not known to DB, expect to
                pay a significant performance penalty.
            </p>
          <p>
                DB behaves as follows when inserting records
                into a database that supports unsorted duplicates:
            </p>
168          <div class="itemizedlist">
169            <ul type="disc">
170              <li>
                <p>
                    If your application simply adds a duplicate record using
                    <tt class="methodname">Database.put()</tt>,
                    then the record is inserted at the end of its duplicate set.
                </p>
178              </li>
179              <li>
180                <p>
181                    If a cursor is used to put the duplicate record to the database,
182                    then the new record is placed in the duplicate set according to the
183                    actual method used to perform the put. The relevant methods
184                    are:
185                </p>
186                <div class="itemizedlist">
187                  <ul type="circle">
188                    <li>
189                      <p>
190                            
191                            <tt class="methodname">Cursor.putAfter()</tt>
192                        </p>
                      <p>
                        The data is placed into the database
                        as a duplicate record. The key used for this operation is
                        the key used for the record to which the cursor currently
                        refers. Any key provided on the call
                        is therefore ignored.
                        </p>
205                      <p>
206                            The duplicate record is inserted into the database
207                            immediately after the cursor's current position in the
208                            database.
209                        </p>
210                    </li>
211                    <li>
212                      <p>
213                            
214                            <tt class="methodname">Cursor.putBefore()</tt>
215                        </p>
216                      <p>
217                            Behaves the same as 
218                                
219                                <tt class="methodname">Cursor.putAfter()</tt>
220                            except that the new record is inserted immediately before 
221                            the cursor's current location in the database.
222                        </p>
223                    </li>
224                    <li>
225                      <p>
226                            
227                            <tt class="methodname">Cursor.putKeyFirst()</tt>
228                        </p>
                      <p>
                            If the key already exists in the
                            database, and the database is configured to use duplicates
                            without sorting, then the new record is inserted as the first entry
                            in the appropriate duplicates list.
                        </p>
237                    </li>
238                    <li>
239                      <p>
240                            
241                            <tt class="methodname">Cursor.putKeyLast()</tt>
242                        </p>
243                      <p>
244                            Behaves identically to
245                                
246                                <tt class="methodname">Cursor.putKeyFirst()</tt>
247                            except that the new duplicate record is inserted as the last
248                            record in the duplicates list.
249                        </p>
250                    </li>
251                  </ul>
252                </div>
253              </li>
254            </ul>
255          </div>
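          <p>
                The insertion positions described above can be modeled with a
                plain Java list. This is only an in-memory sketch of one
                unsorted duplicate set, not the DB API: the class is
                hypothetical, its methods merely borrow the names of the
                <tt class="classname">Cursor</tt> methods they imitate, and it
                assumes that the cursor refers to the newly inserted record
                after each put.
            </p>

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of ONE unsorted duplicate set (all records share a key).
// This is NOT the DB API; it only mirrors the insertion positions.
public class UnsortedDupSetSketch {

    private final List<String> dups = new ArrayList<>();
    // Index of the record the model cursor refers to; -1 means unpositioned.
    private int cursor = -1;

    // Models Database.put() and Cursor.putKeyLast(): append to the set.
    public void putKeyLast(String data) {
        dups.add(data);
        cursor = dups.size() - 1;
    }

    // Models Cursor.putKeyFirst(): insert at the front of the set.
    public void putKeyFirst(String data) {
        dups.add(0, data);
        cursor = 0;
    }

    // Models Cursor.putAfter(): insert right after the cursor's position.
    public void putAfter(String data) {
        requirePositioned();
        dups.add(cursor + 1, data);
        cursor = cursor + 1;
    }

    // Models Cursor.putBefore(): insert right before the cursor's position.
    public void putBefore(String data) {
        requirePositioned();
        // The new record takes over the cursor's index, so the cursor
        // now refers to it without any further adjustment.
        dups.add(cursor, data);
    }

    // Returns a copy of the duplicate set in stored order.
    public List<String> records() {
        return new ArrayList<>(dups);
    }

    private void requirePositioned() {
        if (cursor < 0) {
            throw new IllegalStateException("cursor is not positioned");
        }
    }

    public static void main(String[] args) {
        UnsortedDupSetSketch set = new UnsortedDupSetSketch();
        set.putKeyLast("b");   // [b]
        set.putKeyFirst("a");  // [a, b]
        set.putKeyLast("d");   // [a, b, d]
        set.putBefore("c");    // [a, b, c, d]
        set.putAfter("c2");    // [a, b, c, c2, d]
        System.out.println(set.records()); // prints [a, b, c, c2, d]
    }
}
```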
256        </div>
257        <div class="sect3" lang="en" xml:lang="en">
258          <div class="titlepage">
259            <div>
260              <div>
261                <h4 class="title"><a id="specifyingDups"></a>Configuring a Database to Support Duplicates</h4>
262              </div>
263            </div>
264            <div></div>
265          </div>
266          <p>
267            Duplicates support can only be configured
268            at database creation time. You do this by specifying the appropriate
269            
270            <span>
271                <tt class="classname">DatabaseConfig</tt> method
272            </span>
273            before the database is opened for the first time.
274        </p>
275          <p>
276            The 
277                
278                <span>methods</span>
279            that you can use are:
280        </p>
281          <div class="itemizedlist">
282            <ul type="disc">
283              <li>
284                <p>
285                    
286                    <tt class="methodname">DatabaseConfig.setUnsortedDuplicates()</tt>
287                </p>
288                <p>
289                    The database supports non-sorted duplicate records.
290                </p>
291              </li>
292              <li>
293                <p>
294                    
295                    <tt class="methodname">DatabaseConfig.setSortedDuplicates()</tt>
296                </p>
297                <p>
298                    The database supports sorted duplicate records.
299                </p>
300              </li>
301            </ul>
302          </div>
303          <p>
304            The following code fragment illustrates how to configure a database
305            to support sorted duplicate records:
306        </p>
307          <a id="java_btree_dupsort"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;

import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;

...

Database myDb = null;

try {
    // Typical configuration settings
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setType(DatabaseType.BTREE);
    myDbConfig.setAllowCreate(true);

    // Configure for sorted duplicates
    myDbConfig.setSortedDuplicates(true);

    // Open the database
    myDb = new Database("mydb.db", null, myDbConfig);
} catch(DatabaseException dbe) {
    System.err.println("MyDbs: " + dbe.toString());
    System.exit(-1);
} catch(FileNotFoundException fnfe) {
    System.err.println("MyDbs: " + fnfe.toString());
    System.exit(-1);
}</pre>
339        </div>
340      </div>
341      <div class="sect2" lang="en" xml:lang="en">
342        <div class="titlepage">
343          <div>
344            <div>
345              <h3 class="title"><a id="comparators"></a>Setting Comparison Functions</h3>
346            </div>
347          </div>
348          <div></div>
349        </div>
350        <p>
351            By default, DB uses a lexicographical comparison function where
352            shorter records collate before longer records. For the majority of
353            cases, this comparison works well and you do not need to manage
354            it in any way. 
355         </p>
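        <p>
            The following is a minimal sketch of what such a lexicographical
            comparison looks like when written in Java. It is not DB's own
            code. It assumes that bytes are compared as unsigned values, and
            the class name is hypothetical.
        </p>

```java
import java.nio.charset.StandardCharsets;

// Sketch of a lexicographical byte comparison; NOT DB's own code.
public class LexicalCompareSketch {

    // Compares byte by byte, treating each byte as unsigned. If one
    // array is a prefix of the other, the shorter array sorts first.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] ab = "ab".getBytes(StandardCharsets.UTF_8);
        byte[] abc = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] b = "b".getBytes(StandardCharsets.UTF_8);

        System.out.println(compare(ab, abc) < 0); // prints true: shorter first
        System.out.println(compare(b, abc) > 0);  // prints true: 'b' > 'a'
    }
}
```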
356        <p>
357            However, in some situations your application's performance can
358            benefit from setting a custom comparison routine. You can do this
359            either for database keys, or for the data if your
360            database supports sorted duplicate records.
361         </p>
362        <p>
363            Some of the reasons why you may want to provide a custom sorting
364            function are:
365         </p>
366        <div class="itemizedlist">
367          <ul type="disc">
368            <li>
369              <p>
370                    Your database is keyed using strings and you want to provide
371                    some sort of language-sensitive ordering to that data. Doing
372                    so can help increase the locality of reference that allows
373                    your database to perform at its best.
374                </p>
375            </li>
376            <li>
377              <p>
378                    You are using a little-endian system (such as x86) and you
379                    are using integers as your database's keys. Berkeley DB
380                    stores keys as byte strings and little-endian integers
381                    do not sort well when viewed as byte strings. There are
382                    several solutions to this problem, one being to provide a
383                    custom comparison function. See
384                    <a href="http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html" target="_top">http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html</a> 
385                    for more information.
386                </p>
387            </li>
388            <li>
              <p>
                    You do not want the entire key to participate in the
                    comparison, for whatever reason.  In
                    this case, you may want to provide a custom comparison
                    function so that only the relevant bytes are examined.
                </p>
395            </li>
396          </ul>
397        </div>
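        <p>
            The little-endian integer case from the list above can be
            demonstrated in a few lines. This sketch assumes that keys are
            4-byte signed integers stored in little-endian order (for example,
            keys written by a native application on x86) and that Java 9 or
            later is available for
            <tt class="methodname">Arrays.compareUnsigned()</tt>. The class
            and its helper methods are hypothetical.
        </p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical comparator for keys stored as 4-byte little-endian ints.
public class LittleEndianIntComparator implements Comparator<byte[]> {

    // Encodes an int the way a little-endian system lays it out in memory.
    public static byte[] encode(int value) {
        return ByteBuffer.allocate(4)
                         .order(ByteOrder.LITTLE_ENDIAN)
                         .putInt(value)
                         .array();
    }

    // Decodes a 4-byte little-endian array back into an int.
    public static int decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes)
                         .order(ByteOrder.LITTLE_ENDIAN)
                         .getInt();
    }

    // Orders keys by their numeric value instead of their raw bytes.
    @Override
    public int compare(byte[] a, byte[] b) {
        return Integer.compare(decode(a), decode(b));
    }

    public static void main(String[] args) {
        byte[] one = encode(1);          // bytes: 01 00 00 00
        byte[] twoFiftySix = encode(256); // bytes: 00 01 00 00

        // A byte-string comparison sorts 256 BEFORE 1.
        System.out.println(Arrays.compareUnsigned(one, twoFiftySix) > 0);
        // prints true

        // The custom comparator restores numeric order.
        System.out.println(
            new LittleEndianIntComparator().compare(one, twoFiftySix) < 0);
        // prints true
    }
}
```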
398        <div class="sect3" lang="en" xml:lang="en">
399          <div class="titlepage">
400            <div>
401              <div>
402                <h4 class="title"><a id="creatingComparisonFunctions"></a>
403                
404                <span>Creating Java Comparators</span>
405            </h4>
406              </div>
407            </div>
408            <div></div>
409          </div>
          <p>
                You set a BTree's key comparator using
                <tt class="methodname">DatabaseConfig.setBtreeComparator()</tt>.
                You can also set a BTree's duplicate data comparator using
                <tt class="methodname">DatabaseConfig.setDuplicateComparator()</tt>.
            </p>
          <p>
            If the database already exists when it is opened, the comparator
            provided to these methods must be the same as the one that was
            used when the database was created, or corruption can
            occur.
         </p>
440          <p>
441      You override the default comparison function by providing a Java
442      <tt class="classname">Comparator</tt> class to the database.
443      The Java <tt class="classname">Comparator</tt> interface requires you to implement the
444      <tt class="methodname">Comparator.compare()</tt> method 
445      (see <a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html" target="_top">http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html</a> for details). 
446      </p>
447          <p>
448        DB hands your <tt class="methodname">Comparator.compare()</tt> method
449        the <tt class="literal">byte</tt> arrays that you stored in the database. If
450        you know how your data is organized in the <tt class="literal">byte</tt>
451        array, then you can write a comparison routine that directly examines
452        the contents of the arrays.  Otherwise, you have to reconstruct your
453        original objects, and then perform the comparison.
454      </p>
          <p>
            For example, suppose you want to perform Unicode lexical comparisons
            instead of UTF-8 byte-by-byte comparisons. You could then provide a
            comparator that uses <tt class="methodname">String.compareTo()</tt>,
            which performs a Unicode comparison of two strings. Note that for
            single-byte Roman characters, Unicode comparison and UTF-8
            byte-by-byte comparison are identical, so this is something you
            would only want to do if you were using multibyte Unicode characters
            with DB. In this case, your comparator would look like the
            following:
      </p>
466          <a id="java_btree1"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class MyDataComparator implements Comparator {

    public MyDataComparator() {}

    // DB passes the stored byte arrays to this method
    public int compare(Object d1, Object d2) {

        byte[] b1 = (byte[])d1;
        byte[] b2 = (byte[])d2;

        // Decode explicitly as UTF-8 rather than relying on the
        // platform's default character set
        String s1 = new String(b1, StandardCharsets.UTF_8);
        String s2 = new String(b2, StandardCharsets.UTF_8);
        return s1.compareTo(s2);
    }
}</pre>
485          <p>
486        To use this comparator:
487    </p>
488          <a id="java_btree2"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;

import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;

...

Database myDatabase = null;
try {
    // Get the database configuration object
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setType(DatabaseType.BTREE);
    myDbConfig.setAllowCreate(true);

    // Set the duplicate comparator class. The comparator only
    // applies when the database supports sorted duplicates.
    MyDataComparator mdc = new MyDataComparator();
    myDbConfig.setDuplicateComparator(mdc);
    myDbConfig.setSortedDuplicates(true);

    // Open the database that you will use to store your data
    myDatabase = new Database("myDb", null, myDbConfig);
} catch (DatabaseException dbe) {
    // Exception handling goes here
} catch (FileNotFoundException fnfe) {
    // Exception handling goes here
}</pre>
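          <p>
            <tt class="classname">MyDataComparator</tt> reconstructs
            <tt class="classname">String</tt> objects before comparing them.
            When you know how the <tt class="literal">byte</tt> array is laid
            out, a comparator can instead examine the array contents directly.
            The following sketch compares only a fixed-length prefix of each
            array. The class name and the choice of a leading prefix as the
            relevant bytes are assumptions made for illustration.
      </p>

```java
import java.util.Comparator;

// Hypothetical comparator that examines only the first prefixLen bytes.
public class PrefixComparator implements Comparator<byte[]> {

    private final int prefixLen;

    public PrefixComparator(int prefixLen) {
        this.prefixLen = prefixLen;
    }

    @Override
    public int compare(byte[] a, byte[] b) {
        // Never look past the prefix, even if the arrays are longer.
        int lenA = Math.min(a.length, prefixLen);
        int lenB = Math.min(b.length, prefixLen);
        int n = Math.min(lenA, lenB);
        for (int i = 0; i < n; i++) {
            // Compare bytes as unsigned values.
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // Equal so far: the shorter prefix sorts first.
        return lenA - lenB;
    }

    public static void main(String[] args) {
        PrefixComparator cmp = new PrefixComparator(3);
        // Only "abc" is examined in each array, so these compare as equal.
        System.out.println(cmp.compare("abcX".getBytes(), "abcY".getBytes()));
        // prints 0
    }
}
```

          <p>
            Keep in mind that keys which compare as equal are treated as the
            same key, so two keys that differ only beyond the prefix become
            duplicates of one another under a comparator like this one.
      </p>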
517        </div>
518      </div>
519    </div>
539  </body>
540</html>
541