<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>BTree Configuration</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.62.4" />
    <link rel="home" href="index.html" title="Getting Started with Berkeley DB" />
    <link rel="up" href="dbconfig.html" title="Chapter 11. Database Configuration" />
    <link rel="previous" href="cachesize.html" title="Selecting the Cache Size" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">BTree Configuration</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="cachesize.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 11. Database Configuration</th>
          <td width="20%" align="right"> </td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="btree"></a>BTree Configuration</h2>
          </div>
        </div>
        <div></div>
      </div>
      <p>
        The previous chapters of this book touch on some topics that are
        specific to BTree without covering them in any real detail. This
        section discusses the configuration issues that are unique to BTree.
      </p>
      <p>
        Specifically, in this section we describe:
      </p>
      <div class="itemizedlist">
        <ul type="disc">
          <li>
            <p>
              Allowing duplicate records.
            </p>
          </li>
          <li>
            <p>
              Setting comparator callbacks.
            </p>
          </li>
        </ul>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="duplicateRecords"></a>Allowing Duplicate Records</h3>
            </div>
          </div>
          <div></div>
        </div>
        <p>
          BTree databases can contain duplicate records. One record is
          considered to be a duplicate of another when both records use keys
          that compare as equal to one another.
        </p>
        <p>
          By default, keys are compared using a lexicographical comparison,
          with shorter keys collating before longer keys.
          You can override this default using the
          <tt class="methodname">DatabaseConfig.setBtreeComparator()</tt>
          method. See the next section for details.
        </p>
        <p>
          By default, DB databases do not allow duplicate records. As a
          result, any attempt to write a record that uses a key equal to a
          previously existing record results in the previously existing record
          being overwritten by the new record.
        </p>
        <p>
          Allowing duplicate records is useful if you have a database that
          contains records keyed by a commonly occurring piece of information.
          It is frequently necessary to allow duplicate records for secondary
          databases.
        </p>
        <p>
          For example, suppose your primary database contains records related
          to automobiles. You might in this case want to be able to find all
          the automobiles in the database that are of a particular color, so
          you would index on the color of the automobile. However, for any
          given color there will probably be multiple automobiles. Since the
          index is the secondary key, this means that multiple secondary
          database records will share the same key, and so the secondary
          database must support duplicate records.
        </p>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="sorteddups"></a>Sorted Duplicates</h4>
              </div>
            </div>
            <div></div>
          </div>
          <p>
            Duplicate records can be stored in sorted or unsorted order.
            You can cause DB to automatically sort your duplicate
            records by
            <span>
              setting <tt class="methodname">DatabaseConfig.setSortedDuplicates()</tt>
              to <tt class="literal">true</tt>. Note that this property must be
              set prior to database creation time and it cannot be changed
              afterwards.
            </span>
          </p>
          <p>
            If sorted duplicates are supported, then the
            <span>
              <tt class="classname">java.util.Comparator</tt> implementation
              identified to
              <tt class="methodname">DatabaseConfig.setDuplicateComparator()</tt>
            </span>
            is used to determine the location of the duplicate record in its
            duplicate set. If no such comparator is provided, then the default
            lexicographical comparison is used.
          </p>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="nosorteddups"></a>Unsorted Duplicates</h4>
              </div>
            </div>
            <div></div>
          </div>
          <p>
            For performance reasons, BTrees should always contain sorted
            records. (A BTree containing unsorted entries must potentially
            spend a great deal more time locating an entry than does a BTree
            that contains sorted entries.) That said, DB provides support
            for suppressing automatic sorting of duplicate records because it may be that
            your application is inserting records that are already in a
            sorted order.
          </p>
          <p>
            That is, if the database is configured to support unsorted
            duplicates, then the assumption is that your application
            will manually perform the sorting. In this event,
            expect to pay a significant performance penalty.
            Any time you
            place records into the database in a sort order not known to
            DB, you will pay a performance penalty.
          </p>
          <p>
            That said, this is how DB behaves when inserting records
            into a database that supports non-sorted duplicates:
          </p>
          <div class="itemizedlist">
            <ul type="disc">
              <li>
                <p>
                  If your application simply adds a duplicate record using
                  <span><tt class="methodname">Database.put()</tt>,</span>
                  then the record is inserted at the end of its duplicate set.
                </p>
              </li>
              <li>
                <p>
                  If a cursor is used to put the duplicate record to the database,
                  then the new record is placed in the duplicate set according to the
                  actual method used to perform the put. The relevant methods
                  are:
                </p>
                <div class="itemizedlist">
                  <ul type="circle">
                    <li>
                      <p>
                        <tt class="methodname">Cursor.putAfter()</tt>
                      </p>
                      <p>
                        The data is placed into the database
                        as a duplicate record. The key used for this operation is
                        the key used for the record to which the cursor currently
                        refers. Any key provided on the call
                        is therefore ignored.
                      </p>
                      <p>
                        The duplicate record is inserted into the database
                        immediately after the cursor's current position in the
                        database.
                      </p>
                    </li>
                    <li>
                      <p>
                        <tt class="methodname">Cursor.putBefore()</tt>
                      </p>
                      <p>
                        Behaves the same as
                        <tt class="methodname">Cursor.putAfter()</tt>
                        except that the new record is inserted immediately before
                        the cursor's current location in the database.
                      </p>
                    </li>
                    <li>
                      <p>
                        <tt class="methodname">Cursor.putKeyFirst()</tt>
                      </p>
                      <p>
                        If the key already exists in the
                        database, and the database is configured to use duplicates
                        without sorting, then the new record is inserted as the first entry
                        in the appropriate duplicates list.
                      </p>
                    </li>
                    <li>
                      <p>
                        <tt class="methodname">Cursor.putKeyLast()</tt>
                      </p>
                      <p>
                        Behaves identically to
                        <tt class="methodname">Cursor.putKeyFirst()</tt>
                        except that the new duplicate record is inserted as the last
                        record in the duplicates list.
                      </p>
                    </li>
                  </ul>
                </div>
              </li>
            </ul>
          </div>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="specifyingDups"></a>Configuring a Database to Support Duplicates</h4>
              </div>
            </div>
            <div></div>
          </div>
          <p>
            Duplicates support can only be configured
            at database creation time. You do this by specifying the appropriate
            <span>
              <tt class="classname">DatabaseConfig</tt> method
            </span>
            before the database is opened for the first time.
          </p>
          <p>
            The
            <span>methods</span>
            that you can use are:
          </p>
          <div class="itemizedlist">
            <ul type="disc">
              <li>
                <p>
                  <tt class="methodname">DatabaseConfig.setUnsortedDuplicates()</tt>
                </p>
                <p>
                  The database supports non-sorted duplicate records.
                </p>
              </li>
              <li>
                <p>
                  <tt class="methodname">DatabaseConfig.setSortedDuplicates()</tt>
                </p>
                <p>
                  The database supports sorted duplicate records.
                </p>
              </li>
            </ul>
          </div>
          <p>
            The following code fragment illustrates how to configure a database
            to support sorted duplicate records:
          </p>
          <a id="java_btree_dupsort"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;

import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;

...

Database myDb = null;

try {
    // Typical configuration settings
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setType(DatabaseType.BTREE);
    myDbConfig.setAllowCreate(true);

    // Configure for sorted duplicates
    myDbConfig.setSortedDuplicates(true);

    // Open the database
    myDb = new Database("mydb.db", null, myDbConfig);
} catch(DatabaseException dbe) {
    System.err.println("MyDbs: " + dbe.toString());
    System.exit(-1);
} catch(FileNotFoundException fnfe) {
    System.err.println("MyDbs: " + fnfe.toString());
    System.exit(-1);
}</pre>
        </div>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="comparators"></a>Setting Comparison Functions</h3>
            </div>
          </div>
          <div></div>
        </div>
        <p>
          By default, DB uses a lexicographical comparison function where
          shorter records collate before longer records. For the majority of
          cases, this comparison works well and you do not need to manage
          it in any way.
        </p>
        <p>
          However, in some situations your application's performance can
          benefit from setting a custom comparison routine. You can do this
          either for database keys, or for the data if your
          database supports sorted duplicate records.
        </p>
        <p>
          Some of the reasons why you may want to provide a custom sorting
          function are:
        </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
              <p>
                Your database is keyed using strings and you want to provide
                some sort of language-sensitive ordering to that data. Doing
                so can help increase the locality of reference that allows
                your database to perform at its best.
              </p>
            </li>
            <li>
              <p>
                You are using a little-endian system (such as x86) and you
                are using integers as your database's keys.
                Berkeley DB
                stores keys as byte strings and little-endian integers
                do not sort well when viewed as byte strings. There are
                several solutions to this problem, one being to provide a
                custom comparison function. See
                <a href="http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html" target="_top">http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html</a>
                for more information.
              </p>
            </li>
            <li>
              <p>
                You do not want the entire key to participate in the
                comparison, for whatever reason. In
                this case, you may want to provide a custom comparison
                function so that only the relevant bytes are examined.
              </p>
            </li>
          </ul>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="creatingComparisonFunctions"></a>
                  <span>Creating Java Comparators</span>
                </h4>
              </div>
            </div>
            <div></div>
          </div>
          <p>
            You set a BTree's key
            <span>
              comparator
            </span>
            using
            <span><tt class="methodname">DatabaseConfig.setBtreeComparator()</tt>.</span>
            You can also set a BTree's duplicate data comparison function using
            <span><tt class="methodname">DatabaseConfig.setDuplicateComparator()</tt>.</span>
          </p>
          <p>
            <span>
              If
            </span>
            the database already exists when it is opened, the
            <span>
              comparator
            </span>
            provided to these methods must be the same as
            that historically used to create the database or corruption can
            occur.
          </p>
          <p>
            You override the default comparison function by providing a Java
            <tt class="classname">Comparator</tt> class to the database.
            The Java <tt class="classname">Comparator</tt> interface requires you to implement the
            <tt class="methodname">Comparator.compare()</tt> method
            (see <a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html" target="_top">http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html</a> for details).
          </p>
          <p>
            DB hands your <tt class="methodname">Comparator.compare()</tt> method
            the <tt class="literal">byte</tt> arrays that you stored in the database. If
            you know how your data is organized in the <tt class="literal">byte</tt>
            array, then you can write a comparison routine that directly examines
            the contents of the arrays. Otherwise, you have to reconstruct your
            original objects, and then perform the comparison.
          </p>
          <p>
            For example, suppose you want to perform Unicode lexical comparisons
            instead of UTF-8 byte-by-byte comparisons. Then you could provide a
            comparator that uses <tt class="methodname">String.compareTo()</tt>,
            which compares two strings lexicographically by the Unicode value of
            each character (note that for
            single-byte Roman characters, Unicode comparison and UTF-8
            byte-by-byte comparison are identical; this is something you
            would only want to do if you were using multibyte Unicode characters
            with DB).
            In this case, your comparator would look like the
            following:
          </p>
          <a id="java_btree1"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class MyDataComparator implements Comparator {

    public MyDataComparator() {}

    // DB passes the stored byte arrays to compare()
    public int compare(Object d1, Object d2) {

        byte[] b1 = (byte[])d1;
        byte[] b2 = (byte[])d2;

        // Decode explicitly as UTF-8 (this assumes the data was stored
        // as UTF-8) rather than relying on the platform default charset
        String s1 = new String(b1, StandardCharsets.UTF_8);
        String s2 = new String(b2, StandardCharsets.UTF_8);
        return s1.compareTo(s2);
    }
}</pre>
          <p>
            To use this comparator:
          </p>
          <a id="java_btree2"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;
import java.util.Comparator;
import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;

...

Database myDatabase = null;
try {
    // Get the database configuration object
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setType(DatabaseType.BTREE);
    myDbConfig.setAllowCreate(true);

    // Set the duplicate comparator class
    MyDataComparator mdc = new MyDataComparator();
    myDbConfig.setDuplicateComparator(mdc);

    // The duplicate comparator is only used with sorted duplicates
    myDbConfig.setSortedDuplicates(true);

    // Open the database that you will use to store your data
    myDatabase = new Database("myDb", null, myDbConfig);
} catch (DatabaseException dbe) {
    // Exception handling goes here
} catch (FileNotFoundException fnfe) {
    // Exception handling goes here
}</pre>
        </div>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="cachesize.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="dbconfig.html">Up</a>
          </td>
          <td width="40%" align="right"> </td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Selecting the Cache Size </td>
          <td width="20%"
              align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> </td>
        </tr>
      </table>
    </div>
  </body>
</html>