<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>BTree Configuration</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Getting Started with Berkeley DB" />
    <link rel="up" href="dbconfig.html" title="Chapter&#160;11.&#160;Database Configuration" />
    <link rel="prev" href="cachesize.html" title="Selecting the Cache Size" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">BTree Configuration</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="cachesize.html">Prev</a>&#160;</td>
          <th width="60%" align="center">Chapter&#160;11.&#160;Database Configuration</th>
          <td width="20%" align="right">&#160;</td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="btree"></a>BTree Configuration</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="btree.html#duplicateRecords">Allowing Duplicate Records</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="btree.html#comparators">Setting Comparison Functions</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>
        In going through the previous chapters in this book, you may notice that
        we touch on some topics that are specific to BTree, but we do not cover
        those topics in any real detail. In this section, we will discuss
        configuration issues that are unique to BTree.
      </p>
      <p>
        Specifically, in this section we describe:
      </p>
      <div class="itemizedlist">
        <ul type="disc">
          <li>
            <p>
              Allowing duplicate records.
            </p>
          </li>
          <li>
            <p>
              Setting comparator callbacks.
            </p>
          </li>
        </ul>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="duplicateRecords"></a>Allowing Duplicate Records</h3>
            </div>
          </div>
        </div>
        <p>
          BTree databases can contain duplicate records. One record is
          considered to be a duplicate of another when both records use keys
          that compare as equal to one another.
        </p>
        <p>
          By default, keys are compared using a lexicographical comparison,
          with shorter keys collating before longer keys.
          You can override this default using the
          <code class="methodname">DatabaseConfig.setBtreeComparator()</code>
          method. See the next section for details.
        </p>
        <p>
          By default, DB databases do not allow duplicate records. As a
          result, any attempt to write a record that uses a key equal to that of
          a previously existing record results in the previously existing record
          being overwritten by the new record.
        </p>
        <p>
          Allowing duplicate records is useful if you have a database that
          contains records keyed by a commonly occurring piece of information.
          It is frequently necessary to allow duplicate records for secondary
          databases.
        </p>
        <p>
          For example, suppose your primary database contained records related
          to automobiles. You might in this case want to be able to find all
          the automobiles in the database that are of a particular color, so
          you would index on the color of the automobile. However, for any
          given color there will probably be multiple automobiles.
          Since the
          index is the secondary key, this means that multiple secondary
          database records will share the same key, and so the secondary
          database must support duplicate records.
        </p>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="sorteddups"></a>Sorted Duplicates</h4>
              </div>
            </div>
          </div>
          <p>
            Duplicate records can be stored in sorted or unsorted order.
            You can cause DB to automatically sort your duplicate
            records by
            <span>
              setting <code class="methodname">DatabaseConfig.setSortedDuplicates()</code>
              to <code class="literal">true</code>. Note that this property must be
              set before the database is created and it cannot be changed
              afterwards.
            </span>
          </p>
          <p>
            If sorted duplicates are supported, then the
            <span>
              <code class="classname">java.util.Comparator</code> implementation
              passed to
              <code class="methodname">DatabaseConfig.setDuplicateComparator()</code>
            </span>
            is used to determine the location of the duplicate record in its
            duplicate set. If no such comparator is provided, then the default
            lexicographical comparison is used.
          </p>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="nosorteddups"></a>Unsorted Duplicates</h4>
              </div>
            </div>
          </div>
          <p>
            For performance reasons, BTrees should always contain sorted
            records. (BTrees containing unsorted entries must potentially
            spend a great deal more time locating an entry than does a BTree
            that contains sorted entries.) That said, DB provides support
            for suppressing automatic sorting of duplicate records because it may be that
            your application is inserting records that are already in a
            sorted order.
          </p>
          <p>
            That is, if the database is configured to support unsorted
            duplicates, then the assumption is that your application
            will manually perform the sorting. In this event,
            expect to pay a significant performance penalty, as you will any
            time you place records into the database in a sort order not known
            to DB.
          </p>
          <p>
            That said, this is how DB behaves when inserting records
            into a database that supports non-sorted duplicates:
          </p>
          <div class="itemizedlist">
            <ul type="disc">
              <li>
                <p>
                  If your application simply adds a duplicate record using
                  <span><code class="methodname">Database.put()</code>,</span>
                  then the record is inserted at the end of its duplicate set.
                </p>
              </li>
              <li>
                <p>
                  If a cursor is used to put the duplicate record to the database,
                  then the new record is placed in the duplicate set according to the
                  actual method used to perform the put. The relevant methods
                  are:
                </p>
                <div class="itemizedlist">
                  <ul type="circle">
                    <li>
                      <p>
                        <code class="methodname">Cursor.putAfter()</code>
                      </p>
                      <p>
                        The data is placed into the database
                        as a duplicate record. The key used for this operation is
                        the key used for the record to which the cursor currently
                        refers. Any key provided on the call is therefore ignored.
                      </p>
                      <p>
                        The duplicate record is inserted into the database
                        immediately after the cursor's current position in the
                        database.
                      </p>
                    </li>
                    <li>
                      <p>
                        <code class="methodname">Cursor.putBefore()</code>
                      </p>
                      <p>
                        Behaves the same as
                        <code class="methodname">Cursor.putAfter()</code>
                        except that the new record is inserted immediately before
                        the cursor's current location in the database.
                      </p>
                    </li>
                    <li>
                      <p>
                        <code class="methodname">Cursor.putKeyFirst()</code>
                      </p>
                      <p>
                        If the key already exists in the
                        database, and the database is configured to use duplicates
                        without sorting, then the new record is inserted as the first entry
                        in the appropriate duplicates list.
                      </p>
                    </li>
                    <li>
                      <p>
                        <code class="methodname">Cursor.putKeyLast()</code>
                      </p>
                      <p>
                        Behaves identically to
                        <code class="methodname">Cursor.putKeyFirst()</code>
                        except that the new duplicate record is inserted as the last
                        record in the duplicates list.
                      </p>
                    </li>
                  </ul>
                </div>
              </li>
            </ul>
          </div>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="specifyingDups"></a>Configuring a Database to Support Duplicates</h4>
              </div>
            </div>
          </div>
          <p>
            Duplicates support can only be configured
            at database creation time. You do this by calling the appropriate
            <span>
              <code class="classname">DatabaseConfig</code> method
            </span>
            before the database is opened for the first time.
          </p>
          <p>
            The
            <span>methods</span>
            that you can use are:
          </p>
          <div class="itemizedlist">
            <ul type="disc">
              <li>
                <p>
                  <code class="methodname">DatabaseConfig.setUnsortedDuplicates()</code>
                </p>
                <p>
                  The database supports non-sorted duplicate records.
                </p>
              </li>
              <li>
                <p>
                  <code class="methodname">DatabaseConfig.setSortedDuplicates()</code>
                </p>
                <p>
                  The database supports sorted duplicate records.
                </p>
              </li>
            </ul>
          </div>
          <p>
            The following code fragment illustrates how to configure a database
            to support sorted duplicate records:
          </p>
          <a id="java_btree_dupsort"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;

import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;

...

Database myDb = null;

try {
    // Typical configuration settings
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setType(DatabaseType.BTREE);
    myDbConfig.setAllowCreate(true);

    // Configure for sorted duplicates
    myDbConfig.setSortedDuplicates(true);

    // Open the database
    myDb = new Database("mydb.db", null, myDbConfig);
} catch(DatabaseException dbe) {
    System.err.println("MyDbs: " + dbe.toString());
    System.exit(-1);
} catch(FileNotFoundException fnfe) {
    System.err.println("MyDbs: " + fnfe.toString());
    System.exit(-1);
} </pre>
        </div>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="comparators"></a>Setting Comparison Functions</h3>
            </div>
          </div>
        </div>
        <p>
          By default, DB uses a lexicographical comparison function where
          shorter records collate before longer records. For the majority of
          cases, this comparison works well and you do not need to manage
          it in any way.
        </p>
        <p>
          However, in some situations your application's performance can
          benefit from setting a custom comparison routine. You can do this
          either for database keys, or for the data if your
          database supports sorted duplicate records.
        </p>
        <p>
          Some of the reasons why you may want to provide a custom sorting
          function are:
        </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
              <p>
                Your database is keyed using strings and you want to provide
                some sort of language-sensitive ordering to that data. Doing
                so can help increase the locality of reference that allows
                your database to perform at its best.
              </p>
            </li>
            <li>
              <p>
                You are using a little-endian system (such as x86) and you
                are using integers as your database's keys. Berkeley DB
                stores keys as byte strings and little-endian integers
                do not sort well when viewed as byte strings. There are
                several solutions to this problem, one being to provide a
                custom comparison function. See
                <a class="ulink" href="http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html" target="_top">http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_misc/faq.html</a>
                for more information.
              </p>
            </li>
            <li>
              <p>
                You do not want the entire key to participate in the
                comparison, for whatever reason. In
                this case, you may want to provide a custom comparison
                function so that only the relevant bytes are examined.
              </p>
            </li>
          </ul>
        </div>
        <div class="sect3" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h4 class="title"><a id="creatingComparisonFunctions"></a>
                  <span>Creating Java Comparators</span>
                </h4>
              </div>
            </div>
          </div>
          <p>
            You set a BTree's key
            <span>
              comparator
            </span>
            using
            <span><code class="methodname">DatabaseConfig.setBtreeComparator()</code>.</span>
            You can also set a BTree's duplicate data comparison function using
            <span><code class="methodname">DatabaseConfig.setDuplicateComparator()</code>.</span>
          </p>
          <p>
            <span>
              If
            </span>
            the database already exists when it is opened, the
            <span>
              comparator
            </span>
            provided to these methods must be the same as
            the one originally used to create the database, or corruption can
            occur.
          </p>
          <p>
            You override the default comparison function by providing a Java
            <code class="classname">Comparator</code> class to the database.
            The Java <code class="classname">Comparator</code> interface requires you to implement the
            <code class="methodname">Comparator.compare()</code> method
            (see <a class="ulink" href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html" target="_top">http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html</a> for details).
          </p>
          <p>
            DB hands your <code class="methodname">Comparator.compare()</code> method
            the <code class="literal">byte</code> arrays that you stored in the database. If
            you know how your data is organized in the <code class="literal">byte</code>
            array, then you can write a comparison routine that directly examines
            the contents of the arrays. Otherwise, you have to reconstruct your
            original objects, and then perform the comparison.
          </p>
          <p>
            For example, suppose you want to perform Unicode lexical comparisons
            instead of UTF-8 byte-by-byte comparisons. Then you could provide a
            comparator that uses <code class="methodname">String.compareTo()</code>,
            which performs a Unicode comparison of two strings (note that for
            single-byte Roman characters, Unicode comparison and UTF-8
            byte-by-byte comparisons are identical, so this is something you
            would only want to do if you were using multibyte Unicode characters
            with DB). In this case, your comparator would look like the
            following:
          </p>
          <a id="java_btree1"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class MyDataComparator implements Comparator {

    public MyDataComparator() {}

    public int compare(Object d1, Object d2) {

        // DB passes the stored byte arrays to the comparator
        byte[] b1 = (byte[])d1;
        byte[] b2 = (byte[])d2;

        // Decode explicitly as UTF-8 (this assumes the strings were
        // stored as UTF-8) instead of using the platform default charset
        String s1 = new String(b1, StandardCharsets.UTF_8);
        String s2 = new String(b2, StandardCharsets.UTF_8);
        return s1.compareTo(s2);
    }
} </pre>
          <p>
            To use this comparator:
          </p>
          <a id="java_btree2"></a>
          <pre class="programlisting">package db.GettingStarted;

import java.io.FileNotFoundException;
import java.util.Comparator;
import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseException;

...
Database myDatabase = null;
try {
    // Get the database configuration object
    DatabaseConfig myDbConfig = new DatabaseConfig();
    myDbConfig.setAllowCreate(true);

    // Enable sorted duplicates and set the duplicate comparator class
    myDbConfig.setSortedDuplicates(true);
    MyDataComparator mdc = new MyDataComparator();
    myDbConfig.setDuplicateComparator(mdc);

    // Open the database that you will use to store your data
    myDatabase = new Database("myDb", null, myDbConfig);
} catch (DatabaseException dbe) {
    // Exception handling goes here
} catch (FileNotFoundException fnfe) {
    // Exception handling goes here
}</pre>
        </div>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="cachesize.html">Prev</a>&#160;</td>
          <td width="20%" align="center">
            <a accesskey="u" href="dbconfig.html">Up</a>
          </td>
          <td width="40%" align="right">&#160;</td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Selecting the Cache Size&#160;</td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top">&#160;</td>
        </tr>
      </table>
    </div>
  </body>
</html>