<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Chapter&#160;11.&#160;Database Configuration</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Getting Started with Berkeley DB" />
    <link rel="up" href="baseapi.html" title="Part&#160;II.&#160;Programming with the Base API" />
    <link rel="prev" href="javaindexusage.html" title="Secondary Database Example" />
    <link rel="next" href="cachesize.html" title="Selecting the Cache Size" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Chapter&#160;11.&#160;Database Configuration</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a>&#160;</td>
          <th width="60%" align="center">Part&#160;II.&#160;Programming with the Base API</th>
          <td width="20%" align="right">&#160;<a accesskey="n" href="cachesize.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="chapter" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title"><a id="dbconfig"></a>Chapter&#160;11.&#160;Database Configuration</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <p>
          <b>Table of Contents</b>
        </p>
        <dl>
          <dt>
            <span class="sect1">
              <a href="dbconfig.html#pagesize">Setting the Page Size</a>
            </span>
          </dt>
          <dd>
            <dl>
              <dt>
                <span class="sect2">
                  <a href="dbconfig.html#overflowpages">Overflow Pages</a>
                </span>
              </dt>
              <dt>
                <span class="sect2">
                  <a href="dbconfig.html#Locking">Locking</a>
                </span>
              </dt>
              <dt>
                <span class="sect2">
                  <a
href="dbconfig.html#IOEfficiency">IO Efficiency</a>
                </span>
              </dt>
              <dt>
                <span class="sect2">
                  <a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
                </span>
              </dt>
            </dl>
          </dd>
          <dt>
            <span class="sect1">
              <a href="cachesize.html">Selecting the Cache Size</a>
            </span>
          </dt>
          <dt>
            <span class="sect1">
              <a href="btree.html">BTree Configuration</a>
            </span>
          </dt>
          <dd>
            <dl>
              <dt>
                <span class="sect2">
                  <a href="btree.html#duplicateRecords">Allowing Duplicate Records</a>
                </span>
              </dt>
              <dt>
                <span class="sect2">
                  <a href="btree.html#comparators">Setting Comparison Functions</a>
                </span>
              </dt>
            </dl>
          </dd>
        </dl>
      </div>
      <p>
        This chapter describes some of the database and cache configuration issues
        that you need to consider when building your DB database.
        In most cases, there is very little that you need to do to
        manage your databases. However, there are configuration issues that you
        need to be aware of, and these largely depend on the access
        method that you choose for your database.
      </p>
      <p>
        The examples and descriptions throughout this document have mostly focused
        on the BTree access method, because the majority of DB
        applications use BTree. For this reason, where configuration issues
        depend on the access method in use, this chapter focuses on
        BTree only. For configuration details on the other access
        methods, see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
      </p>
      <div class="sect1" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h2 class="title" style="clear: both"><a id="pagesize"></a>Setting the Page Size</h2>
            </div>
          </div>
        </div>
        <div class="toc">
          <dl>
            <dt>
              <span class="sect2">
                <a href="dbconfig.html#overflowpages">Overflow Pages</a>
              </span>
            </dt>
            <dt>
              <span class="sect2">
                <a href="dbconfig.html#Locking">Locking</a>
              </span>
            </dt>
            <dt>
              <span class="sect2">
                <a href="dbconfig.html#IOEfficiency">IO Efficiency</a>
              </span>
            </dt>
            <dt>
              <span class="sect2">
                <a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
              </span>
            </dt>
          </dl>
        </div>
        <p>
          Internally, DB stores database entries on pages. Page sizes are
          important because they can affect your application's performance.
        </p>
        <p>
          DB pages can be between 512 bytes and 64 KB in size. The size
          that you select must be a power of 2. You set your database's
          page size using
          <span><code class="methodname">DatabaseConfig.setPageSize()</code>.</span>
        </p>
        <p>
          Note that a database's page size can only be selected at database
          creation time.
        </p>
        <p>
          When selecting a page size, you should consider the following issues:
        </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
              <p>
                Overflow pages.
              </p>
            </li>
            <li>
              <p>
                Locking.
              </p>
            </li>
            <li>
              <p>
                Disk I/O.
              </p>
            </li>
          </ul>
        </div>
        <p>
          These topics are discussed next.
        </p>
        <div class="sect2" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="overflowpages"></a>Overflow Pages</h3>
              </div>
            </div>
          </div>
          <p>
            Overflow pages are used to hold a key or data item
            that cannot fit on a single page.
            You do not have to do anything to
            cause overflow pages to be created, other than to store data that is
            too large for your database's page size. Also, the only way you can
            prevent overflow pages from being created is to select a
            page size that is large enough to hold your database entries.
          </p>
          <p>
            Because overflow pages exist outside of the normal database
            structure, their use is expensive from a performance
            perspective. If you select too small a page size, then your
            database will be forced to use an excessive number of overflow
            pages. This will significantly harm your application's performance.
          </p>
          <p>
            For this reason, you want to select a page size that is at
            least large enough to hold multiple entries given the expected
            average size of your database entries. In BTree's case, for best
            results select a page size that can hold at least 4 such entries.
          </p>
          <p>
            You can see how many overflow pages your database is using by
            <span>
              obtaining a <code class="classname">DatabaseStats</code> object using
              the <code class="methodname">Database.getStats()</code> method,
            </span>
            or by examining your database using the
            <code class="literal">db_stat</code> command line utility.
          </p>
        </div>
        <div class="sect2" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="Locking"></a>Locking</h3>
              </div>
            </div>
          </div>
          <p>
            Locking and multi-threaded access to DB databases are built into
            the product. However, in order to enable the locking subsystem and
            to provide efficient sharing of the cache between
            databases, you must use an <span class="emphasis"><em>environment</em></span>.
            Environments and multi-threaded access are not fully described
            in this manual (see the Berkeley DB Programmer's Reference Guide for
            information). However, we provide some information on sizing your
            pages in a multi-threaded/multi-process environment in the interest
            of providing a complete discussion of the topic.
          </p>
          <p>
            If your application is multi-threaded, or if your databases are
            accessed by more than one process at a time, then page size can
            influence your application's performance. The reason is that
            for most access methods (Queue is the exception), DB implements
            page-level locking. This means that the finest locking granularity
            is the page, not the record.
          </p>
          <p>
            In most cases, database pages contain multiple database
            records. Further, in order to provide safe access to multiple
            threads or processes, DB performs locking on pages as entries on
            those pages are read or written.
          </p>
          <p>
            As the size of your page increases relative to the size of your
            database entries, the number of entries that are held on any given
            page also increases. The result is that the chance of two or more
            readers and/or writers wanting to access entries on any given page
            also increases.
          </p>
          <p>
            When two or more threads and/or processes want to manage data on a
            page, lock contention occurs. Lock contention is resolved by one
            thread (or process) waiting for another to give up its lock.
            It is this waiting activity that is harmful to your application's
            performance.
          </p>
          <p>
            It is possible to select a page size that is so large that your
            application will spend excessive, and noticeable, amounts of time
            resolving lock contention. Note that this scenario is particularly
            likely to occur as the amount of concurrency built into your
            application increases.
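          </p>
          <p>
            To make the relationship between page size and contention more
            concrete, the following is a rough back-of-the-envelope sketch.
            It is illustrative only: it assumes equally sized entries, ignores
            all per-page and per-entry overhead, and assumes that two accessors
            pick entries uniformly at random, none of which holds exactly for a
            real workload. The class and method names are hypothetical and are
            not part of the DB API.
          </p>
          <pre class="programlisting">
```java
// Hypothetical helper; not part of the DB API.
class PageContentionSketch {

    // Approximate number of entries that fit on one page.
    // Assumption: avgEntrySize is positive and page overhead is ignored.
    static long entriesPerPage(int pageSize, int avgEntrySize) {
        return Math.max(1, pageSize / avgEntrySize);
    }

    // Approximate number of pages needed to hold all entries (rounded up).
    static long pagesNeeded(long totalEntries, int pageSize, int avgEntrySize) {
        long perPage = entriesPerPage(pageSize, avgEntrySize);
        return (totalEntries + perPage - 1) / perPage;
    }

    // Rough chance that two uniformly random accesses land on the same page,
    // assuming entries are spread evenly across the pages.
    static double samePageChance(long totalEntries, int pageSize, int avgEntrySize) {
        return 1.0 / pagesNeeded(totalEntries, pageSize, avgEntrySize);
    }

    public static void main(String[] args) {
        long totalEntries = 100_000;  // hypothetical database size
        int avgEntrySize = 128;       // hypothetical average entry size, in bytes
        int[] pageSizes = {512, 4096, 65536};
        for (int pageSize : pageSizes) {
            System.out.printf(
                "page size %d: about %d entries per page, %d pages, same-page chance %.6f%n",
                pageSize,
                entriesPerPage(pageSize, avgEntrySize),
                pagesNeeded(totalEntries, pageSize, avgEntrySize),
                samePageChance(totalEntries, pageSize, avgEntrySize));
        }
    }
}
```
          </pre>
          <p>
            Under these simplified assumptions, a larger page size means fewer
            pages, so any two accesses are more likely to want the same page lock.
            Real contention also depends on access patterns and on how long locks
            are held, so treat the numbers only as a way to compare page sizes.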
          </p>
          <p>
            On the other hand, if you select too small a page size, then
            that will only make your tree deeper, which can also cause
            performance penalties. The trick, therefore, is to select a
            reasonable page size (one that will hold a sizeable number of
            records) and then reduce the page size if you notice lock
            contention.
          </p>
          <p>
            You can see how many lock conflicts and deadlocks are occurring
            in your application by examining your database environment's lock
            statistics. Either retrieve them through the API, or use the
            <code class="literal">db_stat</code> command line utility.
            The number of unavailable locks that your application waited for is
            held in the lock statistics' <code class="literal">st_lock_wait</code> field.
          </p>
        </div>
        <div class="sect2" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="IOEfficiency"></a>IO Efficiency</h3>
              </div>
            </div>
          </div>
          <p>
            Page size can affect how efficient DB is at moving data to and
            from disk. For some applications, especially those for which the
            in-memory cache cannot be large enough to hold the entire working
            dataset, I/O efficiency can significantly impact application performance.
          </p>
          <p>
            Most operating systems use an internal block size to determine how much
            data to move to and from disk for a single I/O operation. This block
            size is usually equal to the filesystem's block size. For optimal
            disk I/O efficiency, you should select a database page size that is
            equal to the operating system's I/O block size.
          </p>
          <p>
            Essentially, DB performs data transfers based on the database
            page size. That is, it moves data to and from disk a page at a time.
            For this reason, if the page size does not match the I/O block size,
            then the operating system can introduce inefficiencies in how it
            responds to DB's I/O requests.
          </p>
          <p>
            For example, suppose your page size is smaller than your operating
            system block size. In this case, when DB writes a page to disk
            it is writing just a portion of a logical filesystem page. Any time
            any application writes just a portion of a logical filesystem page, the
            operating system brings in the real filesystem page, updates it with
            the portion written by the application, then writes
            the filesystem page back to disk. The net result is significantly
            more disk I/O than if the application had simply selected a page
            size that was equal to the underlying filesystem block size.
          </p>
          <p>
            Alternatively, if you select a page size that is larger than the
            underlying filesystem block size, then the operating system may have
            to read more data than is necessary to fulfill a read request.
            Further, on some operating systems, requesting a single database
            page may result in the operating system reading enough filesystem
            blocks to satisfy the operating system's criteria for read-ahead. In
            this case, the operating system will be reading significantly more
            data from disk than is actually required to fulfill DB's read
            request.
          </p>
          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
            <h3 class="title">Note</h3>
            <p>
              While transactions are not discussed in this manual, a page size
              other than your filesystem's block size can affect transactional
              guarantees. The reason is that page sizes larger than the
              filesystem's block size cause DB to write pages in block
              size increments. As a result, it is possible for a partial page
              to be written as the result of a transactional commit.
              For more
              information, see <a class="ulink" href="http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/reclimit.html" target="_top">http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/reclimit.html</a>.
            </p>
          </div>
        </div>
        <div class="sect2" lang="en" xml:lang="en">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="pagesizeAdvice"></a>Page Sizing Advice</h3>
              </div>
            </div>
          </div>
          <p>
            Page sizing can be confusing at first, so here are some general
            guidelines that you can use to select your page size.
          </p>
          <p>
            In general, and given no other considerations, a page size that is equal
            to your filesystem block size is ideal.
          </p>
          <p>
            If your data is such that 4 database entries cannot fit on a
            single page (assuming BTree), then grow your page size to accommodate
            your data. Once you've abandoned matching your filesystem's block
            size, the general rule is that larger page sizes are better.
          </p>
          <p>
            The exception to this rule is if you have a great deal of
            concurrency occurring in your application. In this case, the closer
            you can match your page size to the ideal size needed for your
            application's data, the better. Doing so will allow you to avoid
            unnecessary contention for page locks.
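          </p>
          <p>
            The guidelines above can be sketched as a small helper. This is a
            minimal illustration, not a definitive sizing algorithm: the class
            and method names are hypothetical, the filesystem block size is
            assumed to be supplied by the caller, and per-page and per-entry
            overhead is ignored, so the result is only a starting point for
            measurement.
          </p>
          <pre class="programlisting">
```java
// Hypothetical helper; not part of the DB API.
class PageSizeAdvisor {
    static final int MIN_PAGE_SIZE = 512;        // smallest page size DB accepts
    static final int MAX_PAGE_SIZE = 64 * 1024;  // largest page size DB accepts
    static final int MIN_ENTRIES_PER_PAGE = 4;   // BTree guideline from this chapter

    // Start at the filesystem block size and double the page size until at
    // least four average-sized entries fit, staying within DB's limits.
    // Assumption: page and entry overhead is ignored.
    static int suggestPageSize(int fsBlockSize, int avgEntrySize) {
        // Round down to a power of 2, then clamp to the legal range.
        int pageSize = Integer.highestOneBit(fsBlockSize);
        pageSize = Math.min(Math.max(pageSize, MIN_PAGE_SIZE), MAX_PAGE_SIZE);

        long needed = (long) MIN_ENTRIES_PER_PAGE * avgEntrySize;
        while (needed > pageSize) {
            if (pageSize == MAX_PAGE_SIZE) {
                break;  // entries this large will use overflow pages regardless
            }
            pageSize *= 2;
        }
        return pageSize;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a 4096-byte filesystem block and three entry sizes.
        System.out.println(suggestPageSize(4096, 100));
        System.out.println(suggestPageSize(4096, 3000));
        System.out.println(suggestPageSize(4096, 100000));
    }
}
```
          </pre>
          <p>
            The suggested value would be passed to
            <code class="methodname">DatabaseConfig.setPageSize()</code> before the
            database is created. In a highly concurrent application, treat the
            result as an upper bound and reduce it if you observe lock contention.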
          </p>
        </div>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a>&#160;</td>
          <td width="20%" align="center">
            <a accesskey="u" href="baseapi.html">Up</a>
          </td>
          <td width="40%" align="right">&#160;<a accesskey="n" href="cachesize.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Secondary Database Example&#160;</td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top">&#160;Selecting the Cache Size</td>
        </tr>
      </table>
    </div>
  </body>
</html>