<!--$Id: tune.so,v 11.25 2008/01/24 00:38:39 sarette Exp $-->
<!--Copyright (c) 1997,2008 Oracle. All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Transaction tuning</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<a name="2"><!--meow--></a><a name="3"><!--meow--></a>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></b></td>
<td align=right><a href="../transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../transapp/throughput.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Transaction tuning</b></p>
<p>There are a few different issues to consider when tuning the performance
of Berkeley DB transactional applications. First, you should review
<a href="../../ref/am_misc/tune.html">Access method tuning</a>, as the
tuning issues for access method applications are applicable to
transactional applications as well. The following are additional tuning
issues for Berkeley DB transactional applications:</p>
<br>
<b>access method</b><ul compact><li>Highly concurrent applications should use the Queue access method, where
possible, as it provides finer-grained locking than the other
access methods.
Otherwise, applications usually see better concurrency
when using the Btree access method than when using either the Hash or
Recno access methods.</ul>
<b>record numbers</b><ul compact><li>Using record numbers outside of the Queue access method will often slow
down concurrent applications, as they limit the degree of concurrency
available in the database. Using the Recno access method, or the Btree
access method with retrieval by record number configured, can slow
applications down.</ul>
<b>Btree database size</b><ul compact><li>When using the Btree access method, applications supporting concurrent
access may see excessive numbers of deadlocks in small databases. There
are two different approaches to resolving this problem. First, as the
Btree access method uses page-level locking, decreasing the database
page size can result in fewer lock conflicts. Second, in the case of
databases that are cyclically growing and shrinking, turning off reverse
splits (with <a href="../../api_c/db_set_flags.html#DB_REVSPLITOFF">DB_REVSPLITOFF</a>) can leave the database with enough
pages that there will be fewer lock conflicts.</ul>
<b>read locks</b><ul compact><li>Performing all read operations outside of transactions or at
<a href="../../ref/transapp/read.html">snapshot isolation</a> can often
significantly increase application throughput. In addition, limiting
the lifetime of non-transactional cursors will reduce the length of
time locks are held, thereby improving concurrency.</ul>
<b><a href="../../api_c/env_set_flags.html#DB_DIRECT_DB">DB_DIRECT_DB</a>, <a href="../../api_c/env_log_set_config.html#DB_LOG_DIRECT">DB_LOG_DIRECT</a></b><ul compact><li>Consider using the <a href="../../api_c/env_set_flags.html#DB_DIRECT_DB">DB_DIRECT_DB</a> and <a href="../../api_c/env_log_set_config.html#DB_LOG_DIRECT">DB_LOG_DIRECT</a> flags.
On some systems, avoiding caching in the operating system can improve
write throughput and allow the creation of larger Berkeley DB caches.</ul>
<b><a href="../../api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a>, <a href="../../api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a></b><ul compact><li>Consider decreasing the isolation level of transactions using the
<a href="../../api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> or <a href="../../api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a> flags for
transactions or cursors, or the <a href="../../api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> flag on
individual read operations. The <a href="../../api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a> flag will
release read locks on cursors as soon as the data page is no longer
referenced. This is also called <i>degree 2 isolation</i>. This
will tend to block write operations for shorter periods for applications
that do not need to have repeatable reads for cursor operations.
<p>The <a href="../../api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> flag will allow read operations to
potentially return data which has been modified but not yet committed,
and can significantly increase application throughput in applications
that do not require data be guaranteed to be permanent in the database.
This is also called <i>degree 1 isolation</i>, or <i>dirty
reads</i>.</p></ul>
<b><a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a></b><ul compact><li>If there are many deadlocks, consider using the <a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a> flag to
immediately acquire write locks when reading data items that will
subsequently be modified.
Although this flag may increase contention
(because write locks are held longer than they would otherwise be), it
may decrease the number of deadlocks that occur.</ul>
<b><a href="../../api_c/env_set_flags.html#DB_TXN_WRITE_NOSYNC">DB_TXN_WRITE_NOSYNC</a>, <a href="../../api_c/env_set_flags.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a></b><ul compact><li>By default, transactional commit in Berkeley DB implies durability, that is,
all committed operations will be present in the database after recovery
from any application or system failure. For applications not requiring
that level of certainty, specifying the <a href="../../api_c/env_set_flags.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag will
often provide a significant performance improvement. In this case, the
database will still be fully recoverable, but some number of committed
transactions might be lost after application or system failure.</ul>
<b>access databases in order</b><ul compact><li>When modifying multiple databases in a single transaction, always access
physical files, and databases within physical files, in the same order
where possible. In addition, avoid returning to a physical file or
database, that is, avoid accessing a database, moving on to another
database and then returning to the first database. Following these rules can
significantly reduce the chance of deadlock between threads of
control.</ul>
<b>large key/data items</b><ul compact><li>Transactional protections in Berkeley DB are guaranteed by before and after
physical image logging. This means applications modifying large
key/data items also write large log records, and, in the case of the
default transaction commit, threads of control must wait until those
log records have been flushed to disk.
Applications supporting
concurrent access should try to keep key/data items small wherever
possible.</ul>
<b>mutex selection</b><ul compact><li>During configuration, Berkeley DB selects a mutex implementation for the
architecture. Berkeley DB normally prefers blocking-mutex implementations over
non-blocking ones. For example, Berkeley DB will select POSIX pthread mutex
interfaces rather than assembly-code test-and-set spin mutexes because
pthread mutexes are usually more efficient and less likely to waste CPU
cycles spinning without getting any work accomplished.
<p>For some applications and systems (generally highly concurrent
applications on large multiprocessor systems), Berkeley DB makes the wrong
choice. In some cases, better performance can be achieved by
configuring with the
<a href="../../ref/build_unix/conf.html#--with-mutex">--with-mutex</a>
argument and selecting a different mutex implementation than the one
selected by Berkeley DB. When a test-and-set spin mutex implementation is
selected, it may be useful to tune the number of spins made before
yielding the processor and sleeping. For more information, see the
<a href="../../api_c/mutex_set_tas_spins.html">DB_ENV->mutex_set_tas_spins</a> method.</p>
<p>Finally, Berkeley DB may put multiple mutexes on individual cache lines. When
tuning Berkeley DB for large multiprocessor systems, it may be useful to tune
mutex alignment using the
<a href="../../api_c/mutex_set_align.html">DB_ENV->mutex_set_align</a> method.</p></ul>
<b><a href="../../ref/build_unix/conf.html#--enable-posixmutexes">--enable-posixmutexes</a></b><ul compact><li>By default, the Berkeley DB library will only select the POSIX pthread mutex
implementation if it supports mutexes shared between multiple processes.
If your application does not share its database environment between
processes and your system's POSIX mutex support was not selected because
it did not support inter-process mutexes, you may be able to increase
performance and transactional throughput by configuring with the
<a href="../../ref/build_unix/conf.html#--enable-posixmutexes">--enable-posixmutexes</a> argument.</ul>
<b>log buffer size</b><ul compact><li>Berkeley DB internally maintains a buffer of log writes. By default, the
buffer is written to disk at transaction commit or whenever it
is filled. If it is consistently being filled before transaction
commit, it will be written multiple times per transaction, reducing
application performance. In these cases, increasing the size of the
log buffer can increase application throughput.</ul>
<b>log file location</b><ul compact><li>If the database environment's log files are on the same disk as the
databases, the disk arms will have to seek back and forth between the
two. Placing the log files and the databases on different disk arms
can often increase application throughput.</ul>
<b>trickle write</b><ul compact><li>In some applications, the cache is sufficiently active and dirty that
readers frequently need to write a dirty page in order to have space in
which to read a new page from the backing database file. You can use
the <a href="../../utility/db_stat.html">db_stat</a> utility (or the statistics returned by the
<a href="../../api_c/memp_stat.html">DB_ENV->memp_stat</a> method) to see how often this is happening in your
application's cache.
If it is happening frequently, using a separate thread of control
and the <a href="../../api_c/memp_trickle.html">DB_ENV->memp_trickle</a> method to trickle-write pages can often increase
the overall throughput of the application.</ul>
<br>
<table width="100%"><tr><td><br></td><td align=right><a href="../transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../transapp/throughput.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle. All rights reserved.</font>
</body>
</html>