<!--$Id: tune.so,v 11.25 2008/01/24 00:38:39 sarette Exp $-->
<!--Copyright (c) 1997,2008 Oracle.  All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Transaction tuning</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<a name="2"><!--meow--></a><a name="3"><!--meow--></a>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></b></td>
<td align=right><a href="/transapp/reclimit.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/transapp/throughput.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Transaction tuning</b></p>
<p>Several issues affect the performance of Berkeley DB transactional
applications.  First, review
<a href="/ref/am_misc/tune.html">Access method tuning</a>, because the
tuning issues for access method applications apply to transactional
applications as well.  The following additional tuning issues are
specific to Berkeley DB transactional applications:</p>
<br>
<b>access method</b><ul compact><li>Highly concurrent applications should use the Queue access method where
possible, as it provides finer-grained locking than the other access
methods.  Otherwise, applications usually see better concurrency when
using the Btree access method than when using either the Hash or Recno
access methods.</ul>
<b>record numbers</b><ul compact><li>Using record numbers outside of the Queue access method often slows
down concurrent applications, because record numbers limit the degree
of concurrency available in the database.  This applies both to the
Recno access method and to the Btree access method configured for
retrieval by record number.</ul>
<b>Btree database size</b><ul compact><li>When using the Btree access method, applications supporting concurrent
access may see excessive numbers of deadlocks in small databases.  There
are two approaches to resolving this problem.  First, because the Btree
access method uses page-level locking, decreasing the database page
size spreads the data across more pages and can result in fewer lock
conflicts.  Second, in the case of databases that cyclically grow and
shrink, turning off reverse splits (with <a href="/api_c/db_set_flags.html#DB_REVSPLITOFF">DB_REVSPLITOFF</a>) can leave the database with enough
pages that there will be fewer lock conflicts.</ul>
<b>read locks</b><ul compact><li>Performing all read operations outside of transactions or at
<a href="/ref/transapp/read.html">snapshot isolation</a> can often
significantly increase application throughput.  In addition, limiting
the lifetime of non-transactional cursors reduces the length of time
locks are held, thereby improving concurrency.</ul>
<b><a href="/api_c/env_set_flags.html#DB_DIRECT_DB">DB_DIRECT_DB</a>, <a href="/api_c/env_log_set_config.html#DB_LOG_DIRECT">DB_LOG_DIRECT</a></b><ul compact><li>Consider using the <a href="/api_c/env_set_flags.html#DB_DIRECT_DB">DB_DIRECT_DB</a> and <a href="/api_c/env_log_set_config.html#DB_LOG_DIRECT">DB_LOG_DIRECT</a> flags.
On some systems, avoiding caching in the operating system can improve
write throughput and allow the creation of larger Berkeley DB caches.</ul>
<b><a href="/api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a>, <a href="/api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a></b><ul compact><li>Consider decreasing the isolation level of transactions by using the
<a href="/api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> or <a href="/api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a> flags for
transactions or cursors, or the <a href="/api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> flag on
individual read operations.  The <a href="/api_c/db_cursor.html#DB_READ_COMMITTED">DB_READ_COMMITTED</a> flag
releases read locks on cursors as soon as the data page is no longer
referenced.  This is also called <i>degree 2 isolation</i>.  It
tends to block write operations for shorter periods, and suits applications
that do not need repeatable reads for cursor operations.
<p>The <a href="/api_c/db_open.html#DB_READ_UNCOMMITTED">DB_READ_UNCOMMITTED</a> flag allows read operations to
potentially return data that has been modified but not yet committed,
and can significantly increase application throughput in applications
that do not require data be guaranteed to be permanent in the database.
This is also called <i>degree 1 isolation</i>, or <i>dirty
reads</i>.</p></ul>
<b><a href="/api_c/dbc_get.html#DB_RMW">DB_RMW</a></b><ul compact><li>If there are many deadlocks, consider using the <a href="/api_c/dbc_get.html#DB_RMW">DB_RMW</a> flag to
immediately acquire write locks when reading data items that will
subsequently be modified.  Although this flag may increase contention
(because write locks are held longer than they would otherwise be), it
may decrease the number of deadlocks that occur.</ul>
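<p>The kind of deadlock that DB_RMW avoids can be illustrated with a small
model.  The following C sketch is not Berkeley DB code: the
<code>item_lock</code> structure and the <code>lock_read</code>,
<code>lock_write</code> and <code>read_then_write_deadlocks</code>
functions are hypothetical names invented for this illustration, and the
model assumes exactly two transactions contending for a single data
item.</p>

```c
#include <assert.h>
#include <stdbool.h>

#define NTXN 2
#define NO_WRITER (-1)

/* Lock state for a single data item shared by NTXN transactions. */
struct item_lock {
    bool reader[NTXN];  /* reader[t]: transaction t holds a read lock */
    int writer;         /* holder of the write lock, or NO_WRITER */
};

/* Read lock: compatible with other readers, not with another writer. */
static bool lock_read(struct item_lock *l, int txn)
{
    assert(txn >= 0 && txn < NTXN);
    if (l->writer != NO_WRITER && l->writer != txn)
        return false;
    l->reader[txn] = true;
    return true;
}

/* Write lock: granted only if no other transaction holds any lock. */
static bool lock_write(struct item_lock *l, int txn)
{
    assert(txn >= 0 && txn < NTXN);
    if (l->writer != NO_WRITER && l->writer != txn)
        return false;
    for (int t = 0; t < NTXN; t++)
        if (t != txn && l->reader[t])
            return false;
    l->writer = txn;
    return true;
}

/*
 * Both transactions read the item and then try to modify it.  When rmw
 * is true, the read acquires the write lock up front.  Returns true if
 * both transactions end up blocked on each other (a deadlock).
 */
bool read_then_write_deadlocks(bool rmw)
{
    struct item_lock l = { { false, false }, NO_WRITER };
    bool blocked[NTXN];

    /* Step 1: each transaction performs its read. */
    for (int t = 0; t < NTXN; t++)
        blocked[t] = rmw ? !lock_write(&l, t) : !lock_read(&l, t);

    /* Step 2: each transaction still running tries to write. */
    for (int t = 0; t < NTXN; t++)
        if (!blocked[t])
            blocked[t] = !lock_write(&l, t);

    return blocked[0] && blocked[1];
}
```

<p>In this model, plain reads let both transactions take read locks, and
each then blocks trying to upgrade past the other's read lock.  With the
write lock taken at read time, the second transaction simply waits for
the first to finish, which is the added contention the text above
describes.</p>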
<b><a href="/api_c/env_set_flags.html#DB_TXN_WRITE_NOSYNC">DB_TXN_WRITE_NOSYNC</a>, <a href="/api_c/env_set_flags.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a></b><ul compact><li>By default, transactional commit in Berkeley DB implies durability; that is,
all committed operations will be present in the database after recovery
from any application or system failure.  For applications not requiring
that level of certainty, specifying the <a href="/api_c/env_set_flags.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag will
often provide a significant performance improvement.  In this case, the
database will still be fully recoverable, but some number of committed
transactions might be lost after application or system failure.  The
<a href="/api_c/env_set_flags.html#DB_TXN_WRITE_NOSYNC">DB_TXN_WRITE_NOSYNC</a> flag is a middle ground: the log is written to
the operating system at commit but not flushed to disk, so committed
transactions can be lost after a system failure but not after an
application failure.</ul>
<b>access databases in order</b><ul compact><li>When modifying multiple databases in a single transaction, always access
physical files, and databases within physical files, in the same order
where possible.  In addition, avoid returning to a physical file or
database; that is, avoid accessing a database, moving on to another
database, and then returning to the first database.  This can
significantly reduce the chance of deadlock between threads of
control.</ul>
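<p>One way to enforce a consistent access order is to sort the list of
databases a transaction will touch before touching any of them.  The
following C sketch assumes the application tracks its targets by name;
the <code>db_target</code> structure and the <code>order_targets</code>
function are hypothetical and are not part of the Berkeley DB API.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for one database touched by a transaction. */
struct db_target {
    const char *file;      /* physical file name */
    const char *database;  /* database name within that file */
};

/* Order by physical file first, then by database within the file. */
static int target_cmp(const void *a, const void *b)
{
    const struct db_target *x = a;
    const struct db_target *y = b;
    int c = strcmp(x->file, y->file);

    return c != 0 ? c : strcmp(x->database, y->database);
}

/*
 * Sort the targets into one canonical order.  Threads that visit their
 * databases in this order reach shared databases in the same sequence,
 * and finish with one physical file before moving on to the next.
 */
void order_targets(struct db_target *targets, size_t n)
{
    assert(targets != NULL);
    qsort(targets, n, sizeof(targets[0]), target_cmp);
}
```

<p>Any total order works, provided every thread of control uses the same
one; sorting by name is only the simplest choice to illustrate.</p>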
<b>large key/data items</b><ul compact><li>Transactional protections in Berkeley DB are guaranteed by before-and-after
physical image logging.  This means applications modifying large
key/data items also write large log records, and, in the case of the
default transaction commit, threads of control must wait until those
log records have been flushed to disk.  Applications supporting
concurrent access should try to keep key/data items small wherever
possible.</ul>
<b>mutex selection</b><ul compact><li>During configuration, Berkeley DB selects a mutex implementation for the
architecture.  Berkeley DB normally prefers blocking-mutex implementations over
non-blocking ones.  For example, Berkeley DB will select POSIX pthread mutex
interfaces rather than assembly-code test-and-set spin mutexes because
pthread mutexes are usually more efficient and less likely to waste CPU
cycles spinning without getting any work accomplished.
<p>For some applications and systems (generally highly concurrent
applications on large multiprocessor systems), Berkeley DB makes the wrong
choice.  In some cases, better performance can be achieved by
configuring with the
<a href="/ref/build_unix/conf.html#--with-mutex">--with-mutex</a>
argument and selecting a mutex implementation different from the one
selected by Berkeley DB.  When a test-and-set spin mutex implementation is
selected, it may be useful to tune the number of spins made before
yielding the processor and sleeping.  For more information, see the
<a href="/api_c/mutex_set_tas_spins.html">DB_ENV-&gt;mutex_set_tas_spins</a> method.</p>
<p>Finally, Berkeley DB may put multiple mutexes on individual cache lines.  When
tuning Berkeley DB for large multiprocessor systems, it may be useful to tune
mutex alignment using the
<a href="/api_c/mutex_set_align.html">DB_ENV-&gt;mutex_set_align</a> method.</p></ul>
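<p>To make the two tuning knobs concrete, the following C sketch shows a
generic test-and-set mutex that spins a configurable number of times
before yielding the processor, and that is padded so each mutex occupies
its own cache line.  It is not Berkeley DB's implementation: the
<code>tas_mutex</code> type and <code>tas_*</code> functions are
hypothetical, the 64-byte <code>CACHE_LINE</code> value is an assumption
about the hardware, and the sketch only yields rather than sleeping.</p>

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

#define CACHE_LINE 64  /* assumed cache-line size; check your hardware */

/* Aligning the flag pads the struct so each mutex has its own line. */
struct tas_mutex {
    _Alignas(CACHE_LINE) atomic_flag locked;
};

/* Put the mutex into the unlocked state. */
void tas_init(struct tas_mutex *m)
{
    atomic_flag_clear(&m->locked);
}

/* One test-and-set attempt; returns 1 if the mutex was acquired. */
int tas_trylock(struct tas_mutex *m)
{
    return !atomic_flag_test_and_set_explicit(&m->locked,
        memory_order_acquire);
}

/*
 * Spin up to `spins` times, then yield the processor and start over.
 * Returns the number of times the caller had to yield.
 */
unsigned tas_lock(struct tas_mutex *m, unsigned spins)
{
    unsigned yields = 0;

    assert(spins > 0);
    for (;;) {
        for (unsigned i = 0; i < spins; i++)
            if (tas_trylock(m))
                return yields;
        sched_yield();
        yields++;
    }
}

/* Release the mutex. */
void tas_unlock(struct tas_mutex *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}
```

<p>A larger spin count favors short critical sections on systems with
many processors, while a smaller one gives up the processor sooner when
the holder is unlikely to release the mutex quickly.</p>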
<b><a href="/ref/build_unix/conf.html#--enable-posixmutexes">--enable-posixmutexes</a></b><ul compact><li>By default, the Berkeley DB library will only select the POSIX pthread mutex
implementation if it supports mutexes shared between multiple processes.
If your application does not share its database environment between
processes and your system's POSIX mutex support was not selected because
it did not support inter-process mutexes, you may be able to increase
performance and transactional throughput by configuring with the
<a href="/ref/build_unix/conf.html#--enable-posixmutexes">--enable-posixmutexes</a> argument.</ul>
<b>log buffer size</b><ul compact><li>Berkeley DB internally maintains a buffer of log writes.  By default, the
buffer is written to disk at transaction commit or whenever it is
filled.  If it is consistently being filled before transaction
commit, it will be written multiple times per transaction, costing
application performance.  In these cases, increasing the size of the
log buffer can increase application throughput.</ul>
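<p>The effect of the buffer size can be estimated with simple arithmetic.
The following C sketch uses a deliberately simplified model: it assumes
the buffer is empty when the transaction starts and that no other
transaction shares it.  The <code>log_writes_per_txn</code> function is a
hypothetical helper, not a Berkeley DB call; the buffer size itself is
set with the DB_ENV-&gt;set_lg_bsize method.</p>

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate how many times the log buffer is written to disk for one
 * transaction that generates txn_log_bytes of log records: once for
 * each time the buffer fills, plus the write at commit for whatever
 * remains.  This is the byte count divided by the buffer size,
 * rounded up.
 */
size_t log_writes_per_txn(size_t txn_log_bytes, size_t log_buffer_bytes)
{
    assert(log_buffer_bytes > 0);
    if (txn_log_bytes == 0)
        return 0;
    return (txn_log_bytes - 1) / log_buffer_bytes + 1;
}
```

<p>Under these assumptions, a transaction that generates 100KB of log
records costs four writes with a 32KB buffer and a single write with a
128KB buffer.</p>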
<b>log file location</b><ul compact><li>If the database environment's log files are on the same disk as the
databases, the disk arms will have to seek back and forth between the
two.  Placing the log files and the databases on different disk arms
can often increase application throughput.</ul>
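<p>One way to relocate the log files is a <code>set_lg_dir</code> entry in
the environment's DB_CONFIG file; the DB_ENV-&gt;set_lg_dir method is the
programmatic equivalent.  A minimal sketch, in which the path
<code>/disk2/myapp/logs</code> is a hypothetical directory on a separate
device:</p>

```
set_lg_dir /disk2/myapp/logs
```

<p>Placing the entry in DB_CONFIG rather than in application code means
that utilities opening the same environment home pick up the same log
directory.</p>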
<b>trickle write</b><ul compact><li>In some applications, the cache is sufficiently active and dirty that
readers frequently need to write a dirty page in order to have space in
which to read a new page from the backing database file.  You can use
the <a href="/utility/db_stat.html">db_stat</a> utility (or the statistics returned by the
<a href="/api_c/memp_stat.html">DB_ENV-&gt;memp_stat</a> method) to see how often this is happening in your
application's cache.  If it happens frequently, using a separate thread of control
and the <a href="/api_c/memp_trickle.html">DB_ENV-&gt;memp_trickle</a> method to trickle-write pages can often increase
the overall throughput of the application.</ul>
<br>
<table width="100%"><tr><td><br></td><td align=right><a href="/transapp/reclimit.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/transapp/throughput.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle.  All rights reserved.</font>
</body>
</html>