<!--$Id: cachesize.so,v 10.20 2003/02/19 17:41:58 bostic Exp $-->
<!--Copyright (c) 1997,2008 Oracle. All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Selecting a cache size</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<a name="2"><!--meow--></a>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></b></td>
<td align=right><a href="../am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Selecting a cache size</b></p>
<p>The size of the cache used for the underlying database can be specified
by calling the <a href="../../api_c/db_set_cachesize.html">DB-&gt;set_cachesize</a> method.
Choosing a cache size is, unfortunately, an art. Your cache must be at
least large enough for your working set plus some overlap for unexpected
situations.</p>
<p>When using the Btree access method, you must have a cache big enough for
the minimum working set for a single access. This will include a root
page, one or more internal pages (depending on the depth of your tree),
and a leaf page. If your cache is any smaller than that, each new page
will force out the least-recently-used page, and Berkeley DB will re-read the
root page of the tree anew on each database request.</p>
<p>If your keys are of moderate size (a few tens of bytes) and your pages
are on the order of 4KB to 8KB, most Btree applications will be only
three levels.
For example, using 20-byte keys with 20 bytes of data
associated with each key, an 8KB page can hold roughly 400 keys (or 200
key/data pairs), so a fully populated three-level Btree will hold 32
million key/data pairs (400 x 400 x 200), and a tree with only a 50% page-fill factor will
still hold 16 million key/data pairs. We rarely expect trees to exceed
five levels, although Berkeley DB will support trees up to 255 levels.</p>
<p>The rule-of-thumb is that cache is good, and more cache is better.
Generally, applications benefit from increasing the cache size up to a
point, at which the performance will stop improving as the cache size
increases. When this point is reached, one of two things has happened:
either the cache is large enough that the application is almost never
having to retrieve information from disk, or your application is doing
truly random accesses, and therefore increasing the size of the cache doesn't
significantly increase the odds of finding the next requested information
in the cache. The latter is fairly rare -- almost all applications show
some form of locality of reference.</p>
<p>That said, it is important not to increase your cache size beyond the
capabilities of your system, as that will result in reduced performance.
Under many operating systems, tying down enough virtual memory will cause
your memory and potentially your program to be swapped. This is
especially likely on systems without unified OS buffer caches and virtual
memory spaces, as the buffer cache is allocated at boot time and so
cannot be adjusted based on application requests for large amounts of
virtual memory.</p>
<p>For example, even if accesses are truly random within a Btree, your
access pattern will favor internal pages to leaf pages, so your cache
should be large enough to hold all internal pages.
In the steady state,
this requires at most one I/O per operation to retrieve the appropriate
leaf page.</p>
<p>You can use the <a href="../../utility/db_stat.html">db_stat</a> utility to monitor the effectiveness of
your cache. The following output is excerpted from the output of that
utility's <b>-m</b> option:</p>
<blockquote><pre>prompt: db_stat -m
131072 Cache size (128K).
4273 Requested pages found in the cache (97%).
134 Requested pages not found in the cache.
18 Pages created in the cache.
116 Pages read into the cache.
93 Pages written from the cache to the backing file.
5 Clean pages forced from the cache.
13 Dirty pages forced from the cache.
0 Dirty buffers written by trickle-sync thread.
130 Current clean buffer count.
4 Current dirty buffer count.
</pre></blockquote>
<p>The statistics for this cache say that there have been 4,407 requests of
the cache (4,273 found in the cache plus 134 not found), and only 116 pages
had to be read from disk. This
means that the cache is working well, yielding a 97% cache hit rate. The
<a href="../../utility/db_stat.html">db_stat</a> utility will present these statistics both for the cache
as a whole and for each file within the cache separately.</p>
<table width="100%"><tr><td><br></td><td align=right><a href="../am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle. All rights reserved.</font>
</body>
</html>