<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>What is Berkeley DB?</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
    <link rel="up" href="intro.html" title="Chapter 1. Introduction" />
    <link rel="prev" href="intro_terrain.html" title="Mapping the terrain: theory and practice" />
    <link rel="next" href="intro_dbisnot.html" title="What Berkeley DB is not" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">What is Berkeley DB?</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="intro_terrain.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 1.
          Introduction
          </th>
          <td width="20%" align="right"> <a accesskey="n" href="intro_dbisnot.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="intro_dbis"></a>What is Berkeley DB?</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="intro_dbis.html#id1587756">Data Access Services</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="intro_dbis.html#id1588499">Data management services</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="intro_dbis.html#id1588295">Design</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>So far, we've discussed database systems in general terms.
It's time
now to consider Berkeley DB in particular and see how it fits into the
framework we have introduced. The key question is, what kinds of
applications should use Berkeley DB?</p>
      <p>Berkeley DB is an Open Source embedded database library that provides
scalable, high-performance, transaction-protected data management
services to applications. Berkeley DB provides a simple function-call API for
data access and management.</p>
      <p>By "Open Source," we mean Berkeley DB is distributed under a license that
conforms to the <a class="ulink" href="http://www.opensource.org/osd.html" target="_top">Open
Source Definition</a>. This license guarantees Berkeley DB is freely available
for use and redistribution in other Open Source applications. Oracle
Corporation sells commercial licenses allowing the redistribution of
Berkeley DB in proprietary applications. In all cases the complete source
code for Berkeley DB is freely available for download and use.</p>
      <p>Berkeley DB is "embedded" because it links directly into the application. It
runs in the same address space as the application. As a result, no
inter-process communication, either over the network or between
processes on the same machine, is required for database operations.
Berkeley DB provides a simple function-call API for a number of programming
languages, including C, C++, Java, Perl, Tcl, Python, and PHP. All
database operations happen inside the library. Multiple processes, or
multiple threads in a single process, can all use the database at the
same time as each uses the Berkeley DB library. Low-level services like
locking, transaction logging, shared buffer management, memory
management, and so on are all handled transparently by the library.</p>
      <p>The Berkeley DB library is extremely portable. It runs under almost all UNIX
and Linux variants, Windows, and a number of embedded real-time
operating systems. It runs on both 32-bit and 64-bit systems.
It has
been deployed on high-end Internet servers, on desktop machines, and in
palmtop computers, set-top boxes, network switches, and elsewhere.
Once Berkeley DB is linked into the application, the end user generally does
not know that there's a database present at all.</p>
      <p>Berkeley DB is scalable in a number of respects. The database library itself
is quite compact (under 300 kilobytes of text space on common
architectures), but it can manage databases up to 256 terabytes in size.
It also supports high concurrency, with thousands of users operating on
the same database at the same time. Berkeley DB is small enough to run in
tightly constrained embedded systems, but can take advantage of
gigabytes of memory and terabytes of disk on high-end server machines.</p>
      <p>Berkeley DB generally outperforms relational and object-oriented database
systems in embedded applications for two reasons. First, because
the library runs in the same address space, no inter-process
communication is required for database operations. The cost of
communicating between processes on a single machine, or among machines
on a network, is much higher than the cost of making a function call.
Second, because Berkeley DB uses a simple function-call interface for all
operations, there is no query language to parse, and no execution plan
to produce.</p>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="id1587756"></a>Data Access Services</h3>
            </div>
          </div>
        </div>
        <p>Berkeley DB applications can choose the storage structure that best suits the
application. Berkeley DB supports hash tables, Btrees, simple
record-number-based storage, and persistent queues.
Programmers can
create tables using any of these storage structures, and can mix
operations on different kinds of tables in a single application.</p>
        <p>Hash tables are generally good for very large databases that need
predictable search and update times for random-access records. Hash
tables allow users to ask, "Does this key exist?" or to fetch a record
with a known key. Hash tables do not allow users to ask for records
with keys that are close to a known key.</p>
        <p>Btrees are better for range-based searches, as when the application
needs to find all records with keys between some starting and ending
value. Btrees also do a better job of exploiting <span class="emphasis"><em>locality
of reference</em></span>. If the application is likely to touch keys near each
other at the same time, Btrees work well. The tree structure keeps
keys that are close together near one another in storage, so fetching
nearby values usually doesn't require a disk access.</p>
        <p>Record-number-based storage is natural for applications that need to
store and fetch records, but that do not have a simple way to generate
keys of their own. In a record number table, the record number is the
key for the record. Berkeley DB will generate these record numbers
automatically.</p>
        <p>Queues are well-suited for applications that create records, and then
must deal with those records in creation order. A good example is
an on-line purchasing system. Orders can enter the system at any time,
but should generally be filled in the order in which they were placed.</p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="id1588499"></a>Data management services</h3>
            </div>
          </div>
        </div>
        <p>Berkeley DB offers important data management services, including concurrency,
transactions, and recovery.
All of these services work on all of the
storage structures.</p>
        <p>Many users can work on the same database concurrently. Berkeley DB handles
locking transparently, ensuring that two users working on the same
record do not interfere with one another.</p>
        <p>The library provides strict ACID transaction semantics by default.
However, applications are allowed to relax the isolation guarantees
the database system makes.</p>
        <p>Multiple operations can be grouped into a single transaction, and can
be committed or rolled back atomically. Berkeley DB uses a technique called
<span class="emphasis"><em>two-phase locking</em></span> to be sure that concurrent transactions
are isolated from one another, and a technique called
<span class="emphasis"><em>write-ahead logging</em></span> to guarantee that committed changes
survive application, system, or hardware failures.</p>
        <p>When an application starts up, it can ask Berkeley DB to run recovery.
Recovery restores the database to a clean state, with all committed
changes present, even after a crash. The database is guaranteed to be
consistent and all committed changes are guaranteed to be present when
recovery completes.</p>
        <p>An application can specify, when it starts up, which data management
services it will use. Some applications need fast, single-user,
non-transactional Btree data storage. In that case, the application can
disable the locking and transaction systems, and will not incur the
overhead of locking or logging. If an application needs to support
multiple concurrent users, but doesn't need transactions, it can turn
on locking without transactions.
Applications that need concurrent,
transaction-protected database access can enable all of the
subsystems.</p>
        <p>In all these cases, the application uses the same function-call API to
fetch and update records.</p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="id1588295"></a>Design</h3>
            </div>
          </div>
        </div>
        <p>Berkeley DB was designed to provide industrial-strength database services to
application developers, without requiring them to become database
experts. It is a classic C-library style <span class="emphasis"><em>toolkit</em></span>, providing
a broad base of functionality to application writers. Berkeley DB was designed
by programmers, for programmers: its modular design surfaces simple,
orthogonal interfaces to core services, and it provides mechanism (for
example, good thread support) without imposing policy (for example, the
use of threads is not required). Just as importantly, Berkeley DB allows
developers to balance performance against the need for crash recovery
and concurrent use. An application can use the storage structure that
provides the fastest access to its data and can request only the degree
of logging and locking that it needs.</p>
        <p>Because of the tool-based approach and separate interfaces for each
Berkeley DB subsystem, you can support a complete transaction environment for
other system operations. Berkeley DB even allows you to wrap transactions
around the standard UNIX file read and write operations! Further, Berkeley DB
was designed to interact correctly with the native system's toolset, a
feature no other database package offers.
For example, Berkeley DB supports
hot backups (database backups while the database is in use) using
standard UNIX system utilities such as dump, tar, cpio, pax, or
even cp.</p>
        <p>Finally, because scripting language interfaces are available for Berkeley DB
(notably Tcl and Perl), application writers can build incredibly powerful
database engines with little effort. You can build transaction-protected
database applications using your favorite scripting languages, an
increasingly important feature in a world using CGI scripts to deliver
HTML.</p>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="intro_terrain.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="intro.html">Up</a>
          </td>
          <td width="40%" align="right"> <a accesskey="n" href="intro_dbisnot.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Mapping the terrain: theory and practice </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> What Berkeley DB is not</td>
        </tr>
      </table>
    </div>
  </body>
</html>