<!--$Id: put.so,v 1.20 2006/04/24 17:26:33 bostic Exp $-->
<!--Copyright (c) 1997,2008 Oracle.  All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Recoverability and deadlock handling</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></b></td>
<td align=right><a href="/transapp/data_open.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/transapp/atomicity.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Recoverability and deadlock handling</b></p>
<p>The first reason listed for using transactions was recoverability.  Any
logical change to a database may require multiple changes to underlying
data structures.  For example, modifying a record in a Btree may require
leaf and internal pages to split, so a single <a href="/api_c/db_put.html">DB-&gt;put</a> method
call can potentially require that multiple physical database pages be
written.  If only some of those pages are written and then the system
or application fails, the database is left inconsistent and cannot be
used until it has been recovered; that is, until the partially completed
changes have been undone.</p>
<p><i>Write-ahead logging</i> is the term that describes the underlying
implementation that Berkeley DB uses to ensure recoverability.  It means
that before any change is made to a database, information about the
change is written to a database log.  During recovery, the log is read,
and databases are checked to ensure that changes described in the log
for committed transactions appear in the database.  Changes that appear
in the database but are related to aborted or unfinished transactions
in the log are undone.</p>
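<p>The recovery rule described above can be illustrated with a toy model.
The sketch below is not Berkeley DB's recovery code and uses none of its
interfaces: the "database" is a small integer array, and the names
<code>log_rec</code>, <code>committed</code> and <code>recover</code> are
hypothetical.  It only shows the idea that logged changes of committed
transactions are reapplied and logged changes of unfinished transactions
are backed out.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are Berkeley DB interfaces. */
#define NSLOTS 4

enum rec_type { REC_UPDATE, REC_COMMIT };

struct log_rec {
	enum rec_type type;
	int txn;		/* transaction that wrote the record */
	int slot;		/* slot changed (REC_UPDATE only) */
	int before, after;	/* old and new values (REC_UPDATE only) */
};

/* Return 1 if the log holds a commit record for txn, else 0. */
static int
committed(const struct log_rec *wal, size_t n, int txn)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (wal[i].type == REC_COMMIT && wal[i].txn == txn)
			return (1);
	return (0);
}

/*
 * Bring db into agreement with the log: reapply updates of committed
 * transactions in log order, then back out updates of transactions
 * that never committed, newest first.
 */
static void
recover(int *db, const struct log_rec *wal, size_t n)
{
	size_t i;

	/* Redo pass: committed changes must appear in the database. */
	for (i = 0; i < n; i++) {
		if (wal[i].type != REC_UPDATE)
			continue;
		assert(wal[i].slot >= 0 && wal[i].slot < NSLOTS);
		if (committed(wal, n, wal[i].txn))
			db[wal[i].slot] = wal[i].after;
	}

	/* Undo pass: uncommitted changes must not appear. */
	for (i = n; i-- > 0;)
		if (wal[i].type == REC_UPDATE &&
		    !committed(wal, n, wal[i].txn))
			db[wal[i].slot] = wal[i].before;
}
```

<p>In this model, a slot whose committed change never reached the array
ends up with the logged "after" value, and a slot that holds a change from
a transaction with no commit record is restored to the logged "before"
value.</p>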
<p>For recoverability after application or system failure, operations that
modify the database must be protected by transactions.  More
specifically, operations are not recoverable unless a transaction is
begun, each operation is associated with the transaction via the
Berkeley DB interfaces, and the transaction is then successfully committed.  This
is true even if logging is turned on in the database environment.</p>
<p>Here is an example function that updates a record in a database in a
transactionally protected manner.  The function takes a key and a data
item as arguments and then attempts to store them in the database.</p>
<blockquote><pre>int
main(int argc, char *argv[])
{
	extern int optind;
	DB *db_cats, *db_color, *db_fruit;
	DB_ENV *dbenv;
	int ch;
<p>
	while ((ch = getopt(argc, argv, "")) != EOF)
		switch (ch) {
		case '?':
		default:
			usage();
		}
	argc -= optind;
	argv += optind;
<p>
	env_dir_create();
	env_open(&dbenv);
<p>
	/* Open database: Key is fruit class; Data is specific type. */
	db_open(dbenv, &db_fruit, "fruit", 0);
<p>
	/* Open database: Key is a color; Data is an integer. */
	db_open(dbenv, &db_color, "color", 0);
<p>
	/*
	 * Open database:
	 *	Key is a name; Data is: company name, cat breeds.
	 */
	db_open(dbenv, &db_cats, "cats", 1);
<p>
<b>	add_fruit(dbenv, db_fruit, "apple", "yellow delicious");</b>
<p>
	return (0);
}
<p>
<b>int
add_fruit(DB_ENV *dbenv, DB *db, char *fruit, char *name)
{
	DBT key, data;
	DB_TXN *tid;
	int fail, ret, t_ret;
<p>
	/* Initialization. */
	memset(&key, 0, sizeof(key));
	memset(&data, 0, sizeof(data));
	key.data = fruit;
	key.size = strlen(fruit);
	data.data = name;
	data.size = strlen(name);
<p>
	for (fail = 0;;) {
		/* Begin the transaction. */
		if ((ret = dbenv-&gt;txn_begin(dbenv, NULL, &tid, 0)) != 0) {
			dbenv-&gt;err(dbenv, ret, "DB_ENV-&gt;txn_begin");
			exit (1);
		}
<p>
		/* Store the value. */
		switch (ret = db-&gt;put(db, tid, &key, &data, 0)) {
		case 0:
			/* Success: commit the change. */
			if ((ret = tid-&gt;commit(tid, 0)) != 0) {
				dbenv-&gt;err(dbenv, ret, "DB_TXN-&gt;commit");
				exit (1);
			}
			return (0);
		case DB_LOCK_DEADLOCK:
		default:
			/* Abort, then retry unless the retry limit is reached. */
			if ((t_ret = tid-&gt;abort(tid)) != 0) {
				dbenv-&gt;err(dbenv, t_ret, "DB_TXN-&gt;abort");
				exit (1);
			}
			if (fail++ == MAXIMUM_RETRY)
				return (ret);
			break;
		}
	}
}</b></pre></blockquote>
<p>Berkeley DB also uses transactions to recover from deadlock.  Database
operations (that is, any call to a function underlying the handles
returned by <a href="/api_c/db_open.html">DB-&gt;open</a> and <a href="/api_c/db_cursor.html">DB-&gt;cursor</a>) are usually
performed on behalf of a unique locker.  Transactions can be used to
perform multiple calls on behalf of the same locker within a single
thread of control.  For example, consider the case in which an
application uses a cursor scan to locate a record and then the
application accesses another item in the database, based on the
key returned by the cursor, without first closing the cursor.  If these
operations are done using default locker IDs, they may conflict.  If the
locks are obtained on behalf of a transaction, using the transaction's
locker ID instead of the database handle's locker ID, the operations
will not conflict.</p>
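<p>The effect of sharing a locker ID can be seen in a deliberately
simplified model.  The sketch below is not the Berkeley DB lock manager:
<code>lock_table</code> and <code>lock_request</code> are hypothetical
names, and every lock is treated as exclusive, which real locking does not
do.  It shows only that a second request for the same item is granted when
it comes from the same locker and refused when it comes from a different
one.</p>

```c
#include <assert.h>

/* Toy model only: these names are illustrative, not Berkeley DB interfaces. */
#define NITEMS		8
#define NO_LOCKER	0

struct lock_table {
	int owner[NITEMS];	/* locker holding each item, or NO_LOCKER */
};

/*
 * Request an exclusive lock on item for locker.  Returns 1 if the lock
 * is granted (the item is free or already held by the same locker) and
 * 0 if the request conflicts with a lock held by another locker.
 */
static int
lock_request(struct lock_table *lt, int item, int locker)
{
	assert(item >= 0 && item < NITEMS && locker != NO_LOCKER);

	if (lt->owner[item] != NO_LOCKER && lt->owner[item] != locker)
		return (0);
	lt->owner[item] = locker;
	return (1);
}
```

<p>If a cursor and a later operation each use their own locker ID, the
second request in this model is refused; if both run under one
transaction's locker ID, it is granted.</p>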
<p>The add_fruit function above checks for an error return that you may
not have seen before.  In transactional (not Concurrent Data Store) applications
supporting both readers and writers, or just multiple writers, Berkeley DB
functions have an additional possible error return:
<a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>.  This means two threads of control deadlocked,
and the thread receiving the <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> error return has
been selected to discard its locks in order to resolve the problem.
When an application receives a <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> return, the
correct action is to close any cursors involved in the operation and
abort any enclosing transaction.  In the sample code, any time the
<a href="/api_c/db_put.html">DB-&gt;put</a> method returns <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, <a href="/api_c/txn_abort.html">DB_TXN-&gt;abort</a> is
called (which releases the transaction's Berkeley DB resources and undoes any
partial changes to the databases), and then the transaction is retried
from the beginning.</p>
<p>There is no requirement that the transaction be attempted again, but
that is a common course of action for applications.  Applications may
want to set an upper bound on the number of times an operation will be
retried because some operations on some data sets may simply be unable
to succeed.  For example, updating all of the pages on a large Web site
during prime business hours may simply be impossible because of the high
access rate to the database.</p>
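<p>One way to keep the retry bound in a single place is a small helper
that reruns a whole transaction attempt.  The following is a sketch under
stated assumptions rather than a Berkeley DB interface:
<code>retry_bounded</code>, <code>attempt_fn</code> and
<code>flaky_attempt</code> are hypothetical names, and
<code>SKETCH_DEADLOCK</code> is a stand-in for
DB_LOCK_DEADLOCK so that the fragment compiles without db.h.  Each attempt
is assumed to begin its own transaction and to commit or abort it before
returning.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for DB_LOCK_DEADLOCK so the sketch compiles without db.h. */
#define SKETCH_DEADLOCK	(-1)

/* One attempt at the whole transaction: 0 on success, else an error. */
typedef int (*attempt_fn)(void *arg);

/*
 * Call attempt at most max_retry + 1 times, trying again only when the
 * previous attempt reported a deadlock.  Returns the last result.
 */
static int
retry_bounded(attempt_fn attempt, void *arg, int max_retry)
{
	int fail, ret;

	assert(attempt != NULL && max_retry >= 0);
	for (fail = 0;; fail++) {
		ret = attempt(arg);
		if (ret != SKETCH_DEADLOCK || fail == max_retry)
			return (ret);
	}
}

/* Example attempt: reports a deadlock until *remaining reaches zero. */
static int
flaky_attempt(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		--*remaining;
		return (SKETCH_DEADLOCK);
	}
	return (0);
}
```

<p>With a bound of 3, an operation that always deadlocks is attempted four
times and the helper then returns the deadlock error to the caller, which
can report the failure instead of looping indefinitely.</p>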
<p>The <a href="/api_c/txn_abort.html">DB_TXN-&gt;abort</a> method is also called in error cases other than deadlock.
Any time an error occurs, such that a transactionally protected set of
operations cannot complete successfully, the transaction must be
aborted.  While deadlock is by far the most common of these errors,
there are other possibilities; for example, running out of disk space
for the filesystem.  In Berkeley DB transactional applications, there are
three classes of error returns: "expected" errors, "unexpected but
recoverable" errors, and a single "unrecoverable" error.  Expected
errors are errors like <a href="/ref/program/errorret.html#DB_NOTFOUND">DB_NOTFOUND</a>, which indicates that a
searched-for key item is not present in the database.  Applications may
want to explicitly test for and handle this error, or, in the case where
the absence of a key implies the enclosing transaction should fail,
simply call <a href="/api_c/txn_abort.html">DB_TXN-&gt;abort</a>.  Unexpected but recoverable errors are
errors like <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, which indicates that an operation
has been selected to resolve a deadlock, or a system error such as EIO
or ENOSPC, which may indicate a failing device or a filesystem with no
available disk space.
Applications must immediately call <a href="/api_c/txn_abort.html">DB_TXN-&gt;abort</a> when these returns
occur, as it is not possible to proceed otherwise.  The only
unrecoverable error is <a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a>, which indicates that the
system must stop and recovery must be run.</p>
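<p>The three classes can be expressed as a small decision function.  This
is a sketch, not part of Berkeley DB: <code>txn_action</code> and
<code>classify</code> are hypothetical names, and the numeric values given
to the DB_* constants are placeholders used only when db.h has not been
included.  The caller supplies the list of returns it considers expected;
everything else that is neither success nor DB_RUNRECOVERY falls into the
abort class.</p>

```c
#include <assert.h>
#include <stddef.h>

/*
 * Placeholder values so the sketch compiles without db.h.  A real
 * application would include db.h and use the library's definitions.
 */
#ifndef DB_NOTFOUND
#define DB_NOTFOUND		(-3)
#define DB_LOCK_DEADLOCK	(-2)
#define DB_RUNRECOVERY		(-1)
#endif

enum txn_action {
	TXN_CONTINUE,		/* success: carry on */
	TXN_APP_DECIDES,	/* expected error: handle it or abort */
	TXN_ABORT,		/* unexpected but recoverable: abort now */
	TXN_STOP		/* unrecoverable: stop and run recovery */
};

/*
 * Map a return value to an action.  expected lists the error returns
 * the caller is prepared to handle itself (for example DB_NOTFOUND).
 */
static enum txn_action
classify(int ret, const int *expected, size_t nexpected)
{
	size_t i;

	assert(expected != NULL || nexpected == 0);
	if (ret == 0)
		return (TXN_CONTINUE);
	if (ret == DB_RUNRECOVERY)
		return (TXN_STOP);
	for (i = 0; i < nexpected; i++)
		if (ret == expected[i])
			return (TXN_APP_DECIDES);
	/* DB_LOCK_DEADLOCK, system errors, and anything else. */
	return (TXN_ABORT);
}
```

<p>Because the abort class is the default, the function does not need to
enumerate every possible error: only the returns the application chooses
to handle are listed.</p>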
<p>The add_fruit code can be simplified in the case of a transaction consisting
entirely of a single database put or delete operation, as operations
occurring in transactional databases are implicitly transaction
protected.  For example, in a transactional database, the retry loop
could be more simply written as:</p>
<blockquote><pre><b>	for (fail = 0; fail++ &lt;= MAXIMUM_RETRY &&
	    (ret = db-&gt;put(db, NULL, &key, &data, 0)) == DB_LOCK_DEADLOCK;)
		;
	return (ret == 0 ? 0 : 1);</b></pre></blockquote>
<p>and the underlying transaction would be automatically handled by Berkeley DB.</p>
<p>Programmers should not attempt to enumerate all possible error returns
in their software.  Instead, they should explicitly handle expected
returns and default to aborting the transaction for the rest.  It is
entirely the choice of the programmer whether to check for
<a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> explicitly or not -- attempting new Berkeley DB
operations after <a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> is returned does not worsen the
situation.  Alternatively, using the <a href="/api_c/env_event_notify.html">DB_ENV-&gt;set_event_notify</a> method to
handle an unrecoverable error and simply doing some number of
abort-and-retry cycles for any unexpected Berkeley DB or system error in the
mainline code often results in the simplest and cleanest application
code.</p>
<table width="100%"><tr><td><br></td><td align=right><a href="/transapp/data_open.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/transapp/atomicity.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle.  All rights reserved.</font>
</body>
</html>
203