<!--$Id: put.so,v 1.20 2006/04/24 17:26:33 bostic Exp $-->
<!--Copyright (c) 1997,2008 Oracle. All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Recoverability and deadlock handling</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></b></td>
<td align=right><a href="/transapp/data_open.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/transapp/atomicity.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Recoverability and deadlock handling</b></p>
<p>The first reason listed for using transactions was recoverability. Any
logical change to a database may require multiple changes to underlying
data structures. For example, modifying a record in a Btree may require
leaf and internal pages to split, so a single <a href="/api_c/db_put.html">DB->put</a> method
call can potentially require that multiple physical database pages be
written. If only some of those pages are written and then the system
or application fails, the database is left inconsistent and cannot be
used until it has been recovered; that is, until the partially completed
changes have been undone.</p>
<p><i>Write-ahead-logging</i> is the term that describes the underlying
implementation that Berkeley DB uses to ensure recoverability. What it means
is that before any change is made to a database, information about the
change is written to a database log.
During recovery, the log is read,
and databases are checked to ensure that changes described in the log
for committed transactions appear in the database. Changes that appear
in the database but belong to aborted or unfinished transactions
in the log are undone from the database.</p>
<p>For recoverability after application or system failure, operations that
modify the database must be protected by transactions. More
specifically, operations are not recoverable unless a transaction is
begun, each operation is associated with the transaction via the
Berkeley DB interfaces, and the transaction is then successfully committed. This
is true even if logging is turned on in the database environment.</p>
<p>Here is an example function that updates a record in a database in a
transactionally protected manner. The function takes a key and a data
item as arguments and then attempts to store them into the database.</p>
<blockquote><pre>int
main(int argc, char *argv[])
{
    extern int optind;
    DB *db_cats, *db_color, *db_fruit;
    DB_ENV *dbenv;
    int ch;
<p>
    while ((ch = getopt(argc, argv, "")) != EOF)
        switch (ch) {
        case '?':
        default:
            usage();
        }
    argc -= optind;
    argv += optind;
<p>
    env_dir_create();
    env_open(&dbenv);
<p>
    /* Open database: Key is fruit class; Data is specific type. */
    db_open(dbenv, &db_fruit, "fruit", 0);
<p>
    /* Open database: Key is a color; Data is an integer. */
    db_open(dbenv, &db_color, "color", 0);
<p>
    /*
     * Open database:
     *    Key is a name; Data is: company name, cat breeds.
     */
    db_open(dbenv, &db_cats, "cats", 1);
<p>
<b>    add_fruit(dbenv, db_fruit, "apple", "yellow delicious");</b>
<p>
    return (0);
}
<p>
<b>int
add_fruit(DB_ENV *dbenv, DB *db, char *fruit, char *name)
{
    DBT key, data;
    DB_TXN *tid;
    int fail, ret, t_ret;
<p>
    /* Initialization.
     */
    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data = fruit;
    key.size = strlen(fruit);
    data.data = name;
    data.size = strlen(name);
<p>
    for (fail = 0;;) {
        /* Begin the transaction. */
        if ((ret = dbenv->txn_begin(dbenv, NULL, &tid, 0)) != 0) {
            dbenv->err(dbenv, ret, "DB_ENV->txn_begin");
            exit (1);
        }
<p>
        /* Store the value. */
        switch (ret = db->put(db, tid, &key, &data, 0)) {
        case 0:
            /* Success: commit the change. */
            if ((ret = tid->commit(tid, 0)) != 0) {
                dbenv->err(dbenv, ret, "DB_TXN->commit");
                exit (1);
            }
            return (0);
        case DB_LOCK_DEADLOCK:
        default:
            /* Deadlock or other error: abort, then retry the operation. */
            if ((t_ret = tid->abort(tid)) != 0) {
                dbenv->err(dbenv, t_ret, "DB_TXN->abort");
                exit (1);
            }
            /* Give up once the retry limit is reached. */
            if (fail++ == MAXIMUM_RETRY)
                return (ret);
            break;
        }
    }
}</b></pre></blockquote>
<p>Berkeley DB also uses transactions to recover from deadlock. Database
operations (that is, any call to a function underlying the handles
returned by <a href="/api_c/db_open.html">DB->open</a> and <a href="/api_c/db_cursor.html">DB->cursor</a>) are usually
performed on behalf of a unique locker. Transactions can be used to
perform multiple calls on behalf of the same locker within a single
thread of control. For example, consider the case in which an
application uses a cursor scan to locate a record and then
accesses another item in the database, based on the
key returned by the cursor, without first closing the cursor. If these
operations are done using default locker IDs, they may conflict. If the
locks are obtained on behalf of a transaction, using the transaction's
locker ID instead of the database handle's locker ID, the operations
will not conflict.</p>
<p>There is a new error return in this function that you may not have seen
before.
In transactional (not Concurrent Data Store) applications
supporting both readers and writers, or just multiple writers, Berkeley DB
functions have an additional possible error return:
<a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. This means two threads of control deadlocked,
and the thread receiving the <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> error return has
been selected to discard its locks in order to resolve the problem.
When an application receives a <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> return, the
correct action is to close any cursors involved in the operation and
abort any enclosing transaction. In the sample code, any time the
<a href="/api_c/db_put.html">DB->put</a> method returns <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, <a href="/api_c/txn_abort.html">DB_TXN->abort</a> is
called (which releases the transaction's Berkeley DB resources and undoes any
partial changes to the databases), and then the transaction is retried
from the beginning.</p>
<p>There is no requirement that the transaction be attempted again, but
that is a common course of action for applications. Applications may
want to set an upper bound on the number of times an operation will be
retried because some operations on some data sets may simply be unable
to succeed. For example, updating all of the pages on a large Web site
during prime business hours may simply be impossible because of the high
access rate to the database.</p>
<p>The <a href="/api_c/txn_abort.html">DB_TXN->abort</a> method is also called in error cases other than deadlock.
Any time an error occurs such that a transactionally protected set of
operations cannot complete successfully, the transaction must be
aborted.
While deadlock is by far the most common of these errors,
there are other possibilities; for example, running out of disk space
for the filesystem. In Berkeley DB transactional applications, there are
three classes of error returns: "expected" errors, "unexpected but
recoverable" errors, and a single "unrecoverable" error. Expected
errors are errors like <a href="/ref/program/errorret.html#DB_NOTFOUND">DB_NOTFOUND</a>, which indicates that a
searched-for key item is not present in the database. Applications may
want to explicitly test for and handle this error, or, in the case where
the absence of a key implies the enclosing transaction should fail,
simply call <a href="/api_c/txn_abort.html">DB_TXN->abort</a>. Unexpected but recoverable errors are
errors like <a href="/ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, which indicates that an operation
has been selected to resolve a deadlock, or a system error such as EIO
(an I/O failure) or ENOSPC (the filesystem has no available disk space).
Applications must immediately call <a href="/api_c/txn_abort.html">DB_TXN->abort</a> when these returns
occur, as it is not possible to proceed otherwise. The only
unrecoverable error is <a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a>, which indicates that the
system must stop and recovery must be run.</p>
<p>The above code can be simplified in the case of a transaction consisting
entirely of a single database put or delete operation, as operations
occurring in transactional databases are implicitly transaction
protected. For example, in a transactional database, the above code
could be more simply written as:</p>
<blockquote><pre><b>    for (fail = 0; fail++ <= MAXIMUM_RETRY &&
        (ret = db->put(db, NULL, &key, &data, 0)) == DB_LOCK_DEADLOCK;)
        ;
    return (ret == 0 ?
        0 : 1);</b></pre></blockquote>
<p>and the underlying transaction would be automatically handled by Berkeley DB.</p>
<p>Programmers should not attempt to enumerate all possible error returns
in their software. Instead, they should explicitly handle expected
returns and default to aborting the transaction for the rest. It is
entirely the choice of the programmer whether to check for
<a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> explicitly or not; attempting new Berkeley DB
operations after <a href="/ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> is returned does not worsen the
situation. Alternatively, using the <a href="/api_c/env_event_notify.html">DB_ENV->set_event_notify</a> method to
handle an unrecoverable error and simply doing some number of
abort-and-retry cycles for any unexpected Berkeley DB or system error in the
mainline code often results in the simplest and cleanest application
code.</p>
<p><font size=1>Copyright (c) 1996,2008 Oracle. All rights reserved.</font>
</body>
</html>