1<!--$Id: deaddbg.so,v 10.5 2005/12/02 17:27:50 alanb Exp $--> 2<!--Copyright (c) 1997,2008 Oracle. All rights reserved.--> 3<!--See the file LICENSE for redistribution information.--> 4<html> 5<head> 6<title>Berkeley DB Reference Guide: Deadlock debugging</title> 7<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> 8<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++"> 9</head> 10<body bgcolor=white> 11<table width="100%"><tr valign=top> 12<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></b></td> 13<td align=right><a href="../lock/timeout.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../lock/page.html"><img src="../../images/next.gif" alt="Next"></a> 14</td></tr></table> 15<p align=center><b>Deadlock debugging</b></p> 16<p>An occasional debugging problem in Berkeley DB applications is unresolvable 17deadlock. The output of the <b>-Co</b> flags of the <a href="../../utility/db_stat.html">db_stat</a> 18utility can be used to detect and debug these problems. The following 19is a typical example of the output of this utility:</p> 20<blockquote><pre>Locks grouped by object 21Locker Mode Count Status ----------- Object ---------- 22 1 READ 1 HELD a.db handle 0 2380000004 WRITE 1 HELD a.db page 3</pre></blockquote> 24<p>In this example, we have opened a database and stored a single key/data 25pair in it. Because we have a database handle open, we have a read lock 26on that database handle. The database handle lock is the read lock 27labeled <i>handle</i>. (We can normally ignore handle locks for 28the purposes of database debugging, as they will only conflict with 29other handle operations, for example, an attempt to remove the database 30will block because we are holding the handle locked, but reading and 31writing the database will not conflict with the handle lock.)</p> 32<p>It is important to note that locker IDs are 32-bit unsigned integers, 33and are divided into two name spaces. Locker IDs with the high bit set 34(that is, values 80000000 or higher), are locker IDs associated with 35transactions. Locker IDs without the high bit set are locker IDs that 36are not associated with a transaction. Locker IDs associated with 37transactions map one-to-one with the transaction, that is, a transaction 38never has more than a single locker ID, and all of the locks acquired 39by the transaction will be acquired on behalf of the same locker ID.</p> 40<p>We also hold a write lock on the database page where we stored the new 41key/data pair. The page lock is labeled <i>page</i> and is on page 42number 3. If we were to put an additional key/data pair in the 43database, we would see the following output:</p> 44<blockquote><pre>Locks grouped by object 45Locker Mode Count Status ----------- Object ---------- 4680000004 WRITE 2 HELD a.db page 3 47 1 READ 1 HELD a.db handle 0</pre></blockquote> 48<p>That is, we have acquired a second reference count to page number 3, but 49have not acquired any new locks. If we add an entry to a different page 50in the database, we would acquire additional locks:</p> 51<blockquote><pre>Locks grouped by object 52Locker Mode Count Status ----------- Object ---------- 53 1 READ 1 HELD a.db handle 0 5480000004 WRITE 2 HELD a.db page 3 5580000004 WRITE 1 HELD a.db page 2</pre></blockquote> 56<p>Here's a simple example of one lock blocking another one:</p> 57<blockquote><pre>Locks grouped by object 58Locker Mode Count Status ----------- Object ---------- 5980000004 WRITE 1 HELD a.db page 2 6080000005 WRITE 1 WAIT a.db page 2 61 1 READ 1 HELD a.db handle 0 6280000004 READ 1 HELD a.db page 1</pre></blockquote> 63<p>In this example, there are two different transactional lockers (80000004 and 6480000005). Locker 80000004 is holding a write lock on page 2, and 65locker 80000005 is waiting for a write lock on page 2. This is not a 66deadlock, because locker 80000004 is not blocked on anything. 67Presumably, the thread of control using locker 80000004 will proceed, 68eventually release its write lock on page 2, at which point the thread 69of control using locker 80000005 can also proceed, acquiring a write 70lock on page 2.</p> 71<p>If lockers 80000004 and 80000005 are not in different threads of 72control, the result would be <i>self deadlock</i>. Self deadlock 73is not a true deadlock, and won't be detected by the Berkeley DB deadlock 74detector. It's not a true deadlock because, if work could continue to 75be done on behalf of locker 80000004, then the lock would eventually be 76released, and locker 80000005 could acquire the lock and itself proceed. 77So, the key element is that the thread of control holding the lock 78cannot proceed because it is the same thread as is blocked waiting on the 79lock.</p> 80<p>Here's an example of three transactions reaching true deadlock. First, 81three different threads of control opened the database, acquiring three 82database handle read locks.</p> 83<blockquote><pre>Locks grouped by object 84Locker Mode Count Status ----------- Object ---------- 85 1 READ 1 HELD a.db handle 0 86 3 READ 1 HELD a.db handle 0 87 5 READ 1 HELD a.db handle 0</pre></blockquote> 88<p>The three threads then each began a transaction, and put a key/data pair 89on a different page:</p> 90<blockquote><pre>Locks grouped by object 91Locker Mode Count Status ----------- Object ---------- 9280000008 WRITE 1 HELD a.db page 4 93 1 READ 1 HELD a.db handle 0 94 3 READ 1 HELD a.db handle 0 95 5 READ 1 HELD a.db handle 0 9680000006 READ 1 HELD a.db page 1 9780000007 READ 1 HELD a.db page 1 9880000008 READ 1 HELD a.db page 1 9980000006 WRITE 1 HELD a.db page 2 10080000007 WRITE 1 HELD a.db page 3</pre></blockquote> 101<p>The thread using locker 80000006 put a new key/data pair on page 2, the 102thread using locker 80000007, on page 3, and the thread using locker 10380000008 on page 4. Because the database is a 2-level Btree, the tree 104was searched, and so each transaction acquired a read lock on the Btree 105root page (page 1) as part of this operation.</p> 106<p>The three threads then each attempted to put a second key/data pair on 107a page currently locked by another thread. The thread using locker 10880000006 tried to put a key/data pair on page 3, the thread using locker 10980000007 on page 4, and the thread using locker 80000008 on page 2:</p> 110<blockquote><pre>Locks grouped by object 111Locker Mode Count Status ----------- Object ---------- 11280000008 WRITE 1 HELD a.db page 4 11380000007 WRITE 1 WAIT a.db page 4 114 1 READ 1 HELD a.db handle 0 115 3 READ 1 HELD a.db handle 0 116 5 READ 1 HELD a.db handle 0 11780000006 READ 2 HELD a.db page 1 11880000007 READ 2 HELD a.db page 1 11980000008 READ 2 HELD a.db page 1 12080000006 WRITE 1 HELD a.db page 2 12180000008 WRITE 1 WAIT a.db page 2 12280000007 WRITE 1 HELD a.db page 3 12380000006 WRITE 1 WAIT a.db page 3</pre></blockquote> 124<p>Now, each of the threads of control is blocked, waiting on a different 125thread of control. 126The thread using locker 80000007 is blocked by 127the thread using locker 80000008, due to the lock on page 4. 128The thread using locker 80000008 is blocked by 129the thread using locker 80000006, due to the lock on page 2. 130And the thread using locker 80000006 is blocked by 131the thread using locker 80000007, due to the lock on page 3. 132Since none of the threads of control can make 133progress, one of them will have to be killed in order to resolve the 134deadlock.</p> 135<table width="100%"><tr><td><br></td><td align=right><a href="../lock/timeout.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../lock/page.html"><img src="../../images/next.gif" alt="Next"></a> 136</td></tr></table> 137<p><font size=1>Copyright (c) 1996,2008 Oracle. All rights reserved.</font> 138</body> 139</html> 140