1<!--$Id: deaddbg.so,v 10.5 2005/12/02 17:27:50 alanb Exp $-->
2<!--Copyright (c) 1997,2008 Oracle.  All rights reserved.-->
3<!--See the file LICENSE for redistribution information.-->
4<html>
5<head>
6<title>Berkeley DB Reference Guide: Deadlock debugging</title>
7<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
8<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
9</head>
10<body bgcolor=white>
11<table width="100%"><tr valign=top>
12<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></b></td>
13<td align=right><a href="../lock/timeout.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../lock/page.html"><img src="../../images/next.gif" alt="Next"></a>
14</td></tr></table>
15<p align=center><b>Deadlock debugging</b></p>
16<p>An occasional debugging problem in Berkeley DB applications is unresolvable
17deadlock.  The output of the <b>-Co</b> flags of the <a href="../../utility/db_stat.html">db_stat</a>
18utility can be used to detect and debug these problems.  The following
19is a typical example of the output of this utility:</p>
20<blockquote><pre>Locks grouped by object
21Locker    Mode    Count   Status      ----------- Object ----------
22       1  READ         1  HELD        a.db                handle   0
2380000004  WRITE        1  HELD        a.db                page     3</pre></blockquote>
24<p>In this example, we have opened a database and stored a single key/data
25pair in it.  Because we have a database handle open, we have a read lock
26on that database handle.  The database handle lock is the read lock
27labeled <i>handle</i>.  (We can normally ignore handle locks for
28the purposes of database debugging, as they will only conflict with
29other handle operations, for example, an attempt to remove the database
30will block because we are holding the handle locked, but reading and
31writing the database will not conflict with the handle lock.)</p>
32<p>It is important to note that locker IDs are 32-bit unsigned integers,
33and are divided into two name spaces.  Locker IDs with the high bit set
34(that is, values 80000000 or higher), are locker IDs associated with
35transactions.  Locker IDs without the high bit set are locker IDs that
36are not associated with a transaction.  Locker IDs associated with
37transactions map one-to-one with the transaction, that is, a transaction
38never has more than a single locker ID, and all of the locks acquired
39by the transaction will be acquired on behalf of the same locker ID.</p>
40<p>We also hold a write lock on the database page where we stored the new
41key/data pair.  The page lock is labeled <i>page</i> and is on page
42number 3.  If we were to put an additional key/data pair in the
43database, we would see the following output:</p>
44<blockquote><pre>Locks grouped by object
45Locker    Mode    Count   Status      ----------- Object ----------
4680000004  WRITE        2  HELD        a.db                page     3
47       1  READ         1  HELD        a.db                handle   0</pre></blockquote>
48<p>That is, we have acquired a second reference count to page number 3, but
49have not acquired any new locks.  If we add an entry to a different page
50in the database, we would acquire additional locks:</p>
51<blockquote><pre>Locks grouped by object
52Locker    Mode    Count   Status      ----------- Object ----------
53       1  READ         1  HELD        a.db                handle   0
5480000004  WRITE        2  HELD        a.db                page     3
5580000004  WRITE        1  HELD        a.db                page     2</pre></blockquote>
56<p>Here's a simple example of one lock blocking another one:</p>
57<blockquote><pre>Locks grouped by object
58Locker    Mode    Count   Status      ----------- Object ----------
5980000004  WRITE        1  HELD        a.db                page     2
6080000005  WRITE        1  WAIT        a.db                page     2
61       1  READ         1  HELD        a.db                handle   0
6280000004  READ         1  HELD        a.db                page     1</pre></blockquote>
63<p>In this example, there are two different transactional lockers (80000004 and
6480000005).  Locker 80000004 is holding a write lock on page 2, and
65locker 80000005 is waiting for a write lock on page 2.  This is not a
66deadlock, because locker 80000004 is not blocked on anything.
67Presumably, the thread of control using locker 80000004 will proceed,
68eventually release its write lock on page 2, at which point the thread
69of control using locker 80000005 can also proceed, acquiring a write
70lock on page 2.</p>
71<p>If lockers 80000004 and 80000005 are not in different threads of
72control, the result would be <i>self deadlock</i>.  Self deadlock
73is not a true deadlock, and won't be detected by the Berkeley DB deadlock
74detector.  It's not a true deadlock because, if work could continue to
75be done on behalf of locker 80000004, then the lock would eventually be
76released, and locker 80000005 could acquire the lock and itself proceed.
77So, the key element is that the thread of control holding the lock
78cannot proceed because it is the same thread as is blocked waiting on the
79lock.</p>
80<p>Here's an example of three transactions reaching true deadlock.  First,
81three different threads of control opened the database, acquiring three
82database handle read locks.</p>
83<blockquote><pre>Locks grouped by object
84Locker    Mode    Count   Status      ----------- Object ----------
85       1  READ         1  HELD        a.db                handle   0
86       3  READ         1  HELD        a.db                handle   0
87       5  READ         1  HELD        a.db                handle   0</pre></blockquote>
88<p>The three threads then each began a transaction, and put a key/data pair
89on a different page:</p>
90<blockquote><pre>Locks grouped by object
91Locker    Mode    Count   Status      ----------- Object ----------
9280000008  WRITE        1  HELD        a.db                page     4
93       1  READ         1  HELD        a.db                handle   0
94       3  READ         1  HELD        a.db                handle   0
95       5  READ         1  HELD        a.db                handle   0
9680000006  READ         1  HELD        a.db                page     1
9780000007  READ         1  HELD        a.db                page     1
9880000008  READ         1  HELD        a.db                page     1
9980000006  WRITE        1  HELD        a.db                page     2
10080000007  WRITE        1  HELD        a.db                page     3</pre></blockquote>
101<p>The thread using locker 80000006 put a new key/data pair on page 2, the
102thread using locker 80000007, on page 3, and the thread using locker
10380000008 on page 4.  Because the database is a 2-level Btree, the tree
104was searched, and so each transaction acquired a read lock on the Btree
105root page (page 1) as part of this operation.</p>
106<p>The three threads then each attempted to put a second key/data pair on
107a page currently locked by another thread.  The thread using locker
10880000006 tried to put a key/data pair on page 3, the thread using locker
10980000007 on page 4, and the thread using locker 80000008 on page 2:</p>
110<blockquote><pre>Locks grouped by object
111Locker    Mode    Count   Status      ----------- Object ----------
11280000008  WRITE        1  HELD        a.db                page     4
11380000007  WRITE        1  WAIT        a.db                page     4
114       1  READ         1  HELD        a.db                handle   0
115       3  READ         1  HELD        a.db                handle   0
116       5  READ         1  HELD        a.db                handle   0
11780000006  READ         2  HELD        a.db                page     1
11880000007  READ         2  HELD        a.db                page     1
11980000008  READ         2  HELD        a.db                page     1
12080000006  WRITE        1  HELD        a.db                page     2
12180000008  WRITE        1  WAIT        a.db                page     2
12280000007  WRITE        1  HELD        a.db                page     3
12380000006  WRITE        1  WAIT        a.db                page     3</pre></blockquote>
124<p>Now, each of the threads of control is blocked, waiting on a different
125thread of control.
126The thread using locker 80000007 is blocked by
127the thread using locker 80000008, due to the lock on page 4.
128The thread using locker 80000008 is blocked by
129the thread using locker 80000006, due to the lock on page 2.
130And the thread using locker 80000006 is blocked by
131the thread using locker 80000007, due to the lock on page 3.
132Since none of the threads of control can make
133progress, one of them will have to be killed in order to resolve the
134deadlock.</p>
135<table width="100%"><tr><td><br></td><td align=right><a href="../lock/timeout.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../lock/page.html"><img src="../../images/next.gif" alt="Next"></a>
136</td></tr></table>
137<p><font size=1>Copyright (c) 1996,2008 Oracle.  All rights reserved.</font>
138</body>
139</html>
140