<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Permanent Message Handling</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
    <link rel="up" href="introduction.html" title="Chapter&#160;1.&#160;Introduction" />
    <link rel="prev" href="elections.html" title="Holding Elections" />
    <link rel="next" href="txnapp.html" title="Chapter&#160;2.&#160;Transactional Application" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Permanent Message Handling</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="elections.html">Prev</a>&#160;</td>
          <th width="60%" align="center">Chapter&#160;1.&#160;Introduction</th>
          <td width="20%" align="right">&#160;<a accesskey="n" href="txnapp.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="permmessages"></a>Permanent Message Handling</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permmessagenot">When Not to Manage Permanent Messages</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permmanage">Managing Permanent Messages</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permimplement">Implementing Permanent Message Handling</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>
        Messages received by a replica may be marked with a
        special flag indicating that the message is permanent.
        Custom replicated applications are notified of this flag
        via the <code class="literal">DB_REP_ISPERM</code> return value
        from the
        <code class="methodname">DbEnv::rep_process_message()</code>
        method.
        There is no hard requirement that a replication application look for, or
        respond to, this return code. However, because robust replicated
        applications typically do manage permanent messages, we introduce
        the concept here.
      </p>
      <p>
        A message is marked permanent if it affects transactional
        integrity; transaction commit messages, for example, are
        marked permanent. What the application does with a
        permanent message is driven by the durability guarantees
        it requires.
      </p>
      <p>
        For example, consider what the Replication Manager does when it
        has permanent message handling turned on and a
        transaction commit record is sent to the replicas.
        First, the replicas must commit, within a transaction, the data
        modifications identified by the message. Then, upon
        a successful commit, the Replication Manager sends the master a
        message acknowledgment.
      </p>
      <p>
        For the master (again, using the Replication Manager), things are a little more complicated than
        simple message acknowledgment. Usually in a replicated
        application, the master commits transactions
        asynchronously; that is, the commit operation does not
        block waiting for log data to be flushed to disk before
        returning. So when a master is managing permanent
        messages, it typically blocks the committing thread
        immediately before <code class="methodname">commit()</code>
        returns. The thread then waits for acknowledgments from
        its replicas. If it receives enough acknowledgments, it
        continues to operate as normal.
      </p>
      <p>
        If the master does not receive message acknowledgments
        (or, more likely, does not receive
        <span class="emphasis"><em>enough</em></span> acknowledgments), the
        committing thread flushes its log data to disk and then
        continues operations as normal. The master application can
        do this because replicas that fail to handle a message, for
        whatever reason, will eventually catch up to the master. So
        by flushing the transaction logs to disk, the master
        ensures that the data modifications have reached
        stable storage in at least one location (its own hard drive).
      </p>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permmessagenot"></a>When Not to Manage Permanent Messages</h3>
            </div>
          </div>
        </div>
        <p>
          There are two reasons why you might choose not to
          implement permanent message handling. Both have to do
          with why you are using replication in the first place.
        </p>
        <p>
          One class of applications uses replication to improve
          transaction throughput. Essentially, the application
          accepts a reduced transactional durability guarantee in
          order to avoid the overhead of the disk I/O required to
          flush transaction logs to disk. The application can then
          regain that durability guarantee, to a degree, by
          replicating the commit to some number of replicas.
        </p>
        <p>
          Using replication to improve an application's
          transactional commit guarantee is called
          <span class="emphasis"><em>replicating to the network.</em></span>
        </p>
        <p>
          In extreme cases where performance is of critical
          importance to the application, the master might
          choose to both use asynchronous commits
          <span class="emphasis"><em>and</em></span> decide not to wait for
          message acknowledgments.
          In this case, the master
          simply broadcasts its commit activity to its
          replicas without waiting for any reply. An
          application like this might also choose to use
          something other than TCP/IP for its network
          communications, since that protocol involves a fair
          amount of packet acknowledgment all on its own. Of
          course, this sort of application should also be
          very sure about the reliability of both its network and
          the machines hosting its replicas.
        </p>
        <p>
          At the other extreme, there is a
          class of applications that uses replication
          purely to improve read performance. This sort
          of application might choose to use synchronous
          commits on the master because write
          performance there is not of critical
          importance. In any case, this kind of
          application might not care whether its
          replicas have received and successfully handled
          permanent messages, because the primary storage
          location is assumed to be the master, not
          the replicas.
        </p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permmanage"></a>Managing Permanent Messages</h3>
            </div>
          </div>
        </div>
        <p>
          With the exception of a rare breed of
          replicated applications, most masters need some
          view of whether commits are occurring on
          replicas as expected. At a minimum, this is because
          a master will not flush its log buffers unless
          it has reason to believe that permanent
          messages have not been committed on the
          replicas.
        </p>
        <p>
          That said, it is important to remember that
          managing permanent messages involves a fair amount
          of network traffic: the messages must be sent to
          the replicas, and the replicas must acknowledge
          them. This represents a performance overhead
          that can be worsened by congested networks or
          outright outages.
        </p>
        <p>
          Therefore, when managing permanent messages, you
          must first decide how many of your replicas must
          send acknowledgments before your master decides
          that all is well and it can continue normal
          operations. When making this decision, you could
          require that <span class="emphasis"><em>all</em></span> replicas
          send acknowledgments. But unless you have only one
          or two replicas, or you are replicating over a very
          fast and reliable network, this policy could prove
          very harmful to your application's performance.
        </p>
        <p>
          Instead, a common strategy is to wait for
          acknowledgments from a simple majority of replicas.
          This ensures that commit activity has occurred on
          enough machines that you can be reliably certain
          the data writes are preserved across your network.
        </p>
        <p>
          Remember that replicas that do not acknowledge a
          permanent message are not necessarily unable to
          perform the commit; it might be that network
          problems have simply resulted in a delay at the
          replica. In any case, the underlying DB
          replication code is written such that a replica that
          falls behind the master will eventually take action
          to catch up.
        </p>
        <p>
          Depending on your application, it may be
          possible to code your permanent message
          handling such that acknowledgment must come
          from only one or two replicas. This is a
          particularly attractive strategy if you are
          closely managing which machines are eligible to
          become masters. If you have one or
          two machines designated to become master in the
          event that the current master goes down, you
          may want to receive acknowledgments only from
          those specific machines.
        </p>
        <p>
          Finally, beyond simple message acknowledgment, you
          also need to implement an acknowledgment timeout
          for your application.
          This timeout simply
          ensures that your master does not hang
          indefinitely, waiting for responses that will never
          come because a machine or router is down.
        </p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permimplement"></a>Implementing Permanent Message Handling</h3>
            </div>
          </div>
        </div>
        <p>
          How you implement permanent message handling
          depends on which API you are using to implement
          replication. If you are using the Replication Manager,
          permanent message handling is configured using
          policies that you specify to the framework. In
          this case, you can configure your application
          to:
        </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
              <p>
                Ignore permanent messages (the master
                does not wait for acknowledgments).
              </p>
            </li>
            <li>
              <p>
                Require acknowledgments from a
                quorum. A quorum is reached when
                acknowledgments are received from the
                minimum number of electable
                peers needed to ensure that
                the record remains durable if
                an election is held.
              </p>
              <p>
                An <span class="emphasis"><em>electable peer</em></span> is any other
                site that can potentially be elected master.
              </p>
              <p>
                The goal here is to be
                absolutely sure the record is
                durable. The master wants to
                hear from enough electable
                peers that they have
                committed the record so that if
                an election is held, the master
                knows the record will survive even
                if a new master is selected.
              </p>
              <p>
                This is the default policy.
              </p>
            </li>
            <li>
              <p>
                Require an acknowledgment from at least one replica.
              </p>
            </li>
            <li>
              <p>
                Require acknowledgments from
                all replicas.
              </p>
            </li>
            <li>
              <p>
                Require an acknowledgment from at least one electable peer.
              </p>
            </li>
            <li>
              <p>
                Require acknowledgments from all electable peers.
              </p>
            </li>
          </ul>
        </div>
        <p>
          Note that the Replication Manager simply flushes its transaction
          logs and moves on if a permanent message is not
          sufficiently acknowledged.
        </p>
        <p>
          For details on permanent message handling with the
          Replication Manager, see <a class="xref" href="fwrkpermmessage.html" title="Permanent Message Handling">Permanent Message Handling</a>.
        </p>
        <p>
          If these policies are not sufficient for your
          needs, or if you want your application to take more
          corrective action than simply flushing log buffers
          in the event of an unsuccessful commit, then you
          must implement replication using the Base APIs.
        </p>
        <p>
          When using the Base APIs, messages are
          sent from the master to its replicas using a
          <code class="function">send()</code> callback that you
          implement. Note, however, that DB's replication
          code automatically sets the permanent
          flag for you where appropriate.
        </p>
        <p>
          If the <code class="function">send()</code> callback returns a
          non-zero status, DB flushes the transaction log
          buffers for you. Therefore, your
          <code class="function">send()</code> callback must block, waiting
          for acknowledgments from your replicas.
          As part of implementing the
          <code class="function">send()</code> callback, you implement
          your permanent message handling policies. This
          means that you identify how many replicas must
          acknowledge the message before the callback can
          return <code class="literal">0</code>. You must also
          implement the acknowledgment timeout, if any.
        </p>
        <p>
          Further, message acknowledgments are sent from the
          replicas to the master using a communications
          channel that you implement (the replication code
          does not provide a channel for acknowledgments).
          Implementing permanent messages therefore means that when
          you write your replication communications channel,
          you must write it in such a way that it also
          handles permanent message acknowledgments.
        </p>
        <p>
          For more information on implementing permanent
          message handling using a custom replication layer,
          see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
        </p>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="elections.html">Prev</a>&#160;</td>
          <td width="20%" align="center">
            <a accesskey="u" href="introduction.html">Up</a>
          </td>
          <td width="40%" align="right">&#160;<a accesskey="n" href="txnapp.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Holding Elections&#160;</td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top">&#160;Chapter&#160;2.&#160;Transactional Application</td>
        </tr>
      </table>
    </div>
  </body>
</html>