1<?xml version="1.0" encoding="UTF-8" standalone="no"?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml">
4  <head>
5    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
6    <title>Chapter 1. Introduction</title>
7    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
8    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
9    <link rel="start" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
10    <link rel="up" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
11    <link rel="prev" href="preface.html" title="Preface" />
12    <link rel="next" href="repadvantage.html" title="Replication Benefits" />
13  </head>
14  <body>
15    <div class="navheader">
16      <table width="100%" summary="Navigation header">
17        <tr>
18          <th colspan="3" align="center">Chapter 1. Introduction</th>
19        </tr>
20        <tr>
21          <td width="20%" align="left"><a accesskey="p" href="preface.html">Prev</a> </td>
22          <th width="60%" align="center"> </th>
23          <td width="20%" align="right"> <a accesskey="n" href="repadvantage.html">Next</a></td>
24        </tr>
25      </table>
26      <hr />
27    </div>
28    <div class="chapter" lang="en" xml:lang="en">
29      <div class="titlepage">
30        <div>
31          <div>
32            <h2 class="title"><a id="introduction"></a>Chapter 1. Introduction</h2>
33          </div>
34        </div>
35      </div>
36      <div class="toc">
37        <p>
38          <b>Table of Contents</b>
39        </p>
40        <dl>
41          <dt>
42            <span class="sect1">
43              <a href="introduction.html#overview">Overview</a>
44            </span>
45          </dt>
46          <dd>
47            <dl>
48              <dt>
49                <span class="sect2">
50                  <a href="introduction.html#repenvirons">Replication Environments</a>
51                </span>
52              </dt>
53              <dt>
54                <span class="sect2">
55                  <a href="introduction.html#repdbs">Replication Databases</a>
56                </span>
57              </dt>
58              <dt>
59                <span class="sect2">
60                  <a href="introduction.html#commlayer">Communications Layer</a>
61                </span>
62              </dt>
63              <dt>
64                <span class="sect2">
65                  <a href="introduction.html#masterselect">Selecting a Master</a>
66                </span>
67              </dt>
68            </dl>
69          </dd>
70          <dt>
71            <span class="sect1">
72              <a href="repadvantage.html">Replication Benefits</a>
73            </span>
74          </dt>
75          <dt>
76            <span class="sect1">
77              <a href="apioverview.html">The Replication APIs</a>
78            </span>
79          </dt>
80          <dd>
81            <dl>
82              <dt>
83                <span class="sect2">
84                  <a href="apioverview.html#repframeworkoverview">Replication Manager Overview</a>
85                </span>
86              </dt>
87              <dt>
88                <span class="sect2">
89                  <a href="apioverview.html#repapioverview">Replication Base API Overview</a>
90                </span>
91              </dt>
92            </dl>
93          </dd>
94          <dt>
95            <span class="sect1">
96              <a href="elections.html">Holding Elections</a>
97            </span>
98          </dt>
99          <dd>
100            <dl>
101              <dt>
102                <span class="sect2">
103                  <a href="elections.html#influencingelections">Influencing Elections</a>
104                </span>
105              </dt>
106              <dt>
107                <span class="sect2">
108                  <a href="elections.html#winningelections">Winning Elections</a>
109                </span>
110              </dt>
111              <dt>
112                <span class="sect2">
113                  <a href="elections.html#switchingmasters">Switching Masters</a>
114                </span>
115              </dt>
116            </dl>
117          </dd>
118          <dt>
119            <span class="sect1">
120              <a href="permmessages.html">Permanent Message Handling</a>
121            </span>
122          </dt>
123          <dd>
124            <dl>
125              <dt>
126                <span class="sect2">
127                  <a href="permmessages.html#permmessagenot">When Not to Manage
128                            Permanent Messages</a>
129                </span>
130              </dt>
131              <dt>
132                <span class="sect2">
133                  <a href="permmessages.html#permmanage">Managing Permanent Messages</a>
134                </span>
135              </dt>
136              <dt>
137                <span class="sect2">
138                  <a href="permmessages.html#permimplement">Implementing Permanent
139                    Message Handling</a>
140                </span>
141              </dt>
142            </dl>
143          </dd>
144        </dl>
145      </div>
146      <p>
    This book provides a thorough introduction to replication as used
    with Berkeley DB (DB). It begins with a general overview of
    replication and the benefits it provides. It then describes the APIs
    that you use to implement replication, and the architectural changes
    that you need to make to your application code in order to use those
    APIs. Finally, it discusses how backup and restore strategies differ
    when you use replication, especially when it comes to log file
    removal.
156  </p>
157      <p>
    You should understand the concepts from the
        <span>
                <em class="citetitle">Berkeley DB Getting Started with Transaction Processing</em>
        </span>
    guide before reading this book.
165  </p>
166      <div class="sect1" lang="en" xml:lang="en">
167        <div class="titlepage">
168          <div>
169            <div>
170              <h2 class="title" style="clear: both"><a id="overview"></a>Overview</h2>
171            </div>
172          </div>
173        </div>
174        <div class="toc">
175          <dl>
176            <dt>
177              <span class="sect2">
178                <a href="introduction.html#repenvirons">Replication Environments</a>
179              </span>
180            </dt>
181            <dt>
182              <span class="sect2">
183                <a href="introduction.html#repdbs">Replication Databases</a>
184              </span>
185            </dt>
186            <dt>
187              <span class="sect2">
188                <a href="introduction.html#commlayer">Communications Layer</a>
189              </span>
190            </dt>
191            <dt>
192              <span class="sect2">
193                <a href="introduction.html#masterselect">Selecting a Master</a>
194              </span>
195            </dt>
196          </dl>
197        </div>
198        <p>
199            The DB replication APIs allow you to distribute your database
200            write operations (performed on a read-write master) to one or 
201            more read-only <span class="emphasis"><em>replicas</em></span>.  
202            For this reason, DB's replication implementation is said to be a
203            <span class="emphasis"><em>single master, multiple replica</em></span> replication strategy.
204        </p>
205        <p>
            Note that your database write operations can occur only on the
            master; any attempt to write to a replica results in an error
            being <span>returned by</span> the DB API call used to perform
            the write.
212        </p>
213        <p>
214            A single replication master and all of its replicas are referred
215            to as a <span class="emphasis"><em>replication group</em></span>.  While all
216            members of the replication group can reside on the same
217            machine, usually each replication participant is placed on a
218            separate physical machine somewhere on the network.
219        </p>
220        <p>
            Note that all replication applications must first be
            transactional applications. The data that the master transmits
            to its replicas consists of the log records that are generated
            as records are updated. Upon transactional commit, the master
            transmits a transaction record, which tells the replicas to
            commit the records they previously received from the master.
            Because this mechanism depends entirely on transaction logging,
            it is recommended that you write and debug your DB application
            as a stand-alone transactional application before introducing
            the replication layer to your code.
232        </p>
233        <div class="sect2" lang="en" xml:lang="en">
234          <div class="titlepage">
235            <div>
236              <div>
237                <h3 class="title"><a id="repenvirons"></a>Replication Environments</h3>
238              </div>
239            </div>
240          </div>
241          <p>
242                The most important requirement for a replication
243                participant is that it must use a unique Berkeley DB database
244                environment independent of all other replication
245                participants. So while multiple replication participants
246                can reside on the same physical machine, no two such participants 
247                can share the same environment home directory. 
248            </p>
249          <p>
                For this reason, replication technically occurs between
                unique <span class="emphasis"><em>database environments</em></span>. In the strictest sense,
                a replication group consists of a <span class="emphasis"><em>master
                        environment</em></span> and 
                one or more <span class="emphasis"><em>replica environments</em></span>. In practice, however,
                each such environment in production code is usually
                located on its own machine. Consequently,
                this manual sometimes talks about <span class="emphasis"><em>replication sites</em></span>, meaning the
                unique combination of environment home directory, host, and port that a specific 
                replication application is using.
260            </p>
261          <p>
                There is no DB-specified limit to the number of
                environments that can participate in a replication group.
                The only limitation here is one of resources, such as
                network bandwidth.
266            </p>
267          <p>
268                    (Note, however, that the Replication Manager does place a limit on the
269                    number of environments you can use. See
270                    <a class="xref" href="apioverview.html#repframeworkoverview" title="Replication Manager Overview">Replication Manager Overview</a>
271                    for details.)
272            </p>
273          <p>
274                Also, DB's replication implementation requires all
275                participating environments to be assigned IDs that are
276                locally unique to the given environment. Depending on the
277                replication APIs that you choose to use, you may or may not
278                need to manage this particular detail. 
279            </p>
280          <p>
281                    For detailed information on database environments, see 
282                    the <em class="citetitle">Berkeley DB Getting Started with Transaction Processing</em>
283                    guide.  For more information on environment IDs, see 
284                    the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
285            </p>
286        </div>
287        <div class="sect2" lang="en" xml:lang="en">
288          <div class="titlepage">
289            <div>
290              <div>
291                <h3 class="title"><a id="repdbs"></a>Replication Databases</h3>
292              </div>
293            </div>
294          </div>
295          <p>
                DB's databases are managed and used in exactly the same way
                as if you were writing a non-replicated application, with
                two caveats. First, the databases maintained in a replicated environment
                must reside either in the <code class="literal">ENV_HOME</code>
                directory, or in the directory identified by the 
                    <code class="methodname">DB_ENV-&gt;set_data_dir()</code>
                method. Unlike non-replicated applications, you cannot place your 
                databases in a subdirectory below these locations. Second, you should
                not use full path names for your databases or
                environments, as these are likely to break when they are replicated
                to other machines.
309            </p>
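          <p>
                The naming caveats above can be checked mechanically before a
                database is opened. The following is a minimal sketch, not part
                of the DB API: the function name
                <code class="literal">is_replication_safe_db_name</code> is hypothetical,
                and the check assumes that rejecting every path separator is an
                acceptable way to rule out both full path names and subdirectories.
            </p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not provided by DB): returns true if `name` is a
 * bare file name, that is, it is non-empty and contains no path
 * separator. A bare name cannot be a full path and cannot point into a
 * subdirectory of the environment home or data directory.
 */
bool is_replication_safe_db_name(const char *name)
{
    assert(name != NULL);

    if (name[0] == '\0')
        return false;

    /* Any '/' or '\\' means a full path or a subdirectory component. */
    return strpbrk(name, "/\\") == NULL;
}
```

          <p>
                Under these assumptions, a name such as
                <code class="literal">mydb.db</code> passes the check, while
                <code class="literal">/var/db/mydb.db</code> and
                <code class="literal">sub/mydb.db</code> are rejected. The sketch
                does not attempt to handle platform-specific forms such as
                drive-letter prefixes.
            </p>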
310        </div>
311        <div class="sect2" lang="en" xml:lang="en">
312          <div class="titlepage">
313            <div>
314              <div>
315                <h3 class="title"><a id="commlayer"></a>Communications Layer</h3>
316              </div>
317            </div>
318          </div>
319          <p>
                In order to transmit database writes to the
                replicas, DB requires a communications layer.
322                DB is agnostic as to what this layer should
323                look like. The only requirement is that it 
324                be capable of passing two opaque data objects and an
325                environment ID from the master to its replicas without
326                corruption.
327            </p>
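          <p>
                To make this requirement concrete, the following sketch shows one
                way a custom layer might frame the two opaque data objects and the
                environment ID into a single buffer for transmission. Everything
                here is an assumption made for illustration: the wire layout, the
                names <code class="literal">frame_pack</code> and
                <code class="literal">frame_unpack</code>, and the choice of
                big-endian length prefixes are not defined by DB. The sketch also
                assumes that the underlying transport (TCP, for example) is
                responsible for delivering the bytes intact.
            </p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wire layout (an assumption, not defined by DB):
 *   [envid: 4 bytes][control_len: 4][rec_len: 4][control bytes][rec bytes]
 * All integers are big-endian so both ends agree regardless of host order.
 */
#define FRAME_HEADER_SIZE 12u

static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Packs one message into `out`. Returns bytes written, or 0 if `out`
 * is too small. The two payloads are copied verbatim, never inspected. */
size_t frame_pack(unsigned char *out, size_t out_size, int32_t envid,
                  const void *control, uint32_t control_len,
                  const void *rec, uint32_t rec_len)
{
    size_t total = (size_t)FRAME_HEADER_SIZE + control_len + rec_len;

    assert(out != NULL);
    if (out_size < total)
        return 0;

    put_u32(out, (uint32_t)envid);
    put_u32(out + 4, control_len);
    put_u32(out + 8, rec_len);
    if (control_len > 0)
        memcpy(out + FRAME_HEADER_SIZE, control, control_len);
    if (rec_len > 0)
        memcpy(out + FRAME_HEADER_SIZE + control_len, rec, rec_len);
    return total;
}

/* Unpacks one message. On success returns 1 and sets the output
 * pointers to point into `in`; returns 0 if the frame is malformed. */
int frame_unpack(const unsigned char *in, size_t in_size, int32_t *envid,
                 const unsigned char **control, uint32_t *control_len,
                 const unsigned char **rec, uint32_t *rec_len)
{
    uint32_t clen, rlen;

    if (in_size < FRAME_HEADER_SIZE)
        return 0;
    clen = get_u32(in + 4);
    rlen = get_u32(in + 8);

    /* Reject frames whose declared lengths exceed the bytes received. */
    if (clen > in_size - FRAME_HEADER_SIZE ||
        rlen > in_size - FRAME_HEADER_SIZE - clen)
        return 0;

    *envid = (int32_t)get_u32(in);
    *control = in + FRAME_HEADER_SIZE;
    *control_len = clen;
    *rec = in + FRAME_HEADER_SIZE + clen;
    *rec_len = rlen;
    return 1;
}
```

          <p>
                The design point is that the layer treats both data objects as
                uninterpreted byte strings: it records only their lengths so that
                the receiving side can separate them again and hand them, together
                with the environment ID, back to DB unchanged.
            </p>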
328          <p>
329                Because replicas are usually placed on different machines on
330                the network, the communications layer is usually some kind
331                of a network-aware implementation. Beyond that, its
332                implementation details are largely up to you. It could use
333                TCP/IP sockets, for example, or it could use
334                raw sockets if they perform better for your particular
335                application.
336            </p>
337          <p>
338                Note that you may not have to write your own communications
                layer. DB provides a Replication Manager that
                includes a fully functional TCP/IP-based communications layer.
341                See <a class="xref" href="apioverview.html" title="The Replication APIs">The Replication APIs</a>
342                for more information.
343            </p>
344          <p>
345                    See the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em> 
346                    for a description of how to
347                write your own custom replication communications layer.
348            </p>
349        </div>
350        <div class="sect2" lang="en" xml:lang="en">
351          <div class="titlepage">
352            <div>
353              <div>
354                <h3 class="title"><a id="masterselect"></a>Selecting a Master</h3>
355              </div>
356            </div>
357          </div>
358          <p>
359                    Every replication group is allowed one and only one
360                    master environment. Almost always, masters are selected by
361                    holding an <span class="emphasis"><em>election</em></span>.  All such
362                    elections are performed by the underlying Berkeley DB
363                    replication code so you have to do very little to
364                    implement them. 
365                </p>
366          <p>
                    When holding an election, replicas "vote" on which site should
                    become the master. Among the replicas participating in the
                    election, the one with the most up-to-date set of log
                    records wins. Note that it is possible
                    for there to be a tie. When this occurs, priorities are
                    used to select the master.
375                </p>
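          <p>
                The selection rule described above can be sketched as a simple
                comparison. This is only an illustration of the rule as stated
                here, not the election code that DB actually runs: the
                <code class="literal">site_vote</code> structure, its field names,
                and the <code class="literal">pick_master</code> function are all
                hypothetical, and the sketch assumes that a site's log position
                can be summarized as a file number and an offset within that file.
            </p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what each participating site reports.
 * The field names are illustrative and are not DB structures. */
struct site_vote {
    int      envid;       /* environment ID of the site */
    uint32_t log_file;    /* log file number of its latest log record */
    uint32_t log_offset;  /* offset of that record within the file */
    uint32_t priority;    /* higher value is preferred on a tie */
};

/* Returns <0, 0, or >0 as a's log is older than, equal to, or newer
 * than b's log. */
static int log_cmp(const struct site_vote *a, const struct site_vote *b)
{
    if (a->log_file != b->log_file)
        return a->log_file < b->log_file ? -1 : 1;
    if (a->log_offset != b->log_offset)
        return a->log_offset < b->log_offset ? -1 : 1;
    return 0;
}

/* Returns the index of the winning site, or -1 if there are no sites.
 * The most up-to-date log wins; priority breaks ties. If priorities
 * also tie, this sketch keeps the earliest-listed site (an assumption). */
int pick_master(const struct site_vote *sites, size_t nsites)
{
    size_t i, best = 0;

    if (nsites == 0)
        return -1;
    assert(sites != NULL);

    for (i = 1; i < nsites; i++) {
        int c = log_cmp(&sites[i], &sites[best]);
        if (c > 0 || (c == 0 && sites[i].priority > sites[best].priority))
            best = i;
    }
    return (int)best;
}
```

          <p>
                In this sketch, priority is consulted only when two sites hold
                equally current logs, which mirrors the tie-breaking behavior
                described in this section.
            </p>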
376          <p>
377                    For more information on holding and managing elections,
378                    see <a class="xref" href="elections.html" title="Holding Elections">Holding Elections</a>.
379                </p>
380        </div>
381      </div>
382    </div>
383    <div class="navfooter">
384      <hr />
385      <table width="100%" summary="Navigation footer">
386        <tr>
387          <td width="40%" align="left"><a accesskey="p" href="preface.html">Prev</a> </td>
388          <td width="20%" align="center"> </td>
389          <td width="40%" align="right"> <a accesskey="n" href="repadvantage.html">Next</a></td>
390        </tr>
391        <tr>
392          <td width="40%" align="left" valign="top">Preface </td>
393          <td width="20%" align="center">
394            <a accesskey="h" href="index.html">Home</a>
395          </td>
396          <td width="40%" align="right" valign="top"> Replication Benefits</td>
397        </tr>
398      </table>
399    </div>
400  </body>
401</html>
402