<!--$Id: ex_comm.so,v 1.9 2006/08/24 17:59:56 bostic Exp $-->
<!--Copyright (c) 1997,2008 Oracle.  All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Ex_rep_base: a TCP/IP based communication infrastructure</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Replication</dl></b></td>
<td align=right><a href="/rep/ex.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/rep/ex_rq.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Ex_rep_base: a TCP/IP based communication infrastructure</b></p>
<p>
Applications that use the Base replication API must implement a
communication infrastructure.  The communication infrastructure
consists of three parts: a way to map environment IDs to particular
sites, the functions to send and receive messages, and the application
architecture that supports the particular communication infrastructure
used (for example, individual threads per communicating site, a shared
message handler for all sites, or a hybrid solution).  The communication
infrastructure for ex_rep_base is implemented in the file
<b>ex_rep/base/rep_net.c</b>, and each part of that infrastructure
is described below.</p>
<p>Ex_rep_base maintains a table of environment ID to TCP/IP port
mappings.  A pointer to this table is stored in a structure pointed to
by the app_private field of the <a href="/api_c/env_class.html">DB_ENV</a> object so that it can be
accessed by any function that has the database environment handle.
The table is represented by a machtab_t structure, which contains a
reference to a linked list of member_t structures; both types are defined in
<b>ex_rep/base/rep_net.c</b>.  Each member_t contains the host and
port identification, the environment ID, and a file descriptor.</p>
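<p>The table described above might be sketched as follows.  This is a
minimal illustration, not the code in <b>ex_rep/base/rep_net.c</b>: the
field names, the <b>next_eid</b> and <b>nmembers</b> counters, and the
helper functions tab_init, tab_add and tab_destroy are all assumptions
made for this example.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical field layout; see ex_rep/base/rep_net.c for the real one. */
typedef struct member {
	char *host;		/* Owned copy of the remote host name. */
	int port;		/* Remote TCP port. */
	int eid;		/* Environment ID assigned to this site. */
	int fd;			/* Connected socket descriptor. */
	struct member *next;	/* Next member in the list. */
} member_t;

typedef struct {
	member_t *head;		/* First member, NULL when empty. */
	int next_eid;		/* Next environment ID to hand out. */
	int nmembers;		/* Number of members in the list. */
} machtab_t;

/* Initialize an empty table; IDs start at 1 in this sketch. */
void
tab_init(machtab_t *tab)
{
	assert(tab != NULL);
	tab->head = NULL;
	tab->next_eid = 1;
	tab->nmembers = 0;
}

/* Add a site; return its new environment ID, or -1 if out of memory. */
int
tab_add(machtab_t *tab, const char *host, int port, int fd)
{
	member_t *m;
	size_t len = strlen(host) + 1;

	if ((m = malloc(sizeof(*m))) == NULL)
		return (-1);
	if ((m->host = malloc(len)) == NULL) {
		free(m);
		return (-1);
	}
	memcpy(m->host, host, len);
	m->port = port;
	m->fd = fd;
	m->eid = tab->next_eid++;
	m->next = tab->head;	/* Push onto the front of the list. */
	tab->head = m;
	tab->nmembers++;
	return (m->eid);
}

/* Free every member and leave the table empty. */
void
tab_destroy(machtab_t *tab)
{
	member_t *m, *next;

	for (m = tab->head; m != NULL; m = next) {
		next = m->next;
		free(m->host);
		free(m);
	}
	tab->head = NULL;
	tab->nmembers = 0;
}
```

<p>An application would typically store a pointer to such a table where
every function holding the environment handle can reach it, as the
text above describes for the app_private field.</p>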
<p>This design is particular to this application and communication
infrastructure, but it indicates the sort of functionality
that is needed to maintain the application-specific state for a
TCP/IP-based infrastructure.  The goal of the table and its interfaces
is threefold.  First, given an environment ID,
the send function must be able to send a message to the appropriate place.  Second,
when given the special environment ID <a href="/api_c/rep_transport.html#DB_EID_BROADCAST">DB_EID_BROADCAST</a>, the send
function must be able to send messages to all the machines in the group.  Third,
upon receipt of an incoming message, the receive function must correctly
identify the sender and pass the appropriate environment ID to the
<a href="/api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a> method.</p>
<p>Mapping a particular environment ID to a specific port is accomplished
by looping through the linked list until the desired environment ID is
found.  Broadcast communication is implemented by looping through the
linked list and sending to each member found.  Because each port
communicates with only a single other environment, receipt of a message
on a particular port precisely identifies the sender.</p>
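<p>The lookup and broadcast loops can be illustrated with a small
sketch.  It is not the ex_rep_base code: the names lookup_fd, send_msg,
send_fn and record_send are hypothetical, SKETCH_EID_BROADCAST is a
local stand-in for <a href="/api_c/rep_transport.html#DB_EID_BROADCAST">DB_EID_BROADCAST</a> so the example compiles
without the Berkeley DB headers, and the actual write to a socket is
replaced by a caller-supplied function.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for DB_EID_BROADCAST so the sketch compiles without db.h. */
#define	SKETCH_EID_BROADCAST	(-1)

/* Reduced member entry: only the fields the loops below need. */
typedef struct member {
	int eid;		/* Environment ID of the remote site. */
	int fd;			/* Descriptor connected to that site. */
	struct member *next;
} member_t;

/* Transport hook: returns 0 on success, non-zero on failure. */
typedef int (*send_fn)(int fd, const void *buf, size_t len);

/* Return the descriptor for eid, or -1 if no member has that ID. */
int
lookup_fd(const member_t *head, int eid)
{
	const member_t *m;

	for (m = head; m != NULL; m = m->next)
		if (m->eid == eid)
			return (m->fd);
	return (-1);
}

/*
 * Send to one site, or to every site for the broadcast ID.  Returns the
 * number of sites the message was delivered to, or -1 for an unknown ID.
 */
int
send_msg(const member_t *head, int eid,
    const void *buf, size_t len, send_fn do_send)
{
	const member_t *m;
	int fd, sent = 0;

	assert(do_send != NULL);
	if (eid == SKETCH_EID_BROADCAST) {
		for (m = head; m != NULL; m = m->next)
			if (do_send(m->fd, buf, len) == 0)
				sent++;
		return (sent);
	}
	if ((fd = lookup_fd(head, eid)) < 0)
		return (-1);
	return (do_send(fd, buf, len) == 0 ? 1 : 0);
}

/* Stub transport that records descriptors instead of writing to them. */
int sent_fds[8];
int nsent;

int
record_send(int fd, const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	if (nsent < 8)
		sent_fds[nsent++] = fd;
	return (0);
}
```

<p>Passing the transport as a function pointer keeps the table logic
separate from the socket code, which also makes the loops easy to
exercise without a network.</p>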
<p>This logic is implemented in the quote_send, quote_send_broadcast and
quote_send_one functions, which can be found in
<b>ex_rep/base/rep_net.c</b>.</p>
<p>The example provided is merely one way to satisfy these requirements,
and there are alternative implementations as well.  For instance,
instead of associating a separate socket connection with each remote
environment, an application might label each message with a
sender identifier.  Similarly, instead of looping through a table and sending a
copy of a message to each member of the replication group, the
application could send a single message using a broadcast protocol.</p>
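<p>As an illustration of the first alternative, a message could carry a
small header naming its sender.  The layout below (a 4-byte sender ID
followed by a 4-byte payload length, both big-endian) and the function
names frame_encode and frame_decode are assumptions made for this
sketch; neither Berkeley DB nor ex_rep_base prescribes them.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	FRAME_HDR_SIZE	8	/* 4-byte sender ID + 4-byte payload length. */

/* Store a 32-bit value in big-endian (network) byte order. */
void
put_u32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Load a 32-bit big-endian value. */
uint32_t
get_u32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	    (uint32_t)p[2] << 8 | (uint32_t)p[3]);
}

/*
 * Write [sender][length][payload] into out.  Returns the number of bytes
 * written, or 0 if the output buffer is too small.
 */
size_t
frame_encode(unsigned char *out, size_t cap,
    uint32_t sender, const void *payload, uint32_t len)
{
	assert(out != NULL);
	if (cap < FRAME_HDR_SIZE || cap - FRAME_HDR_SIZE < len)
		return (0);
	put_u32(out, sender);
	put_u32(out + 4, len);
	if (len > 0)
		memcpy(out + FRAME_HDR_SIZE, payload, len);
	return (FRAME_HDR_SIZE + (size_t)len);
}

/*
 * Parse a frame.  On success returns 0 and sets the sender ID, a pointer
 * to the payload inside the input buffer, and the payload length.
 * Returns -1 if the input is truncated.
 */
int
frame_decode(const unsigned char *in, size_t inlen,
    uint32_t *senderp, const unsigned char **payloadp, uint32_t *lenp)
{
	uint32_t len;

	if (inlen < FRAME_HDR_SIZE)
		return (-1);
	len = get_u32(in + 4);
	if (inlen - FRAME_HDR_SIZE < len)
		return (-1);
	*senderp = get_u32(in);
	*payloadp = in + FRAME_HDR_SIZE;
	*lenp = len;
	return (0);
}
```

<p>With such a header, the receiver can recover the sender from the
message itself rather than from the connection it arrived on, so many
sites could share one channel.</p>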
<p>The quote_send function is passed as the callback to
<a href="/api_c/rep_transport.html">DB_ENV-&gt;rep_set_transport</a>; Berkeley DB automatically sends messages as needed
for replication.  The receive function mirrors the quote_send_one
function.  It is not a callback function: the application is responsible
for collecting messages and calling <a href="/api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a> on them as is
convenient.  In the sample application, all messages transmitted are
Berkeley DB messages that get handled by <a href="/api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a>; however, this
need not always be the case.  An application may want to pass
its own messages across the same channels, distinguish between its own
messages and those of Berkeley DB, and then pass only the Berkeley DB ones to
<a href="/api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a>.</p>
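<p>One way to separate the two kinds of traffic is to prefix every
message with a one-byte tag and dispatch on it.  The sketch below
assumes that scheme; the tag values, the msg_handler type, and the
dispatch_msg function are hypothetical.  The replication handler stands
in for whatever code the application uses to call
<a href="/api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a>, which is not invoked here.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-byte tags placed at the start of every message. */
enum msg_kind { MSG_REPLICATION = 1, MSG_APPLICATION = 2 };

/* Handler for a message body received from the site with the given ID. */
typedef int (*msg_handler)(int eid, const unsigned char *body, size_t len);

/*
 * Route a tagged message to the matching handler.  Returns the handler's
 * result, or -1 for an empty message or an unknown tag.
 */
int
dispatch_msg(int eid, const unsigned char *msg, size_t len,
    msg_handler rep_handler, msg_handler app_handler)
{
	assert(rep_handler != NULL && app_handler != NULL);
	if (msg == NULL || len < 1)
		return (-1);
	switch (msg[0]) {
	case MSG_REPLICATION:
		/* Only these bodies would be passed on to Berkeley DB. */
		return (rep_handler(eid, msg + 1, len - 1));
	case MSG_APPLICATION:
		return (app_handler(eid, msg + 1, len - 1));
	default:
		return (-1);
	}
}

/* Counting stubs used to exercise the dispatcher. */
int rep_count;
int app_count;

int
count_rep(int eid, const unsigned char *body, size_t len)
{
	(void)eid;
	(void)body;
	(void)len;
	rep_count++;
	return (0);
}

int
count_app(int eid, const unsigned char *body, size_t len)
{
	(void)eid;
	(void)body;
	(void)len;
	app_count++;
	return (0);
}
```

<p>Rejecting unknown tags rather than guessing keeps malformed or
unexpected traffic away from the replication code.</p>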
<p>The final component of the communication infrastructure is the process
model used to communicate with all the sites in the replication group.
Each site creates a thread of control that listens on its designated
socket (as specified by the <b>-m</b> command line argument) and
then creates a new channel for each site that contacts it.  In addition,
each site explicitly connects to the sites specified in the
<b>-o</b> command line argument.  This is a fairly standard TCP/IP
process architecture, and the functions that implement it are all
in <b>ex_rep/base/rep_net.c</b>.</p>
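<p>The listen and connect halves of this architecture can be sketched
with the POSIX sockets interface.  The sketch is deliberately
simplified and is not the ex_rep_base code: the function names
open_listener, connect_to and accept_one are hypothetical, everything
is bound to the loopback address, and no threads are created.  A real
application would run the accept loop in its own thread and record each
accepted descriptor in its site table.</p>

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a TCP socket listening on the loopback address.  On entry *portp
 * is the requested port (0 asks the system to pick one); on success it is
 * updated to the port actually bound.  Returns the descriptor, or -1.
 */
int
open_listener(unsigned short *portp)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd, on = 1;

	assert(portp != NULL);
	if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	(void)setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(*portp);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
	    listen(fd, 5) != 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
		(void)close(fd);
		return (-1);
	}
	*portp = ntohs(addr.sin_port);
	return (fd);
}

/* Connect to a listener on the loopback address; returns a descriptor or -1. */
int
connect_to(unsigned short port)
{
	struct sockaddr_in addr;
	int fd;

	if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
		(void)close(fd);
		return (-1);
	}
	return (fd);
}

/* Accept one pending connection; the result is the channel to that site. */
int
accept_one(int listen_fd)
{
	return (accept(listen_fd, NULL, NULL));
}
```

<p>Each descriptor returned by accept_one or connect_to corresponds to
one remote site, which is what allows a message's arrival on a given
descriptor to identify its sender.</p>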
<table width="100%"><tr><td><br></td><td align=right><a href="/rep/ex.html"><img src="/images/prev.gif" alt="Prev"></a><a href="/toc.html"><img src="/images/ref.gif" alt="Ref"></a><a href="/rep/ex_rq.html"><img src="/images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle.  All rights reserved.</font>
</body>
</html>