% BEGIN LICENSE BLOCK
% Version: CMPL 1.1
%
% The contents of this file are subject to the Cisco-style Mozilla Public
% License Version 1.1 (the "License"); you may not use this file except
% in compliance with the License.  You may obtain a copy of the License
% at www.eclipse-clp.org/license.
%
% Software distributed under the License is distributed on an "AS IS"
% basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.  See
% the License for the specific language governing rights and limitations
% under the License.
%
% The Original Code is  The ECLiPSe Constraint Logic Programming System.
% The Initial Developer of the Original Code is  Cisco Systems, Inc.
% Portions created by the Initial Developer are
% Copyright (C) 2006 Cisco Systems, Inc.  All Rights Reserved.
%
% Contributor(s):
%
% END LICENSE BLOCK
\section{\eclipse Message Passing Requirements}
\label{sec:eclipse}

Message passing libraries such as P4 \cite{p4:parcom4_94} and PVM
\cite{pvm:parcom4_94} have made the message passing paradigm very
popular for implementing parallel applications, because these
libraries enable parallel applications to run on a wide range
of hardware platforms, including heterogeneous computer networks. A
computer network is, however, quite different from a parallel machine.
On a parallel machine one can easily acquire exclusive access to a
fixed number of processors that interact with relatively low
latency. When using a computer network as a virtual parallel machine
one should take into account that message latencies are much higher
and that the number of processors available to a parallel application
is not constant. The latter is caused by the fact that a personal
workstation should only be regarded as available when its owner is
not using it. A means for adding and removing
processes dynamically is therefore a general requirement, especially
for long-running applications. For parallel \eclipse this is quite
important since its applications may vary greatly in the amount
of resources they require. It therefore makes sense to release machines,
i.e. remove idle workers (with their underlying processes) and free
their associated swap space, when they are not needed for a longer time.
45
Parallel applications in general do not address the problem of removing
and migrating processes, since this may be too complicated, especially
in heterogeneous environments. In parallel \eclipse, idle workers can
be brought into a state in which they do not require any further
communication. They can easily be removed since they do not hold any
state that cannot be reproduced. When a worker has to release its
processor very quickly, e.g. when resident on a workstation whose
user arrives back at the keyboard, the worker needs to migrate. Although
a worker migration mechanism has not been fully designed yet, it is
envisioned that it will basically consist of (1) blocking the communication
channels to the old worker, (2) creating a new worker, (3) recomputing the
state of the old worker in the new worker, (4) forwarding any pending messages
from the old worker to the new worker, (5) replacing the communication
channels to the old worker by communication channels to the new worker,
and (6) unblocking the communication channels to the new worker. It is
therefore likely that worker migration will require facilities for
forwarding messages, blocking and unblocking communication channels, and
adding and removing communication channels.
64
\eclipse workers exchange messages with one another and with the worker
manager. High message latencies, typical of computer networks, may have
a dramatic effect on the overall performance. It is therefore important
that these latencies can be masked, for example by multi-threading and/or
asynchronous communication. The former is not well suited to parallel
\eclipse, because the state copying scheme requires that all the engines
have their data structures (i.e. their state) located at identical areas
in the address space of their worker. A worker with multiple engines and
a thread per engine is therefore not possible. Using a separate thread for
the scheduler would only be a partial solution, since quite often the scheduler
is performing a request on behalf of its engine, at times when the
engine is idle. Parallel \eclipse will therefore have to rely on asynchronous
message passing primitives for hiding considerable network latencies.
78
In parallel \eclipse not all messages have the same importance. Some messages
are so important that they should be acted upon as quickly as possible. An
idle worker looking for work should, for example, be helped out promptly, since
idle workers obviously do not contribute anything to finding a solution to the
search problem at hand. Occasionally a worker runs into an exceptional
situation (e.g. running out of memory) which has to be reported to the
worker manager (and maybe one or more workers) with the highest priority.
Parallel \eclipse therefore requires a mechanism which enables the receiver
to differentiate important messages from less important ones. Furthermore,
very important messages, or {\it exception messages}, should not be delayed
unnecessarily by, for example, low polling frequencies or time consuming message
selection mechanisms. Since it is not known in advance if and when messages
will be sent, it is appealing to avoid message polling and rely on active
messages \cite{am:acm92} or interrupt driven receive primitives. Having to
select the most important messages from a single port with a mixture of all
kinds of messages has been found to be complicated and error prone. With multiple
communication ports per worker, e.g. one for each importance class, parallel
\eclipse would be relieved of the complexities of message
multiplexing and demultiplexing.
98
In addition to the more specific message passing requirements of parallel
\eclipse presented above, there are of course also some general requirements
such as heterogeneity, portability, and efficiency. Luckily, these are
quite well supported by most message passing libraries, including
MPI \cite{mpi:hpcn94,mpi:manual}.