1% BEGIN LICENSE BLOCK
2% Version: CMPL 1.1
3%
4% The contents of this file are subject to the Cisco-style Mozilla Public
5% License Version 1.1 (the "License"); you may not use this file except
6% in compliance with the License.  You may obtain a copy of the License
7% at www.eclipse-clp.org/license.
8%
9% Software distributed under the License is distributed on an "AS IS"
10% basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.  See
11% the License for the specific language governing rights and limitations
12% under the License.
13%
14% The Original Code is  The ECLiPSe Constraint Logic Programming System.
15% The Initial Developer of the Original Code is  Cisco Systems, Inc.
16% Portions created by the Initial Developer are
17% Copyright (C) 1993 - 2006 Cisco Systems, Inc.  All Rights Reserved.
18%
19% Contributor(s):
20%
21% END LICENSE BLOCK
22%
23% @(#)umsmemory.tex	1.4 93/03/29
24%
25%
26% umsmemory.tex
27%
28% REL	DATE	AUTHOR		DESCRIPTION
29%	27.4.90	Joachim Schimpf
30%
31
32\chapter{Memory Organisation And Garbage Collection\label{chapmemory}}
33%HEVEA\cutdef[1]{section}
34\index{memory usage}
35\section{Introduction}
36This chapter may be skipped on a first reading.
37Its purpose is to give the advanced user a better understanding
38of how the system uses memory resources.
39In a high level language like Prolog it is often not obvious for the programmer
40to see where the system allocates or frees memory.
41
42The sizes of the different memory areas can be queried by means of the predicate
43\bipref{statistics/2}{../bips/kernel/env/statistics-2.html} and
44\bipref{statistics/0}{../bips/kernel/env/statistics-0.html} prints a summary of
45all these data.
46Here is a sample output:
47\begin{quote}
48\begin{verbatim}
49[eclipse 1]: statistics.
50
51times:                  [1.12, 0.09, 2.74] seconds
52session_time:           2.74 seconds
53event_time:             2.74 seconds
54global_stack_used:      1936 bytes
55global_stack_allocated: 4456448 bytes
56global_stack_peak:      4456448 bytes
57trail_stack_used:       64 bytes
58trail_stack_allocated:  262144 bytes
59trail_stack_peak:       4456448 bytes
60control_stack_used:     564 bytes
61control_stack_allocated:262144 bytes
62control_stack_peak:     262144 bytes
63local_stack_used:       492 bytes
64local_stack_allocated:  262144 bytes
65local_stack_peak:       262144 bytes
66shared_heap_allocated:  1613824 bytes
67shared_heap_used:       1411000 bytes
68private_heap_allocated: 73728 bytes
69private_heap_used:      36992 bytes
70gc_number:              1
71gc_collected:           23472.0 bytes
72gc_area:                23560 bytes
73gc_ratio:               99.6264855687606 %
74gc_time:                0.0 seconds
75dictionary_entries:     3252
76dict_hash_usage:        2117 / 8192
77dict_hash_collisions:   314 / 2117
78dict_gc_number:         2
79dict_gc_time:           0.01 seconds
80\end{verbatim}
81\end{quote}
82The \notation{used}-figures indicate the actual usage at the moment the
83statistics built-in was called. The \notation{allocated} value is the
84amount of memory that is reserved for this area and actually occupied
85by the {\eclipse} process. The \notation{peak} value indicates what was the
86maximum allocated amount during the session.
87In the following we will discuss the six memory areas mentioned.
88The \notation{gc}-figures are described in section \ref{gc}.
89
90\subsection{The Shared/Private Heap}
91\index{shared heap}
92\index{private heap}
93\index{heap}
94The heap is used to store a variety of data:
95\begin{quote}
96\begin{description}
97\item [compiled code:]
98The heap is used to store compiled Prolog code.
99Consequently its size is increased by the various
100\biptxtref{compile}{compile/1}{../bips/kernel/compiler/compile-1.html}-predicates,
101the \biptxtref{assert}{assert/1}{../bips/kernel/dynamic/assert-1.html}-family
102and by \bipref{load/1}{../bips/kernel/externals/load-1.html}.
103Space is freed when single clauses
104(\biptxtref{retract}{retract/1}{../bips/kernel/dynamic/retract-1.html}) or
105whole predicates
106(\txtbipref{abolish}{abolish/1}{../bips/kernel/compiler/abolish-1.html}) are
107removed from the system.
108Note that space reclaiming is usually delayed in these cases
109(see \bipref{trimcore/0}{../bips/kernel/env/trimcore-0.html}),
110since the removed code may still be under execution.
111Erasing a module also reclaims all the memory occupied by the module's
112predicates.
113\item [non-logical storage:]
114All facilities for storing information across backtracking use the
115heap to do so. This includes the handle-based facilities
116(bags, shelves) as well as the name-based facilities (records, non-logical
117variables and arrays).
118\index{bag}
119\index{shelf}
120\index{non-logical variable}\index{variable!non-logical}
121\index{array}
122As a general rule, when a stored term is overwritten, the space for
123the old value is reclaimed. All memory related to a non-logical store is
124reclaimed when the store is destroyed (e.g., using
125\bipref{erase_array/1}{../bips/kernel/storage/erase_array-1.html},
126\bipref{erase_all/1}{../bips/kernel/record/erase_all-1.html},
127\bipref{bag_abolish/1}{../bips/kernel/storage/bag_abolish-1.html},
128\bipref{shelf_abolish/1}{../bips/kernel/storage/shelf_abolish-1.html}).
129\item [dictionary:]
130The \defnotion{dictionary} is the system's table of atoms and functors.
131The dictionary grows whenever the system encounters an atom or functor that
132has not been mentioned so far.
133The dictionary shrinks on dictionary garbage collections, which are triggered
134automatically after a certain number of new entries has been made
135(see \bipref{set_flag/2}{../bips/kernel/env/set_flag-2.html}).
136The dictionary is designed to hold several thousand entries,
137the current number of entries can be queried with
138\bipref{statistics/0,2}{../bips/kernel/env/statistics-0.html}.
139\item [various descriptors:]
140The system manages a number of other internal tables (for modules, predicates,
141streams, operators, etc.) that are also allocated on the heap.
142This space is reclaimed when the related Prolog objects cease to exist.
143\item [I/O-buffers:]
144When streams are opened, the system allocates buffers from the
145heap. They are freed when the stream is closed.
146\item [allocation in C-externals:]
147If third party libraries or external predicates written in C/C++ call
148\notation{malloc()} or related C library functions, this space is also allocated
149from the heap.  It is the allocating code's responsibility to free
150this space if it becomes unused.
151\end{description}
152\end{quote}
153Note that the distinction between shared and private heap is only relevant for
154parallel \eclipse{} systems, where multiple workers share the shared
155heap, but have their own private heap and stacks.
156
157
158\subsection{The Local Stack}
159\index{local stack}
160\index{stacks}
161The Local Stack is very similar to the call/return stack in procedural
162languages.
163It holds Prolog variables and return addresses.
164Space on this stack is allocated during execution of a clause and deallocated
165before the last subgoal is called (due to tail recursion / last call
166optimisation).
167This deallocation can not be done when the clause exits nondeterministically
168(this can be checked with the debugger or the profiling facility).
169However, if a deallocation has been delayed due to nondeterminism, it is
170finally done when a cut is executed or when execution fails beyond
171the allocation point.
172Hence the ways to limit growth of the local stack are
173\begin{itemize}
174\item use tail recursion where possible;
175\item avoid unnecessary nondeterminism (cf. \ref{controlstack}).
176\end{itemize}
177
178%The maximum size of the Local Stack depends on the {\eclipse} version.
179%On many machines, the Local Stack shares with the Control Stack the
180%area specified by the {\tt -l <size>} option on the {\eclipse} command line.
181%On SUN-3 machines the Local Stack uses the machine stack and the
182%C-shell command {\tt limit stacksize <size>} can be used to specify its size.
183%The Local Stack shares with the Control Stack the area specified by the
184%{\tt -l <size>} option on the {\eclipse} command line.
185
186\subsection{The Control Stack\label{controlstack}}
187\index{control stack}
188The main use of the Control Stack is to store so-called
189\defnotionni{choicepoints}.\index{choicepoint}
190A choicepoint is a description of the system's state at a certain point
191in execution.
192It is created when more than one clause of a predicate apply to a given goal.
193Should the first clause fail, the system will backtrack
194to the place where the choice was made, the old state will be restored
195from the choicepoint and the next clause will be tried.
196Disjunctions (\bipref{;/2}{../bips/kernel/control/O-2.html}) also create
197choicepoints.
198
199The only way to reduce Control Stack usage is to avoid unnecessary
200nondeterminism.
201This is done by writing deterministic predicates in such a way that they
202can be recognised by the system.
203The debugger can help to identify nondeterministic predicates:
204When it displays an \notation{*EXIT} port instead of \notation{EXIT} then the
205predicate
206has left a choicepoint behind.
207In this case it should be checked whether the nondeterminism was intended.
208If not, the predicate can often be made deterministic by
209\begin{itemize}
210\item writing the clause heads such that a matching clause can be more
211easily selected by \emph{indexing};
212\item using the if-then-else construct (\notation{..~->~..~;~..});
213\item deliberate insertion of (green) cuts.
214\end{itemize}
215
216%The maximum size of the Control Stack is specified by the {\tt -l <size>}
217%option on the {\eclipse} command line. Depending on the version, this space
218%may be shared with the Local Stack.
219
220\subsection{The Global Stack}
221\index{global stack}
222The Global Stack holds Prolog structures, lists, strings and long numbers.
223So the user's selection of data structures is largely responsible
224for the growth of this stack (cf. \ref{chapstring}).
225In coroutining mode, delayed goals also consume space on the Global Stack.
226It also stores source variable names for terms which
227were read in with the flag \notation{variable_names} being \notation{on}.
228When this feature is not needed, it should be turned off
229so that space on the global stack is saved.
230
231The global stack grows while a program creates data structures.
232It is popped only on failure. {\eclipse} therefore provides a garbage collector
233for the Global Stack which is called when a certain amount
234of new space has been consumed. See section \ref{gc} for how this process
235can be controlled.
236Note again that unnecessary nondeterminism reduces the amount of garbage
237that can be reclaimed and should therefore be avoided.
238
239%The total size of the Global Stack (plus the Trail Stack) is set by
240%the {\tt -g <size>} option on the {\eclipse} command line.
241
242\subsection{The Trail Stack}
243\index{trail stack}
244The Trail Stack is used to record information that is needed on backtracking.
245It is therefore closely related to the Control Stack.
246Ways to reduce Trail Stack consumption are
247\begin{itemize}
248\item avoid unnecessary nondeterminism;
249\item supply \notation{mode} declarations.
250\end{itemize}
251The Trail Stack is popped on failure and
252is garbage collected together with the Global Stack.
253%Its maximum size depends on the {\tt -g <size>} option
254%since this space is shared between Global and Trail Stack.
255
256\section{Garbage collection\label{gc}}
257\index{garbage collection}
258
259The four stacks grow an shrink as needed.\footnote{%
260  Provided that the underlying operating system supports this.}
261In addition, {\eclipse} provides an incremental garbage collector
262for the global and the trail stack.
263It is also equipped with a dictionary garbage
264collector that frees memory that is occupied by obsolete atoms and functors.
265Both collectors are switched on by default and are automatically invoked
266from time to time.
267Nevertheless, there are some predicates to control their action.
268The following predicate calls affect both collectors:
269\begin{quote}
270\begin{description}
271\item [set_flag(gc,~on):]
272\indextt{set_flag/2}
273	Enable the garbage collector (the default).
274\item [set_flag(gc,~verbose):]
275	The same as 'on', but print a message on every collection
276	(the message goes to toplevel_output):
277{\small
278\begin{verbatim}
279GC ... global: 96208 - 88504 (92.0 %), trail: 500 - 476 (95.2 %), time: 0.017
280\end{verbatim}
281}
282	It displays the area to be searched for garbage, the amount
283	and percentage of garbage, and the time for the collection.
284	The message of the dictionary collector is as follows:
285{\small
286\begin{verbatim}
287DICTIONARY GC ... 2814 - 653, (23.2 %), time: 0.033
288\end{verbatim}
289}
290	It displays the number of dictionary entries before the collection,
291	the number of collected entries, the percentage of reduction and
292	the collection time.
293\item [set_flag(gc,~off):]
294	Disable the garbage collector (and risk an overflow), e.g., for
295	time-critical execution sequences.
296\end{description}
297\end{quote}
298\noindent Predicate calls related to the stack collector are:
299\begin{quote}
300\begin{description}
301\item [set_flag(gc_policy,~adaptive):]
302	This option affects the triggering heuristics of the garbage
303	collector, together with the \notation{gc_interval} setting.
304	The \notation{adaptive} policy (the default) minimises garbage
305        collection time.
306\item [set_flag(gc_policy,~fixed):] As above, but the \notation{fixed} policy
307        minimises space consumption.
308\item [set_flag(gc_interval,~\pattern{Nbytes}):]
309	Specify how often the collector is to be invoked. Roughly,
310        \about{Nbytes}
311	is the number of bytes that your program can use up before
312	a garbage collection is triggered.
313	There may be programs that create lots of (useful) lists and
314	structures while leaving few garbage. This will cause the garbage
315	collector to run frequently while reclaiming little space.
316	If you suspect this, you should call \predspec{statistics/0} and check
317	the garbage ratio. If it is very low (say below 50\%) it may
318	make sense to increase the \notation{gc_interval}, thus reducing
319	the number of garbage collections.  This is normally only
320	necessary when the \notation{gc_policy} is set to fixed.  With
321        \notation{gc_policy}
322	set to adaptive, the collection intervals will be adjusted
323	automatically.
324\item [garbage_collect:]
325\indextt{garbage_collect/0}
326	Request an immediate collection (only if enabled). The use of this
327	predicate should be restricted to situations where the automatic
328	triggering performs badly. It should then be inserted in a place
329	where you know for sure that you have just created a lot of garbage,
330	e.g., before the tail-recursive call in something like
331\begin{verbatim}
332cycle(OldState) :-
333        transform(OldState, NewState),  /* long computation     */
334        !,
335        garbage_collect,                /* OldState is obsolete */
336        cycle(NewState).
337\end{verbatim}
338\item [statistics(gc_number,~\pattern{N}):]
339\indextt{statistics/2}
340	The number of stack garbage collections performed during this {\eclipse}
341        session.
342\item [statistics(gc_collected,~\pattern{Bytes}):]
343	The amount of global stack space (in bytes) reclaimed by all the
344	garbage collections.
345\item [statistics(gc_area,~\pattern{Bytes}):]
346	The average global stack area that was scanned by each garbage
347	collection. This number should be close to the amount selected with
348	\notation{gc_interval}, if it is much larger, \notation{gc_interval}
349	should be increased.
350\item [statistics(gc_ratio,~\pattern{Percentage}):]
351	The average percentage of garbage found and reclaimed by each
352	garbage collection.
353	If this ratio is low, \notation{gc_interval} should be increased.
354\item [statistics(gc_time,~\pattern{Seconds}):]
355	The total cpu time spent during all garbage collections.
356\end{description}
357\end{quote}
358
359\noindent Predicates related to the dictionary collector are:
360\begin{quote}
361\begin{description}
362\item [set_flag(gc_interval_dict,~\pattern{N}):]
363	Specify that the dictionary collector should be invoked after
364	\about{N} new dictionary entries have been made.
365\item [statistics(dict_gc_number,~\pattern{N}):]
366    The number of garbage collections performed on the dictionary during this
367    {\eclipse} session.
368\item [statistics(dict_gc_time,~\pattern{Seconds}):]
369    The total cpu time spent by all dictionary garbage collections.
370\end{description}
371\end{quote}
372
373%HEVEA\cutend
374