1% BEGIN LICENSE BLOCK 2% Version: CMPL 1.1 3% 4% The contents of this file are subject to the Cisco-style Mozilla Public 5% License Version 1.1 (the "License"); you may not use this file except 6% in compliance with the License. You may obtain a copy of the License 7% at www.eclipse-clp.org/license. 8% 9% Software distributed under the License is distributed on an "AS IS" 10% basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See 11% the License for the specific language governing rights and limitations 12% under the License. 13% 14% The Original Code is The ECLiPSe Constraint Logic Programming System. 15% The Initial Developer of the Original Code is Cisco Systems, Inc. 16% Portions created by the Initial Developer are 17% Copyright (C) 1993 - 2006 Cisco Systems, Inc. All Rights Reserved. 18% 19% Contributor(s): 20% 21% END LICENSE BLOCK 22% 23% @(#)umsmemory.tex 1.4 93/03/29 24% 25% 26% umsmemory.tex 27% 28% REL DATE AUTHOR DESCRIPTION 29% 27.4.90 Joachim Schimpf 30% 31 32\chapter{Memory Organisation And Garbage Collection\label{chapmemory}} 33%HEVEA\cutdef[1]{section} 34\index{memory usage} 35\section{Introduction} 36This chapter may be skipped on a first reading. 37Its purpose is to give the advanced user a better understanding 38of how the system uses memory resources. 39In a high level language like Prolog it is often not obvious for the programmer 40to see where the system allocates or frees memory. 41 42The sizes of the different memory areas can be queried by means of the predicate 43\bipref{statistics/2}{../bips/kernel/env/statistics-2.html} and 44\bipref{statistics/0}{../bips/kernel/env/statistics-0.html} prints a summary of 45all these data. 46Here is a sample output: 47\begin{quote} 48\begin{verbatim} 49[eclipse 1]: statistics. 50 51times: [1.12, 0.09, 2.74] seconds 52session_time: 2.74 seconds 53event_time: 2.74 seconds 54global_stack_used: 1936 bytes 55global_stack_allocated: 4456448 bytes 56global_stack_peak: 4456448 bytes 57trail_stack_used: 64 bytes 58trail_stack_allocated: 262144 bytes 59trail_stack_peak: 4456448 bytes 60control_stack_used: 564 bytes 61control_stack_allocated:262144 bytes 62control_stack_peak: 262144 bytes 63local_stack_used: 492 bytes 64local_stack_allocated: 262144 bytes 65local_stack_peak: 262144 bytes 66shared_heap_allocated: 1613824 bytes 67shared_heap_used: 1411000 bytes 68private_heap_allocated: 73728 bytes 69private_heap_used: 36992 bytes 70gc_number: 1 71gc_collected: 23472.0 bytes 72gc_area: 23560 bytes 73gc_ratio: 99.6264855687606 % 74gc_time: 0.0 seconds 75dictionary_entries: 3252 76dict_hash_usage: 2117 / 8192 77dict_hash_collisions: 314 / 2117 78dict_gc_number: 2 79dict_gc_time: 0.01 seconds 80\end{verbatim} 81\end{quote} 82The \notation{used}-figures indicate the actual usage at the moment the 83statistics built-in was called. The \notation{allocated} value is the 84amount of memory that is reserved for this area and actually occupied 85by the {\eclipse} process. The \notation{peak} value indicates what was the 86maximum allocated amount during the session. 87In the following we will discuss the six memory areas mentioned. 88The \notation{gc}-figures are described in section \ref{gc}. 89 90\subsection{The Shared/Private Heap} 91\index{shared heap} 92\index{private heap} 93\index{heap} 94The heap is used to store a variety of data: 95\begin{quote} 96\begin{description} 97\item [compiled code:] 98The heap is used to store compiled Prolog code. 99Consequently its size is increased by the various 100\biptxtref{compile}{compile/1}{../bips/kernel/compiler/compile-1.html}-predicates, 101the \biptxtref{assert}{assert/1}{../bips/kernel/dynamic/assert-1.html}-family 102and by \bipref{load/1}{../bips/kernel/externals/load-1.html}. 103Space is freed when single clauses 104(\biptxtref{retract}{retract/1}{../bips/kernel/dynamic/retract-1.html}) or 105whole predicates 106(\txtbipref{abolish}{abolish/1}{../bips/kernel/compiler/abolish-1.html}) are 107removed from the system. 108Note that space reclaiming is usually delayed in these cases 109(see \bipref{trimcore/0}{../bips/kernel/env/trimcore-0.html}), 110since the removed code may still be under execution. 111Erasing a module also reclaims all the memory occupied by the module's 112predicates. 113\item [non-logical storage:] 114All facilities for storing information across backtracking use the 115heap to do so. This includes the handle-based facilities 116(bags, shelves) as well as the name-based facilities (records, non-logical 117variables and arrays). 118\index{bag} 119\index{shelf} 120\index{non-logical variable}\index{variable!non-logical} 121\index{array} 122As a general rule, when a stored term is overwritten, the space for 123the old value is reclaimed. All memory related to a non-logical store is 124reclaimed when the store is destroyed (e.g., using 125\bipref{erase_array/1}{../bips/kernel/storage/erase_array-1.html}, 126\bipref{erase_all/1}{../bips/kernel/record/erase_all-1.html}, 127\bipref{bag_abolish/1}{../bips/kernel/storage/bag_abolish-1.html}, 128\bipref{shelf_abolish/1}{../bips/kernel/storage/shelf_abolish-1.html}). 129\item [dictionary:] 130The \defnotion{dictionary} is the system's table of atoms and functors. 131The dictionary grows whenever the system encounters an atom or functor that 132has not been mentioned so far. 133The dictionary shrinks on dictionary garbage collections, which are triggered 134automatically after a certain number of new entries has been made 135(see \bipref{set_flag/2}{../bips/kernel/env/set_flag-2.html}). 136The dictionary is designed to hold several thousand entries, 137the current number of entries can be queried with 138\bipref{statistics/0,2}{../bips/kernel/env/statistics-0.html}. 139\item [various descriptors:] 140The system manages a number of other internal tables (for modules, predicates, 141streams, operators, etc.) that are also allocated on the heap. 142This space is reclaimed when the related Prolog objects cease to exist. 143\item [I/O-buffers:] 144When streams are opened, the system allocates buffers from the 145heap. They are freed when the stream is closed. 146\item [allocation in C-externals:] 147If third party libraries or external predicates written in C/C++ call 148\notation{malloc()} or related C library functions, this space is also allocated 149from the heap. It is the allocating code's responsibility to free 150this space if it becomes unused. 151\end{description} 152\end{quote} 153Note that the distinction between shared and private heap is only relevant for 154parallel \eclipse{} systems, where multiple workers share the shared 155heap, but have their own private heap and stacks. 156 157 158\subsection{The Local Stack} 159\index{local stack} 160\index{stacks} 161The Local Stack is very similar to the call/return stack in procedural 162languages. 163It holds Prolog variables and return addresses. 164Space on this stack is allocated during execution of a clause and deallocated 165before the last subgoal is called (due to tail recursion / last call 166optimisation). 167This deallocation can not be done when the clause exits nondeterministically 168(this can be checked with the debugger or the profiling facility). 169However, if a deallocation has been delayed due to nondeterminism, it is 170finally done when a cut is executed or when execution fails beyond 171the allocation point. 172Hence the ways to limit growth of the local stack are 173\begin{itemize} 174\item use tail recursion where possible; 175\item avoid unnecessary nondeterminism (cf. \ref{controlstack}). 176\end{itemize} 177 178%The maximum size of the Local Stack depends on the {\eclipse} version. 179%On many machines, the Local Stack shares with the Control Stack the 180%area specified by the {\tt -l <size>} option on the {\eclipse} command line. 181%On SUN-3 machines the Local Stack uses the machine stack and the 182%C-shell command {\tt limit stacksize <size>} can be used to specify its size. 183%The Local Stack shares with the Control Stack the area specified by the 184%{\tt -l <size>} option on the {\eclipse} command line. 185 186\subsection{The Control Stack\label{controlstack}} 187\index{control stack} 188The main use of the Control Stack is to store so-called 189\defnotionni{choicepoints}.\index{choicepoint} 190A choicepoint is a description of the system's state at a certain point 191in execution. 192It is created when more than one clause of a predicate apply to a given goal. 193Should the first clause fail, the system will backtrack 194to the place where the choice was made, the old state will be restored 195from the choicepoint and the next clause will be tried. 196Disjunctions (\bipref{;/2}{../bips/kernel/control/O-2.html}) also create 197choicepoints. 198 199The only way to reduce Control Stack usage is to avoid unnecessary 200nondeterminism. 201This is done by writing deterministic predicates in such a way that they 202can be recognised by the system. 203The debugger can help to identify nondeterministic predicates: 204When it displays an \notation{*EXIT} port instead of \notation{EXIT} then the 205predicate 206has left a choicepoint behind. 207In this case it should be checked whether the nondeterminism was intended. 208If not, the predicate can often be made deterministic by 209\begin{itemize} 210\item writing the clause heads such that a matching clause can be more 211easily selected by \emph{indexing}; 212\item using the if-then-else construct (\notation{..~->~..~;~..}); 213\item deliberate insertion of (green) cuts. 214\end{itemize} 215 216%The maximum size of the Control Stack is specified by the {\tt -l <size>} 217%option on the {\eclipse} command line. Depending on the version, this space 218%may be shared with the Local Stack. 219 220\subsection{The Global Stack} 221\index{global stack} 222The Global Stack holds Prolog structures, lists, strings and long numbers. 223So the user's selection of data structures is largely responsible 224for the growth of this stack (cf. \ref{chapstring}). 225In coroutining mode, delayed goals also consume space on the Global Stack. 226It also stores source variable names for terms which 227were read in with the flag \notation{variable_names} being \notation{on}. 228When this feature is not needed, it should be turned off 229so that space on the global stack is saved. 230 231The global stack grows while a program creates data structures. 232It is popped only on failure. {\eclipse} therefore provides a garbage collector 233for the Global Stack which is called when a certain amount 234of new space has been consumed. See section \ref{gc} for how this process 235can be controlled. 236Note again that unnecessary nondeterminism reduces the amount of garbage 237that can be reclaimed and should therefore be avoided. 238 239%The total size of the Global Stack (plus the Trail Stack) is set by 240%the {\tt -g <size>} option on the {\eclipse} command line. 241 242\subsection{The Trail Stack} 243\index{trail stack} 244The Trail Stack is used to record information that is needed on backtracking. 245It is therefore closely related to the Control Stack. 246Ways to reduce Trail Stack consumption are 247\begin{itemize} 248\item avoid unnecessary nondeterminism; 249\item supply \notation{mode} declarations. 250\end{itemize} 251The Trail Stack is popped on failure and 252is garbage collected together with the Global Stack. 253%Its maximum size depends on the {\tt -g <size>} option 254%since this space is shared between Global and Trail Stack. 255 256\section{Garbage collection\label{gc}} 257\index{garbage collection} 258 259The four stacks grow an shrink as needed.\footnote{% 260 Provided that the underlying operating system supports this.} 261In addition, {\eclipse} provides an incremental garbage collector 262for the global and the trail stack. 263It is also equipped with a dictionary garbage 264collector that frees memory that is occupied by obsolete atoms and functors. 265Both collectors are switched on by default and are automatically invoked 266from time to time. 267Nevertheless, there are some predicates to control their action. 268The following predicate calls affect both collectors: 269\begin{quote} 270\begin{description} 271\item [set_flag(gc,~on):] 272\indextt{set_flag/2} 273 Enable the garbage collector (the default). 274\item [set_flag(gc,~verbose):] 275 The same as 'on', but print a message on every collection 276 (the message goes to toplevel_output): 277{\small 278\begin{verbatim} 279GC ... global: 96208 - 88504 (92.0 %), trail: 500 - 476 (95.2 %), time: 0.017 280\end{verbatim} 281} 282 It displays the area to be searched for garbage, the amount 283 and percentage of garbage, and the time for the collection. 284 The message of the dictionary collector is as follows: 285{\small 286\begin{verbatim} 287DICTIONARY GC ... 2814 - 653, (23.2 %), time: 0.033 288\end{verbatim} 289} 290 It displays the number of dictionary entries before the collection, 291 the number of collected entries, the percentage of reduction and 292 the collection time. 293\item [set_flag(gc,~off):] 294 Disable the garbage collector (and risk an overflow), e.g., for 295 time-critical execution sequences. 296\end{description} 297\end{quote} 298\noindent Predicate calls related to the stack collector are: 299\begin{quote} 300\begin{description} 301\item [set_flag(gc_policy,~adaptive):] 302 This option affects the triggering heuristics of the garbage 303 collector, together with the \notation{gc_interval} setting. 304 The \notation{adaptive} policy (the default) minimises garbage 305 collection time. 306\item [set_flag(gc_policy,~fixed):] As above, but the \notation{fixed} policy 307 minimises space consumption. 308\item [set_flag(gc_interval,~\pattern{Nbytes}):] 309 Specify how often the collector is to be invoked. Roughly, 310 \about{Nbytes} 311 is the number of bytes that your program can use up before 312 a garbage collection is triggered. 313 There may be programs that create lots of (useful) lists and 314 structures while leaving few garbage. This will cause the garbage 315 collector to run frequently while reclaiming little space. 316 If you suspect this, you should call \predspec{statistics/0} and check 317 the garbage ratio. If it is very low (say below 50\%) it may 318 make sense to increase the \notation{gc_interval}, thus reducing 319 the number of garbage collections. This is normally only 320 necessary when the \notation{gc_policy} is set to fixed. With 321 \notation{gc_policy} 322 set to adaptive, the collection intervals will be adjusted 323 automatically. 324\item [garbage_collect:] 325\indextt{garbage_collect/0} 326 Request an immediate collection (only if enabled). The use of this 327 predicate should be restricted to situations where the automatic 328 triggering performs badly. It should then be inserted in a place 329 where you know for sure that you have just created a lot of garbage, 330 e.g., before the tail-recursive call in something like 331\begin{verbatim} 332cycle(OldState) :- 333 transform(OldState, NewState), /* long computation */ 334 !, 335 garbage_collect, /* OldState is obsolete */ 336 cycle(NewState). 337\end{verbatim} 338\item [statistics(gc_number,~\pattern{N}):] 339\indextt{statistics/2} 340 The number of stack garbage collections performed during this {\eclipse} 341 session. 342\item [statistics(gc_collected,~\pattern{Bytes}):] 343 The amount of global stack space (in bytes) reclaimed by all the 344 garbage collections. 345\item [statistics(gc_area,~\pattern{Bytes}):] 346 The average global stack area that was scanned by each garbage 347 collection. This number should be close to the amount selected with 348 \notation{gc_interval}, if it is much larger, \notation{gc_interval} 349 should be increased. 350\item [statistics(gc_ratio,~\pattern{Percentage}):] 351 The average percentage of garbage found and reclaimed by each 352 garbage collection. 353 If this ratio is low, \notation{gc_interval} should be increased. 354\item [statistics(gc_time,~\pattern{Seconds}):] 355 The total cpu time spent during all garbage collections. 356\end{description} 357\end{quote} 358 359\noindent Predicates related to the dictionary collector are: 360\begin{quote} 361\begin{description} 362\item [set_flag(gc_interval_dict,~\pattern{N}):] 363 Specify that the dictionary collector should be invoked after 364 \about{N} new dictionary entries have been made. 365\item [statistics(dict_gc_number,~\pattern{N}):] 366 The number of garbage collections performed on the dictionary during this 367 {\eclipse} session. 368\item [statistics(dict_gc_time,~\pattern{Seconds}):] 369 The total cpu time spent by all dictionary garbage collections. 370\end{description} 371\end{quote} 372 373%HEVEA\cutend 374