@c Copyright (C) 2002, 2003, 2004
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too complicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at.  All @code{struct} and
@code{union} declarations that define data structures that are
allocated under control of the garbage collector must be marked.  All
global variables that hold pointers to garbage-collected memory must
also be marked.  Finally, all global variables that need to be saved
and restored by a precompiled header must be marked.  (The precompiled
header mechanism can only save static variables if they're scalar.
Complex data structures must be allocated in garbage-collected memory
to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed.  The outer double parentheses
are still necessary, though: @code{GTY(())}.  Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@};

typedef struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

@menu
* GTY Options::		What goes inside a @code{GTY(())}.
* GGC Roots::		Making global variables GGC roots.
* Files::		How the generated files work.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure.  Extra information can be provided with @code{GTY} options
and additional markers.  Some options take a parameter, which may be
either a string or a type name, depending on the parameter.  If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter.  For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code.  Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]...} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  ...
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}.  When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.
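
To make the substitution concrete, here is a minimal sketch in C of how
such escapes could be expanded in an option string.  It is not the code
that the type machinery uses; the function @code{expand_escapes} and its
parameters are invented for this illustration.
@verbatim
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: copy TMPL into OUT, replacing
   the escapes %h, %1, %0 and %a with the given strings.  Returns 0 on
   success and -1 if OUT is too small.  */
int
expand_escapes (const char *tmpl, const char *h, const char *one,
                const char *zero, const char *a,
                char *out, size_t outsz)
{
  size_t used = 0;

  assert (tmpl != NULL && out != NULL && outsz > 0);
  while (*tmpl != '\0')
    {
      const char *piece = NULL;
      size_t len = 1;

      if (tmpl[0] == '%')
        switch (tmpl[1])
          {
          case 'h': piece = h; break;
          case '1': piece = one; break;
          case '0': piece = zero; break;
          case 'a': piece = a; break;
          default: break;
          }

      if (piece != NULL)
        {
          len = strlen (piece);
          tmpl += 2;            /* skip the two-character escape */
        }
      else
        piece = tmpl++;         /* ordinary character, copied as is */

      if (used + len >= outsz)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}
@end verbatim
@noindent
With the values from the @code{b.foo[11]} example, the string
@samp{%h.num_elem} would come out as @samp{b.foo[11].num_elem}.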

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two places where the type machinery must be told explicitly
the length of an array.  The first case is when a structure ends in a
variable-length array, like this:
@smallexample
struct rtvec_def GTY(()) @{
  int num_elem;		/* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}).  The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
@noindent
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.
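
As an illustration, the following fragment shows where such a marker
would go.  All of the names are invented for this example, and the
@code{scratch} member is assumed never to be used to reach
garbage-collected memory.
@smallexample
struct example_node GTY(())
@{
  int kind;
  union example_u @{
    tree GTY ((tag ("0"))) decl;
    void * GTY ((skip)) scratch;   /* @r{never used, so ignored} */
  @} GTY ((desc ("%1.kind"))) u;
@};
@end smallexample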

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active.  This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different.  If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one; otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates.  Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1.  If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.

@findex param_is
@findex use_param
@item param_is (@var{type})
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type.  @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one would
write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample
@noindent
and then declare variables like this:
@smallexample
  static GTY ((param_is (union tree_node))) htab_t ict;
@end smallexample

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is (@var{type})
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one.  When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure.  This is done by marking the pointer with the
@code{use_params} option.
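
A sketch of how this might look follows.  The structure names are
hypothetical; only the placement of the options is the point.
@smallexample
struct example_node GTY(())
@{
  void * GTY ((use_param)) value;
  struct example_node * GTY ((use_params)) next;
@};

struct example_list GTY(())
@{
  /* @r{pointer to a parameterized structure: say so explicitly} */
  struct example_node * GTY ((use_params)) head;
@};
@end smallexample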

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead.  This is used
to keep a list of free structures around for re-use.
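
For instance, a free list might be declared as follows, where
@code{struct example_node} and @code{free_node_list} are names made up
for this example:
@smallexample
static GTY ((deletable)) struct example_node *free_node_list;
@end smallexample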

@findex if_marked
@item if_marked ("@var{expression}")

Suppose you want some kinds of object to be unique, and so you put them
in a hash table.  If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away.  GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry.  If the routine returns nonzero, the hash table entry will
be marked as usual.  If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.
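
A declaration in this usual form might look like the following sketch,
in which the variable name @code{unique_nodes} is hypothetical and the
table is assumed to hold trees:
@smallexample
static GTY ((if_marked ("ggc_marked_p"), param_is (union tree_node)))
  htab_t unique_nodes;
@end smallexample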

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}.  This is used to avoid requiring
backends to define certain optional structures.  It doesn't work with
language frontends.

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object.  Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object.  So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used.  @var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer.  The pointer will be available using the @code{%h}
escape.
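
The following fragment sketches one possible use.  Every name in it is
hypothetical, including the macro @code{EXAMPLE_OUTER_FROM_BODY}, which
is assumed to recover the enclosing object from a pointer to its
@code{body} member.
@smallexample
struct example_outer GTY(())
@{
  int header;
  struct example_inner body;
@};

struct example_user GTY(())
@{
  struct example_inner * GTY ((nested_ptr (struct example_outer,
      "%h ? &%h->body : NULL",
      "%h ? EXAMPLE_OUTER_FROM_BODY (%h) : NULL"))) current_body;
@};
@end smallexample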

@findex chain_next
@findex chain_prev
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it.  @code{chain_next} is an expression for the next item in the list,
and @code{chain_prev} is an expression for the previous item.  For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both.  The machinery requires that taking the next item of the
previous item gives the original item.
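
For a doubly linked list, the markers might be written as in this
sketch, where the structure and its field names are invented for the
example:
@smallexample
struct example_item GTY ((chain_next ("%h.next"), chain_prev ("%h.prev")))
@{
  struct example_item *next;
  struct example_item *prev;
  tree payload;
@};
@end smallexample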

@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers.  If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to.  The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure.  The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value.  The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers.  @code{reorder} functions can be expensive.  When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string, instead.
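
As an illustration only, a @code{reorder} function might look like the
following sketch.  The structure @code{ordered_pair}, the function
@code{reorder_ordered_pair}, and the test helpers @code{demo_region} and
@code{demo_mirror} are all hypothetical.  The @code{gt_pointer_operator}
typedef is written out on the assumption that the routine takes the
address of a pointer and a cookie, and the first argument is assumed to
be the object itself.
@verbatim
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of the routine type; written out here only so that
   the sketch is self-contained.  */
typedef void (*gt_pointer_operator) (void *, void *);

/* Hypothetical object that requires lo to be at or below hi.  */
struct ordered_pair
{
  void *lo;
  void *hi;
};

/* Hypothetical reorder function.  OBJ is taken to be the object
   itself; the second argument is ignored.  */
void
reorder_ordered_pair (void *obj, void *ignored,
                      gt_pointer_operator op, void *cookie)
{
  struct ordered_pair *p = (struct ordered_pair *) obj;
  void *new_lo, *new_hi;

  (void) ignored;
  assert (p != NULL && op != NULL);

  /* Work on copies: the fields still hold their old values, and
     the caller will update them afterwards.  */
  new_lo = p->lo;
  new_hi = p->hi;
  op (&new_lo, cookie);
  op (&new_hi, cookie);

  /* Swap now so that the order is right once the pointers change.  */
  if ((uintptr_t) new_hi < (uintptr_t) new_lo)
    {
      void *tmp = p->lo;
      p->lo = p->hi;
      p->hi = tmp;
    }
}

/* Hypothetical stand-in for the routine the machinery would pass:
   it mirrors an address within the region described by COOKIE.  */
struct demo_region
{
  char *base;
  size_t size;
};

void
demo_mirror (void *ptr_p, void *cookie)
{
  struct demo_region *r = (struct demo_region *) cookie;
  void **pp = (void **) ptr_p;
  size_t off = (size_t) ((char *) *pp - r->base);

  *pp = r->base + (r->size - 1 - off);
}
@end verbatim
@noindent
The sketch never writes new pointer values into the object; it only
rearranges the existing ones so that the required order holds after the
pointers have been changed.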

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery.  The parameter is the name of the
special case.  See @file{gengtype.c} for further details.  Avoid
adding new special cases unless there is no other alternative.
@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at.  Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted.  There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition.  Or, if the variable is only used in one file, make it
@code{static}.
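
The following sketch shows both accepted forms.  The variable names and
the header name mentioned in the comment are hypothetical.
@smallexample
/* @r{In a header, say @file{example.h}: mark the declaration.}  */
extern GTY(()) tree example_global_list;

/* @r{In the source file that defines it: no marker.}  */
tree example_global_list;

/* @r{A root used in only one file.}  */
static GTY(()) tree example_local_cache;
@end smallexample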

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
there are three things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans.  There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}.  For C, the file is @file{c-config-lang.in}.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file is a header file, you'll need to check that it's included
in the right place to be visible to the generated files.  For a back-end
header file, this should be done automatically.  For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}.  For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}.  The
generated header file should be included after everything else in the
source file.  Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate
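
The naming rule for the generated header can be sketched in C as
follows.  The helper @code{gt_header_name} is hypothetical and is not
part of the type machinery.  It assumes, as the @file{cp/parser.c}
example suggests, that the source file's extension is dropped before
@file{.h} is appended.
@verbatim
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: derive the name of the
   generated header for the source file PATH, given relative to the
   gcc directory.  Returns 0 on success, -1 if OUT is too small.  */
int
gt_header_name (const char *path, char *out, size_t outsz)
{
  const char *slash, *dot;
  size_t stem, i;

  assert (path != NULL && out != NULL);
  slash = strrchr (path, '/');
  dot = strrchr (path, '.');
  /* Only a dot in the last path component starts an extension.  */
  if (dot != NULL && (slash == NULL || dot > slash))
    stem = (size_t) (dot - path);
  else
    stem = strlen (path);

  /* "gt-" + stem + ".h" + terminating NUL.  */
  if (stem + sizeof "gt-.h" > outsz)
    return -1;

  memcpy (out, "gt-", 3);
  for (i = 0; i < stem; i++)
    out[3 + i] = path[i] == '/' ? '-' : path[i];
  memcpy (out + 3 + stem, ".h", 3);   /* copies the NUL as well */
  return 0;
}
@end verbatim
@noindent
Given @file{cp/parser.c}, the sketch produces @file{gt-cp-parser.h}, the
name that would then appear in an @code{#include} at the end of that
source file.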

For language frontends, there is another file that needs to be included
somewhere.  It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.
437