@c Copyright (C) 2002, 2003, 2004
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too complicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at.  All @code{struct} and
@code{union} declarations that define data structures that are
allocated under control of the garbage collector must be marked.  All
global variables that hold pointers to garbage-collected memory must
also be marked.  Finally, all global variables that need to be saved
and restored by a precompiled header must be marked.  (The precompiled
header mechanism can only save static variables if they're scalar.
Complex data structures must be allocated in garbage-collected memory
to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed.  The outer double parentheses
are still necessary, though: @code{GTY(())}.
Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@};

typedef struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure.  Extra information can be provided with @code{GTY} options
and additional markers.  Some options take a parameter, which may be
either a string or a type name, depending on the parameter.  If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter.
For example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code.  Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]...} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  ...
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}.  When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two places the type machinery will need to be explicitly told
the length of an array.  The first case is when a structure ends in a
variable-length array, like this:
@smallexample
struct rtvec_def GTY(()) @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}).  The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active.  This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different.  If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one, otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates.  Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1.  If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.

@findex param_is
@findex use_param
@item param_is (@var{type})
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type.  @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one would
write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample
and then declare variables like this:
@smallexample
  static GTY ((param_is (union tree_node))) htab_t ict;
@end smallexample

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is (@var{type})
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one.  When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure.  This is done by marking the pointer with the
@code{use_params} option.
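
As a minimal sketch of the pointer case, the markers might look like
the following.  The structure names @code{generic_vec} and
@code{vec_holder} and the variable @code{tree_holder} are hypothetical
and are not part of GCC; the combination of options shown is an
assumption based on the descriptions above.
@smallexample
/* @r{Hypothetical parameterized structure.} */
struct generic_vec GTY(())
@{
  size_t count;
  void ** GTY ((use_param, length ("%h.count"))) items;
@};

/* @r{Hypothetical container holding a pointer to the parameterized}
   @r{structure, so the pointer itself carries use_params.} */
struct vec_holder GTY(())
@{
  struct generic_vec * GTY ((use_params)) vec;
@};

/* @r{The parameter supplied here is meant to reach the items field.} */
static GTY ((param_is (union tree_node))) struct vec_holder *tree_holder;
@end smallexample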

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead.  This is used
to keep a list of free structures around for re-use.

@findex if_marked
@item if_marked ("@var{expression}")

Suppose you want some kinds of object to be unique, and so you put them
in a hash table.  If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away.  GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry.  If the routine returns nonzero, the hash table entry will
be marked as usual.  If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}.  This is used to avoid requiring
backends to define certain optional structures.  It doesn't work with
language frontends.
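
As a sketch, a field pointing to a structure that only some back ends
define could be marked as follows.  The names @code{example_state} and
@code{example_target_info} are hypothetical, chosen only for
illustration:
@smallexample
struct example_state GTY(())
@{
  /* @r{Must stay NULL when no back end defines the structure.} */
  struct example_target_info * GTY ((maybe_undef)) target_info;
@};
@end smallexample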

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object.  Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object.  So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used.  @var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer.  The pointer will be available using the @code{%h}
escape.

@findex chain_next
@findex chain_prev
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it.  @code{chain_next} is an expression for the next item in the list,
@code{chain_prev} is an expression for the previous item.  For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both.  The machinery requires that taking the next item of the
previous item gives the original item.

@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers.
If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to.  The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure.  The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value.  The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers.  @code{reorder} functions can be expensive.  When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string instead.

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery.  The parameter is the name of the
special case.  See @file{gengtype.c} for further details.  Avoid
adding new special cases unless there is no other alternative.
@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at.
Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted.  There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition.  Or, if the variable is only used in one file, make it
@code{static}.

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
there are three things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans.  There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}.  For C, the file is @file{c-config-lang.in}.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file was a header file, you'll need to check that it's included
in the right place to be visible to the generated files.  For a back-end
header file, this should be done automatically.  For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}.  For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}.  The
generated header file should be included after everything else in the
source file.  Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate

For language frontends, there is another file that needs to be included
somewhere.
It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.
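
As a rough sketch of how these pieces fit together for a source file
that is not a header, suppose a hypothetical file @file{cp/example.c}
has already been added to the appropriate @code{gtfiles} variable.
The file name and the variable @code{example_cache} are assumptions
made purely for illustration:
@smallexample
/* @r{cp/example.c (hypothetical file)} */
@dots{}                          /* @r{the usual includes} */

/* @r{A root: a static variable that points to GC memory.} */
static GTY(()) tree example_cache;

@dots{}                          /* @r{the rest of the file} */

/* @r{Generated header, included after everything else.} */
#include "gt-cp-example.h"
@end smallexample
@noindent
Following the naming rule given above, the generated header for this
file would be @file{gt-cp-example.h}.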