\input texinfo @c -*- Texinfo -*-
@setfilename ctf-spec.info
@settitle The CTF File Format
@ifnottex
@xrefautomaticsectiontitle on
@end ifnottex
@synindex fn cp
@synindex tp cp
@synindex vr cp

@copying
Copyright @copyright{} 2021-2022 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License, Version 3 or any
later version published by the Free Software Foundation. A copy of the
license is included in the section entitled ``GNU General Public
License''.

@end copying

@dircategory Software development
@direntry
* CTF: (ctf-spec). The CTF file format.
@end direntry

@titlepage
@title The CTF File Format
@subtitle Version 3
@author Nick Alcock

@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@contents

@ifnottex
@node Top
@top The CTF file format

This manual describes version 3 of the CTF file format, which is
intended to model the C type system in a fashion that C programs can
consume at runtime.
@end ifnottex

@node Overview
@unnumbered Overview
@cindex Overview

The CTF file format compactly describes C types and the association
between function and data symbols and types: if embedded in ELF objects,
it can exploit the ELF string table to reduce duplication further.
There is no real concept of namespacing: only top-level types are
described, not types scoped to within single functions.

CTF dictionaries can be @dfn{children} of other dictionaries, in a
one-level hierarchy: child dictionaries can refer to types in the
parent, but the opposite is not sensible (since if you refer to a child
type in the parent, the actual type you cited would vary depending on
what child was attached).
This parent/child definition is recorded in
the child, but only as a recommendation: users of the API have to attach
parents to children explicitly, and can choose to attach a child to any
parent they like, or to none, though doing so might lead to unpleasant
consequences like dangling references to types. @xref{Type indexes and
type IDs}. Type lookups in child dicts that are not associated with a
parent at all will fail with @code{ECTF_NOPARENT} if a parent type was
needed.

The associated API to generate, merge together, and query this file
format will be described in the accompanying @code{libctf} manual once
it is written. There is no API to modify dictionaries once they've been
written out: CTF is a write-once file format. (However, it is always
possible to dynamically create a new child dictionary on the fly and
attach it to a pre-existing, read-only parent.)

There are two major pieces to CTF: the @dfn{archive} and the
@dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries
@dfn{containers}: the archive format is unique to this variant of CTF.
(Much of the source code still uses the old term.)

The archive file format is a very simple mmappable archive used to group
multiple dictionaries together: it is expected to slowly go
away and be replaced by other mechanisms, but right now it is an
important part of the file format, used to group dictionaries containing
types with conflicting definitions in different TUs with the overarching
dictionary used to store all other types. (Even when archives go away,
the @code{libctf} API used to access them will remain, and access the
other mechanisms that replace it instead.)

The CTF dictionary consists of a @dfn{preamble}, which does not vary
between versions of the CTF file format, and a @dfn{header} and some
number of @dfn{sections}, which can vary between versions.

The rest of this specification describes the format of these sections,
first for the latest version of CTF, then for all earlier versions
supported by @code{libctf}: the earlier versions are defined in terms of
their differences from the next later one. We describe each part of the
format first by reproducing the C structure which defines that part,
then describing it at greater length in terms of file offsets.

The description of the file format ends with a description of relevant
limits that apply to it. These limits can vary between file format
versions.

This document is quite young, so for now the C code in @file{ctf.h}
should be presumed correct when this document conflicts with it.

@node CTF archive
@chapter CTF archives
@cindex archive, CTF archive

The CTF archive format maps names to CTF dictionaries. The names may
contain any character other than \0, but for now archives containing
slashes in the names may not extract correctly. It is possible to
insert multiple members with the same name, but these are quite hard to
access reliably (you have to iterate through all the members rather than
opening by name) so this is not recommended.

CTF archives are not themselves compressed: the constituent components,
CTF dictionaries, can be compressed (@pxref{CTF header}).

CTF archives usually contain a collection of related dictionaries, one
parent and many children of that parent. CTF archives can have a member
with a @dfn{default name}, @code{.ctf} (which can be represented as
@code{NULL} in the API). If present, this member is usually the parent
of all the children, but it is possible for CTF producers to emit
parents with different names if they wish (usually for
backward-compatibility purposes).

@code{.ctf} sections in ELF objects consist of a single CTF dictionary
rather than an archive of dictionaries if and only if the section
contains no types with identical names but conflicting definitions: if
two conflicting definitions exist, the deduplicator will place the type
most commonly referred to by other types in the parent and will place
the other type in a child named after the translation unit it is found
in, and will emit a CTF archive containing both dictionaries instead of
a raw dictionary. All types that refer to such conflicting types are
also placed in the per-translation-unit child.

The definition of an archive in @file{ctf.h} is as follows:

@verbatim
struct ctf_archive
{
  uint64_t ctfa_magic;
  uint64_t ctfa_model;
  uint64_t ctfa_nfiles;
  uint64_t ctfa_names;
  uint64_t ctfa_ctfs;
};

typedef struct ctf_archive_modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
} ctf_archive_modent_t;
@end verbatim

(Note one irregularity here: the @code{ctf_archive_t} is not a typedef
to @code{struct ctf_archive}, but a different typedef, private to
@code{libctf}, so that things that are not really archives can be made
to appear as if they were.)

All the above items are always in little-endian byte order, regardless
of the machine endianness.

The archive header has the following fields:

@tindex struct ctf_archive
@multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t ctfa_magic}
@vindex ctfa_magic
@vindex struct ctf_archive, ctfa_magic
@tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb.
@tindex CTFA_MAGIC

@item 0x08
@tab @code{uint64_t ctfa_model}
@vindex ctfa_model
@vindex struct ctf_archive, ctfa_model
@tab The data model for this archive: an arbitrary integer that serves no
purpose but to be handed back by the libctf API. @xref{Data models}.

@item 0x10
@tab @code{uint64_t ctfa_nfiles}
@vindex ctfa_nfiles
@vindex struct ctf_archive, ctfa_nfiles
@tab The number of CTF dictionaries in this archive.

@item 0x18
@tab @code{uint64_t ctfa_names}
@vindex ctfa_names
@vindex struct ctf_archive, ctfa_names
@tab Offset of the name table, in bytes from the start of the archive.
The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}.

@item 0x20
@tab @code{uint64_t ctfa_ctfs}
@vindex ctfa_ctfs
@vindex struct ctf_archive, ctfa_ctfs
@tab Offset of the CTF table. Each element starts with a @code{uint64_t} size,
followed by a CTF dictionary.

@end multitable

The array pointed to by @code{ctfa_names} is an array of entries of
@code{ctf_archive_modent}:

@tindex struct ctf_archive_modent
@tindex ctf_archive_modent_t
@multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t name_offset}
@vindex name_offset
@vindex struct ctf_archive_modent, name_offset
@vindex ctf_archive_modent_t, name_offset
@tab Offset of this name, in bytes from the start of the archive.

@item 0x08
@tab @code{uint64_t ctf_offset}
@vindex ctf_offset
@vindex struct ctf_archive_modent, ctf_offset
@vindex ctf_archive_modent_t, ctf_offset
@tab Offset of this CTF dictionary, in bytes from the start of the archive.

@end multitable

The @code{ctfa_names} array is sorted into ASCIIbetical order by name
(i.e. by the result of dereferencing the @code{name_offset}).
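
As a rough illustration of the header layout above, the following sketch
decodes the five little-endian fields from a byte buffer and checks the
magic number. The names @code{struct archive_header}, @code{read_le64}
and @code{parse_archive_header} are invented for this example: they are
not part of @file{ctf.h} or @code{libctf}, and the sketch does not try
to interpret the name table or the CTF table.

@verbatim
#include <stddef.h>
#include <stdint.h>

#define CTFA_MAGIC 0x8b47f2a4d7623eebULL

/* Host-order copy of the archive header fields (hypothetical type).  */
struct archive_header
{
  uint64_t magic;
  uint64_t model;
  uint64_t nfiles;
  uint64_t names;
  uint64_t ctfs;
};

/* Decode a little-endian uint64_t, whatever the host endianness.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;

  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Fill in *OUT from the first 0x28 bytes of BUF.  Returns 0 on
   success, -1 if BUF is too short or the magic number is wrong.  */
static int
parse_archive_header (const unsigned char *buf, size_t len,
                      struct archive_header *out)
{
  if (len < 0x28)
    return -1;

  out->magic = read_le64 (buf + 0x00);
  out->model = read_le64 (buf + 0x08);
  out->nfiles = read_le64 (buf + 0x10);
  out->names = read_le64 (buf + 0x18);
  out->ctfs = read_le64 (buf + 0x20);

  return out->magic == CTFA_MAGIC ? 0 : -1;
}
@end verbatim

Decoding byte by byte, rather than casting the buffer to
@code{struct ctf_archive}, keeps the sketch independent of the host's
endianness and alignment rules.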

The archive file also contains a name table and a table of CTF
dictionaries: these are pointed to by the structures above. The name
table is a simple strtab which is not required to be sorted; the
dictionary array is described above in the entry for @code{ctfa_ctfs}.

The relative order of these various parts is not defined, except that
the header naturally always comes first.

@node CTF dictionaries
@chapter CTF dictionaries
@cindex dictionary, CTF dictionary

CTF dictionaries consist of a header, starting with a preamble, and a
number of sections.

@node CTF Preamble
@section CTF Preamble

The preamble is the only part of the CTF dictionary whose format cannot
vary between versions. It is never compressed. It is correspondingly
simple:

@verbatim
typedef struct ctf_preamble
{
  unsigned short ctp_magic;
  unsigned char ctp_version;
  unsigned char ctp_flags;
} ctf_preamble_t;
@end verbatim

@code{#define}s are provided under the names @code{cth_magic},
@code{cth_version} and @code{cth_flags} to make the fields of the
@code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so
consuming programs rarely need to consider the existence of the preamble
as a separate structure.

@tindex struct ctf_preamble
@tindex ctf_preamble_t
@multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{unsigned short ctp_magic}
@vindex ctp_magic
@vindex cth_magic
@vindex ctf_preamble_t, ctp_magic
@vindex struct ctf_preamble, ctp_magic
@vindex ctf_header_t, cth_magic
@vindex struct ctf_header, cth_magic
@tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2.
@tindex CTF_MAGIC

@item 0x02
@tab @code{unsigned char ctp_version}
@vindex ctp_version
@vindex cth_version
@vindex ctf_preamble_t, ctp_version
@vindex struct ctf_preamble, ctp_version
@vindex ctf_header_t, cth_version
@vindex struct ctf_header, cth_version
@tab The version number of this CTF dictionary.

@item 0x03
@tab @code{unsigned char ctp_flags}
@vindex ctp_flags
@vindex cth_flags
@vindex ctf_preamble_t, ctp_flags
@vindex struct ctf_preamble, ctp_flags
@vindex ctf_header_t, cth_flags
@vindex struct ctf_header, cth_flags
@tab Flags for this CTF file. @xref{CTF file-wide flags}.
@end multitable

@cindex alignment
Every element of a dictionary must be naturally aligned unless otherwise
specified. (This restriction will be lifted in later versions.)

@cindex endianness
CTF dictionaries are stored in the native endianness of the system that
generates them: the consumer (e.g., @code{libctf}) can detect whether to
endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it
appears as 0xf2df, endian-flipping is needed.)

The version of the CTF dictionary can be determined by inspecting
@code{ctp_version}. The following versions are currently valid, and
@code{libctf} can read all of them:

@tindex CTF_VERSION_3
@cindex CTF versions, versions
@multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.}
@headitem Version @tab Number @tab Description
@item @code{CTF_VERSION_1}
@tab 1 @tab First version, rare. Very similar to Solaris CTF.

@item @code{CTF_VERSION_1_UPGRADED_3}
@tab 2 @tab First version, upgraded to v3 or higher and written out again.
Name may change. Very rare.

@item @code{CTF_VERSION_2}
@tab 3 @tab Second version, with many range limits lifted.

@item @code{CTF_VERSION_3}
@tab 4 @tab Third and current version, documented here.
@end multitable

This section documents @code{CTF_VERSION_3}.

@vindex ctp_flags
@node CTF file-wide flags
@subsection CTF file-wide flags

The preamble contains bitflags in its @code{ctp_flags} field that
describe various file-wide properties. Some of the flags are valid only
for particular file-format versions, which means the flags can be used
to fix file-format bugs. Consumers that see unknown flags should
accordingly assume that the dictionary is not comprehensible, and
refuse to open it.

The following flags are currently defined. Many are bug workarounds,
valid only in CTFv3, and will not be valid in any future versions: the
same values may be reused for other flags in v4+.

@multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the}
@headitem Flag @tab Versions @tab Value @tab Meaning
@tindex CTF_F_COMPRESS
@item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib
@tindex CTF_F_NEWFUNCINFO
@item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2
@tab ``New-format'' func info section.
@tindex CTF_F_IDXSORTED
@item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is
in sorted order
@tindex CTF_F_DYNSTR
@item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is
in @code{.dynstr} and the symtab used is @code{.dynsym}.
@xref{The string section}.
@end multitable

@code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the
function info and data object sections. @xref{The symtypetab sections}.

Further flags (and further compression methods) will be added in future.

@node CTF header
@section CTF header
@cindex CTF header
@cindex Sections, header

The CTF header is the first part of a CTF dictionary, including the
preamble.
All parts of it other than the preamble (@pxref{CTF Preamble})
can vary between CTF file versions and are never compressed. It
contains things that apply to the dictionary as a whole, and a table of
the sections into which the rest of the dictionary is divided. The
sections tile the file: each section runs from the offset given until
the start of the next section. Only the last section cannot follow this
rule, so the header has a length for it instead.

All section offsets, here and in the rest of the CTF file, are relative to the
@emph{end} of the header. (This is annoyingly different to how offsets in CTF
archives are handled.)

This is the first structure to include offsets into the string table, which are
not straight references because CTF dictionaries can include references into the
ELF string table to save space, as well as into the string table internal to the
CTF dictionary. @xref{The string section}, for more on these. Offset 0 is
always the null string.

@verbatim
typedef struct ctf_header
{
  ctf_preamble_t cth_preamble;
  uint32_t cth_parlabel;
  uint32_t cth_parname;
  uint32_t cth_cuname;
  uint32_t cth_lbloff;
  uint32_t cth_objtoff;
  uint32_t cth_funcoff;
  uint32_t cth_objtidxoff;
  uint32_t cth_funcidxoff;
  uint32_t cth_varoff;
  uint32_t cth_typeoff;
  uint32_t cth_stroff;
  uint32_t cth_strlen;
} ctf_header_t;
@end verbatim

In detail:

@tindex struct ctf_header
@tindex ctf_header_t
@multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{ctf_preamble_t cth_preamble}
@vindex cth_preamble
@vindex struct ctf_header, cth_preamble
@vindex ctf_header_t, cth_preamble
@tab The preamble (conceptually embedded in the header).
@xref{CTF Preamble}.

@item 0x04
@tab @code{uint32_t cth_parlabel}
@vindex cth_parlabel
@vindex struct ctf_header, cth_parlabel
@vindex ctf_header_t, cth_parlabel
@tab The parent label, if deduplication happened against a specific label: a
strtab offset. @xref{The label section}. Currently unused and always 0, but may
be used in future when semantics are attached to the label section.

@item 0x08
@tab @code{uint32_t cth_parname}
@vindex cth_parname
@vindex struct ctf_header, cth_parname
@vindex ctf_header_t, cth_parname
@tab The name of the parent dictionary deduplicated against: a strtab offset.
Interpretation is up to the consumer (usually a CTF archive member name). 0
(the null string) if this is not a child dictionary.

@item 0x0c
@tab @code{uint32_t cth_cuname}
@vindex cth_cuname
@vindex struct ctf_header, cth_cuname
@vindex ctf_header_t, cth_cuname
@tab The name of the compilation unit, for consumers like GDB that want to
know which CU a single-CU dictionary describes: a strtab offset. 0 if this
dictionary describes types from many CUs.

@item 0x10
@tab @code{uint32_t cth_lbloff}
@vindex cth_lbloff
@vindex struct ctf_header, cth_lbloff
@vindex ctf_header_t, cth_lbloff
@tab The offset of the label section, which tiles the type space into
named regions. @xref{The label section}.

@item 0x14
@tab @code{uint32_t cth_objtoff}
@vindex cth_objtoff
@vindex struct ctf_header, cth_objtoff
@vindex ctf_header_t, cth_objtoff
@tab The offset of the data object symtypetab section, which maps ELF data symbols to
types. @xref{The symtypetab sections}.

@item 0x18
@tab @code{uint32_t cth_funcoff}
@vindex cth_funcoff
@vindex struct ctf_header, cth_funcoff
@vindex ctf_header_t, cth_funcoff
@tab The offset of the function info symtypetab section, which maps ELF function
symbols to a return type and arg types.
@xref{The symtypetab sections}.

@item 0x1c
@tab @code{uint32_t cth_objtidxoff}
@vindex cth_objtidxoff
@vindex struct ctf_header, cth_objtidxoff
@vindex ctf_header_t, cth_objtidxoff
@tab The offset of the object index section, which maps ELF object symbols to
entries in the data object section. @xref{The symtypetab sections}.

@item 0x20
@tab @code{uint32_t cth_funcidxoff}
@vindex cth_funcidxoff
@vindex struct ctf_header, cth_funcidxoff
@vindex ctf_header_t, cth_funcidxoff
@tab The offset of the function info index section, which maps ELF function
symbols to entries in the function info section. @xref{The symtypetab sections}.

@item 0x24
@tab @code{uint32_t cth_varoff}
@vindex cth_varoff
@vindex struct ctf_header, cth_varoff
@vindex ctf_header_t, cth_varoff
@tab The offset of the variable section, which maps string names to types.
@xref{The variable section}.

@item 0x28
@tab @code{uint32_t cth_typeoff}
@vindex cth_typeoff
@vindex struct ctf_header, cth_typeoff
@vindex ctf_header_t, cth_typeoff
@tab The offset of the type section, the core of CTF, which describes types
using variable-length array elements. @xref{The type section}.

@item 0x2c
@tab @code{uint32_t cth_stroff}
@vindex cth_stroff
@vindex struct ctf_header, cth_stroff
@vindex ctf_header_t, cth_stroff
@tab The offset of the string section. @xref{The string section}.

@item 0x30
@tab @code{uint32_t cth_strlen}
@vindex cth_strlen
@vindex struct ctf_header, cth_strlen
@vindex ctf_header_t, cth_strlen
@tab The length of the string section (not an offset!). The CTF file ends
at this point.

@end multitable

Everything from this point on (until the end of the file at @code{cth_stroff} +
@code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in
the preamble's @code{ctp_flags}.
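
The tiling rule can be sketched in a few lines of C. This is only an
illustration, under the assumption that the sections appear in the file
in the same order as their offset fields appear in the header, and that
the offsets have already been converted to host byte order. The names
@code{N_SECTIONS}, @code{HEADER_SIZE}, @code{section_lengths} and
@code{section_file_pos} are hypothetical and do not exist in
@file{ctf.h}.

@verbatim
#include <stddef.h>
#include <stdint.h>

/* Number of sections with an offset field in the header:
   cth_lbloff .. cth_stroff.  (Hypothetical name.)  */
#define N_SECTIONS 8

/* Size of the v3 header: 4-byte preamble plus twelve uint32_t fields.  */
#define HEADER_SIZE 0x34

/* OFF holds cth_lbloff, cth_objtoff, cth_funcoff, cth_objtidxoff,
   cth_funcidxoff, cth_varoff, cth_typeoff and cth_stroff, in that
   order.  Each section runs up to the start of the next one; only the
   last (the string section) takes its length from cth_strlen.
   Returns 0 on success, -1 if the offsets are not ascending.  */
static int
section_lengths (const uint32_t off[N_SECTIONS], uint32_t cth_strlen,
                 uint32_t len[N_SECTIONS])
{
  for (int i = 0; i < N_SECTIONS - 1; i++)
    {
      if (off[i + 1] < off[i])
        return -1;
      len[i] = off[i + 1] - off[i];
    }
  len[N_SECTIONS - 1] = cth_strlen;
  return 0;
}

/* Section offsets are relative to the end of the header, so the
   position within an uncompressed dictionary is the offset plus the
   header size.  */
static size_t
section_file_pos (uint32_t off)
{
  return (size_t) HEADER_SIZE + off;
}
@end verbatim

A section whose length comes out as zero is simply absent. For a
compressed dictionary the same arithmetic applies to the data after
decompression, not to the bytes in the file.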

@node The type section
@section The type section
@cindex Type section
@cindex Sections, type

This section is the most important section in CTF, describing all the top-level
types in the program. It consists of an array of type structures, each of which
describes a type of some @dfn{kind}: each kind of type has some amount of
variable-length data associated with it (some kinds have none). The amount of
variable-length data associated with a given type can be determined by
inspecting the type, so the reading code can walk through the types in sequence
at opening time.

Each type structure is one of a set of overlapping structures in a discriminated
union of sorts: the variable-length data for each type immediately follows the
type's type structure. Here's the largest of the overlapping structures, which
is only needed for huge types and so is very rarely seen:

@verbatim
typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;
@end verbatim

Here's the much more common smaller form:

@verbatim
typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;
@end verbatim

If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}.
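
The choice between the two structures can be sketched as follows. This
is a minimal illustration, not the @code{libctf} implementation: the
structures are restated so the fragment stands alone, the helper
@code{type_size} is a hypothetical name, the fields are assumed to be
in host byte order already, and the result is only meaningful for kinds
that use @code{ctt_size} rather than @code{ctt_type}.

@verbatim
#include <stddef.h>
#include <stdint.h>

#define CTF_LSIZE_SENT 0xffffffffu

typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;

typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;

/* Return the size recorded for *TP, and store in *INCREMENT the
   number of bytes occupied by its fixed-size structure, i.e. the
   distance from TP to its variable-length data.  The large fields are
   only read when the sentinel says they are present.  */
static uint64_t
type_size (const ctf_type_t *tp, size_t *increment)
{
  if (tp->ctt_size == CTF_LSIZE_SENT)
    {
      *increment = sizeof (ctf_type_t);
      return ((uint64_t) tp->ctt_lsizehi << 32) | tp->ctt_lsizelo;
    }

  *increment = sizeof (ctf_stype_t);
  return tp->ctt_size;
}
@end verbatim

A reader walking the type section would advance by @code{*increment}
and then by the length of the kind-specific variable-length data to
reach the next type.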
@tindex CTF_LSIZE_SENT

Here's what the fields mean:

@tindex struct ctf_type
@tindex struct ctf_stype
@tindex ctf_type_t
@tindex ctf_stype_t
@multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctt_name}
@vindex ctt_name
@tab Strtab offset of the type name, if any (0 if none).

@item 0x04
@tab @code{uint32_t ctt_info}
@vindex ctt_info
@vindex struct ctf_type, ctt_info
@vindex ctf_type_t, ctt_info
@vindex struct ctf_stype, ctt_info
@vindex ctf_stype_t, ctt_info
@tab The @dfn{info word}, containing information on the kind of this type, its
variable-length data and whether it is visible to name lookup. @xref{The
info word}.

@item 0x08
@tab @code{uint32_t ctt_size}
@vindex ctt_size
@vindex struct ctf_type, ctt_size
@vindex ctf_type_t, ctt_size
@vindex struct ctf_stype, ctt_size
@vindex ctf_stype_t, ctt_size
@tab The size of this type, if this type is of a kind for which a size needs
to be recorded (constant-size types don't need one). If this is
@code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}.

@item 0x08
@tab @code{uint32_t ctt_type}
@vindex ctt_type
@vindex struct ctf_stype, ctt_type
@vindex ctf_stype_t, ctt_type
@tab The type this type refers to, if this type is of a kind which refers to
other types (like a pointer). All such types are fixed-size, and no types that
are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type}
overlap. All type kinds that use @code{ctt_type} are described by
@code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}.

@item 0x0c (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizehi}
@vindex ctt_lsizehi
@vindex struct ctf_type, ctt_lsizehi
@vindex ctf_type_t, ctt_lsizehi
@tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro
can be used to get a 64-bit size out of this field and the next one.
@code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again.
@findex CTF_TYPE_LSIZE
@findex CTF_SIZE_TO_LSIZE_HI

@item 0x10 (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizelo}
@vindex ctt_lsizelo
@vindex struct ctf_type, ctt_lsizelo
@vindex ctf_type_t, ctt_lsizelo
@tab The low 32 bits of the size of a very large type.
@code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size.
@findex CTF_SIZE_TO_LSIZE_LO
@end multitable

Two aspects of this need further explanation: the info word, and what exactly a
type ID is and how you determine it. (Information on the various
type-kind-dependent things, like whether @code{ctt_size} or @code{ctt_type} is used,
is described in the section devoted to each kind.)

@node The info word
@subsection The info word, ctt_info

The info word is a bitfield split into three parts. From MSB to LSB:

@multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).}
@headitem Bit offset @tab Name @tab Description
@item 26--31
@tab @code{kind}
@tab Type kind: @pxref{Type kinds}.

@item 25
@tab @code{isroot}
@tab 1 if this type is visible to name lookup

@item 0--24
@tab @code{vlen}
@tab Length of variable-length data for this type (some kinds only).
The variable-length data directly follows the @code{ctf_type_t} or
@code{ctf_stype_t}. This is a kind-dependent array length value,
not a length in bytes. Some kinds have no variable-length data, or
fixed-size variable-length data, and do not use this value.
@end multitable

The most mysterious of these is undoubtedly @code{isroot}. This indicates
whether types with names (nonzero @code{ctt_name}) are visible to name lookup:
if zero, this type is considered a @dfn{non-root type} and you can't look it up
by name at all. Multiple types with the same name in the same C namespace
(struct, union, enum, other) can exist in a single dictionary, but only one of
them may have a nonzero value for @code{isroot}. @code{libctf} validates this
at open time and refuses to open dictionaries that violate this constraint.

Historically, this feature was introduced for the encoding of bitfields
(@pxref{Integer types}): for instance, int bitfields will all be named
@code{int} with different widths or offsets, but only the full-width one at
offset zero is wanted when you look up the type named @code{int}. With the
introduction of slices (@pxref{Slices}) as a more general bitfield encoding
mechanism, this is less important, but we still use non-root types to handle
conflicts if the linker API is used to fuse multiple translation units into one
dictionary and those translation units contain types with the same name and
conflicting definitions. (We do not discuss this further here, because the
linker never does this: only specialized type mergers do, like that used for the
Linux kernel. The libctf documentation will describe this in more detail.)
@c XXX update when libctf docs are written.

The @code{CTF_TYPE_INFO} macro can be used to compose an info word from
a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND},
@code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again.
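
The bit layout in the table above can be expressed directly with shifts
and masks. The helpers below are a sketch written from that table
alone: the names @code{info_make}, @code{info_kind}, @code{info_isroot}
and @code{info_vlen} are invented here, and the real macros in
@file{ctf.h} should be preferred in actual consumers.

@verbatim
#include <stdint.h>

#define INFO_KIND_SHIFT 26          /* kind occupies bits 26--31 */
#define INFO_ISROOT_SHIFT 25        /* isroot is bit 25 */
#define INFO_VLEN_MASK 0x1ffffffu   /* vlen occupies bits 0--24 */

/* Compose an info word.  KIND must fit in 6 bits; any nonzero ISROOT
   sets the flag; VLEN is truncated to 25 bits.  */
static uint32_t
info_make (uint32_t kind, uint32_t isroot, uint32_t vlen)
{
  return ((kind & 0x3fu) << INFO_KIND_SHIFT)
    | ((isroot ? 1u : 0u) << INFO_ISROOT_SHIFT)
    | (vlen & INFO_VLEN_MASK);
}

static uint32_t
info_kind (uint32_t info)
{
  return info >> INFO_KIND_SHIFT;
}

static uint32_t
info_isroot (uint32_t info)
{
  return (info >> INFO_ISROOT_SHIFT) & 1u;
}

static uint32_t
info_vlen (uint32_t info)
{
  return info & INFO_VLEN_MASK;
}
@end verbatim

For instance, a root-visible type of kind 6 with three variable-length
elements would have the info word @code{info_make (6, 1, 3)}, which is
0x1a000003.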
@findex CTF_TYPE_INFO
@findex CTF_V2_INFO_KIND
@findex CTF_V2_INFO_ISROOT
@findex CTF_V2_INFO_VLEN

@node Type indexes and type IDs
@subsection Type indexes and type IDs
@cindex Type indexes
@cindex Type IDs
@cindex Type, IDs of
@cindex Type, indexes of
@cindex ctf_id_t

@cindex Parent range
@cindex Child range
@cindex Type IDs, ranges
Types are referred to within the CTF file via @dfn{type IDs}. A type ID is a
32-bit number, drawn from a space divided in half. Types @math{2^31-1}
and below are in the @dfn{parent range}: these IDs are used for dictionaries
that have not had any other dictionary @code{ctf_import}ed into them as a parent.
Both completely standalone dictionaries and parent dictionaries with children
hanging off them have types in this range. Types @math{2^31} and above are in
the @dfn{child range}: only types in child dictionaries are in this range.

These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but
the types themselves have no visible ID: quite intentionally, because adding an
ID uses space, and every ID is different so they don't compress well. The IDs
are implicit: at open time, the consumer walks through the entire type section
and counts the types in the type section. The type section is an array of
variable-length elements, so each entry could be considered as having an index,
starting from 1. We count these indexes and associate each with its
corresponding @code{ctf_type_t} or @code{ctf_stype_t}.

Lookups of types with IDs in the parent space look in the parent dictionary if
this dictionary has one associated with it; lookups of types with IDs in the
child space error out if the dictionary does not have a parent, and otherwise
convert the ID into an index by shaving off the top bit and look up the index
in the child.

These properties mean that the same dictionary can be used as a parent of child
dictionaries and can also be used directly with no children at all, but a
dictionary created as a child dictionary must always be associated with a parent
--- usually, the same parent --- because its references to its own types have
the high bit turned on and this is only flipped off again if this is a child
dictionary. (This is not a problem, because if you @emph{don't} associate the
child with a parent, any references within it to its parent types will fail, and
there are almost certain to be many such references, or why is it a child at
all?)

This does mean that consumers should keep a close eye on the distinction between
type IDs and type indexes: if you mix them up, everything will appear to work as
long as you're only using parent dictionaries or standalone dictionaries, but as
soon as you start using children, everything will fail horribly.

Type index zero, and type ID zero, are used to indicate that this type cannot be
represented in CTF as currently constituted: they are emitted by the compiler,
but all type chains that terminate in the unknown type are erased at link time
(structure fields that use them just vanish, etc). So you will probably never
see a use of type zero outside the symtypetab sections, where they serve as
sentinels of sorts, to indicate symbols with no associated type.

The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help
in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and
@code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the
parent or child range.
@findex CTF_V2_TYPE_TO_INDEX
@findex CTF_V2_INDEX_TO_TYPE
@findex CTF_V2_TYPE_ISPARENT
@findex CTF_V2_TYPE_ISCHILD

It is quite possible and indeed common for type IDs to point forward in the
dictionary, as well as backward.
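
The relationship between IDs and indexes described in this section can
be sketched as below. The helpers follow only the prose above (child
IDs have the top bit set, and the index is the ID with that bit shaved
off): their names are invented for this example, and they are not a
substitute for the @code{CTF_V2_*} macros in @file{ctf.h}, whose exact
boundary handling may differ.

@verbatim
#include <stdint.h>

/* Top bit of a 32-bit type ID: set for IDs in the child range.
   (Hypothetical name.)  */
#define CHILD_BIT 0x80000000u

/* Nonzero if ID lies in the child range (2^31 and above).  */
static int
id_is_child (uint32_t id)
{
  return (id & CHILD_BIT) != 0;
}

/* Convert a type ID to an index into the type section of the
   dictionary that owns it, by shaving off the top bit.  */
static uint32_t
id_to_index (uint32_t id)
{
  return id & ~CHILD_BIT;
}

/* Convert an index back to an ID.  CHILD is nonzero when the index
   belongs to a child dictionary.  */
static uint32_t
index_to_id (uint32_t index, int child)
{
  return child ? (index | CHILD_BIT) : index;
}
@end verbatim

With these definitions, index 5 in a child dictionary corresponds to
type ID 0x80000005, while index 5 in a parent or standalone dictionary
is simply type ID 5. Mixing the two up goes unnoticed until a child
dictionary is involved, which is the failure mode warned about above.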

@node Type kinds
@subsection Type kinds
@cindex Type kinds
@cindex Type, kinds of

Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type:
all structures are a single kind, as are all unions, all pointers, all arrays,
all integers regardless of their bitfield width, etc. The kind of a type is
given in the @code{kind} field of the @code{ctt_info} word
(@pxref{The info word}).

The space of type kinds is only a quarter full so far, so there is plenty of
room for expansion. It is likely that in future versions of the file format,
types with smaller kinds will be more efficiently encoded than types with larger
kinds, so their numerical value will actually start to matter in future. (So
these IDs will probably change their numerical values in a later release of this
format, to move more frequently-used kinds like structures and cv-quals towards
the top of the space, and move rarely-used kinds like integers downwards. Yes,
integers are rare: how many kinds of @code{int} are there in a program? They're
just very frequently @emph{referenced}.)

Here's the set of kinds so far. Each kind has a @code{#define} associated with
it, also given here.

@multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}}
@headitem Kind @tab Macro @tab Purpose
@item 0
@tab @code{CTF_K_UNKNOWN}
@tab Indicates a type that cannot be represented in CTF, or that is being skipped.
It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types
of kind @code{CTF_K_UNKNOWN}.
@tindex CTF_K_UNKNOWN

@item 1
@tab @code{CTF_K_INTEGER}
@tab An integer type. @xref{Integer types}.

@item 2
@tab @code{CTF_K_FLOAT}
@tab A floating-point type. @xref{Floating-point types}.

@item 3
@tab @code{CTF_K_POINTER}
@tab A pointer.
@xref{Pointers typedefs and cvr-quals}.

@item 4
@tab @code{CTF_K_ARRAY}
@tab An array. @xref{Arrays}.

@item 5
@tab @code{CTF_K_FUNCTION}
@tab A function pointer. @xref{Function pointers}.

@item 6
@tab @code{CTF_K_STRUCT}
@tab A structure. @xref{Structs and unions}.

@item 7
@tab @code{CTF_K_UNION}
@tab A union. @xref{Structs and unions}.

@item 8
@tab @code{CTF_K_ENUM}
@tab An enumerated type. @xref{Enums}.

@item 9
@tab @code{CTF_K_FORWARD}
@tab A forward. @xref{Forward declarations}.

@item 10
@tab @code{CTF_K_TYPEDEF}
@tab A typedef. @xref{Pointers typedefs and cvr-quals}.

@item 11
@tab @code{CTF_K_VOLATILE}
@tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}.

@item 12
@tab @code{CTF_K_CONST}
@tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}.

@item 13
@tab @code{CTF_K_RESTRICT}
@tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}.

@item 14
@tab @code{CTF_K_SLICE}
@tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}.
@end multitable

Now we cover all type kinds in turn. Some are more complicated than others.

@node Integer types
@subsection Integer types
@cindex Integer types
@cindex Types, integer
@tindex int
@tindex long
@tindex long long
@tindex short
@tindex char
@tindex bool
@tindex unsigned int
@tindex unsigned long
@tindex unsigned long long
@tindex unsigned short
@tindex unsigned char
@tindex signed int
@tindex signed long
@tindex signed long long
@tindex signed short
@tindex signed char
@cindex CTF_K_INTEGER

Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These
types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes
of the integral type in question.
They are always represented by
@code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one
@code{uint32_t} in length: @code{vlen} in the info word should be disregarded
and is always zero.

The variable-length data for integers has multiple items packed into it much
like the info word does.

@multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.}
@headitem Bit offset @tab Name @tab Description
@item 24--31
@tab Encoding
@tab The desired display representation of this integer. You can extract this
field with the @code{CTF_INT_ENCODING} macro. See below.
@findex CTF_INT_ENCODING

@item 16--23
@tab Offset
@tab The offset of this integral type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
extract this field with the @code{CTF_INT_OFFSET} macro.
@findex CTF_INT_OFFSET

@item 0--15
@tab Bit-width
@tab The width of this integral type in bits. You can extract this field with
the @code{CTF_INT_BITS} macro.
@findex CTF_INT_BITS
@end multitable

If you choose, bitfields can be represented using the fields above as a sort of
integral type with the @code{isroot} bit flipped off and the offset and bits
values set in the vlen word: you can populate it with the @code{CTF_INT_DATA}
macro. (But it may be more convenient to represent them using slices of a
full-width integer: @pxref{Slices}.)
@findex CTF_INT_DATA

Integers that are bitfields usually have a @code{ctt_size} rounded up to the
nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer
would have a @code{ctt_size} of 4). However, not all types are naturally
aligned on all architectures: packed structures may in theory use integral
bitfields with different @code{ctt_size}, though this is rarely observed.
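
As an illustration of the packing in the table above, the following sketch
assembles and takes apart the integer vlen word. The @code{sketch_int_}
functions are hypothetical helpers derived only from the bit offsets listed
above: they play the same role as the @code{CTF_INT_DATA},
@code{CTF_INT_ENCODING}, @code{CTF_INT_OFFSET} and @code{CTF_INT_BITS} macros
but are not their definitions.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Pack encoding (bits 24-31), offset (bits 16-23) and bit-width
   (bits 0-15) into one 32-bit vlen word.  Hypothetical helper.  */
static uint32_t
sketch_int_data (uint32_t encoding, uint32_t offset, uint32_t bits)
{
  assert (encoding <= 0xff && offset <= 0xff && bits <= 0xffff);
  return (encoding << 24) | (offset << 16) | bits;
}

/* Extract the encoding from a packed vlen word.  */
static uint32_t
sketch_int_encoding (uint32_t data)
{
  return data >> 24;
}

/* Extract the bit offset from a packed vlen word.  */
static uint32_t
sketch_int_offset (uint32_t data)
{
  return (data >> 16) & 0xff;
}

/* Extract the bit-width from a packed vlen word.  */
static uint32_t
sketch_int_bits (uint32_t data)
{
  return data & 0xffff;
}
```
@end verbatim

For example, a 17-bit field at bit offset 3 with encoding value 0x01 packs to
the word 0x01030011.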

The @dfn{encoding} for integers is a bitfield composed of the values below,
which consumers can use to decide how to display values of this type:

@multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned}
@headitem Value @tab Name @tab Description
@item 0x01
@tab @code{CTF_INT_SIGNED}
@tab If set, this is a signed int: if clear, unsigned.
@tindex CTF_INT_SIGNED

@item 0x02
@tab @code{CTF_INT_CHAR}
@tab If set, this is a char type. It is platform-dependent whether unadorned
@code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral
type suitable for the definition of @code{char} on this platform.
@tindex CTF_INT_CHAR
@findex CTF_CHAR

@item 0x04
@tab @code{CTF_INT_BOOL}
@tab If set, this is a boolean type. (It is theoretically possible to turn this
and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would
mean.)
@tindex CTF_INT_BOOL

@item 0x08
@tab @code{CTF_INT_VARARGS}
@tab If set, this is a varargs-promoted value in a K&R function definition.
This is not currently produced or consumed by anything that we know of: it is set
aside for future use.
@end multitable

The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported:
references to such types will be emitted as type 0.

@node Floating-point types
@subsection Floating-point types
@cindex Floating-point types
@cindex Types, floating-point
@tindex float
@tindex double
@tindex signed float
@tindex signed double
@tindex unsigned float
@tindex unsigned double
@tindex Complex, float
@tindex Complex, double
@tindex Complex, signed float
@tindex Complex, signed double
@tindex Complex, unsigned float
@tindex Complex, unsigned double
@cindex CTF_K_FLOAT

Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}.
Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t}
with the size in bytes of the floating-point type in question. They are always
represented by @code{ctf_stype_t}, never @code{ctf_type_t}.

This part of CTF shows many rough edges in the more obscure corners of
floating-point handling, and is likely to change in format v4.

The variable-length data for floats has multiple items packed into it just like
integers do:

@multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.}
@headitem Bit offset @tab Name @tab Description
@item 24--31
@tab Encoding
@tab The desired display representation of this float. You can extract this
field with the @code{CTF_FP_ENCODING} macro. See below.
@findex CTF_FP_ENCODING

@item 16--23
@tab Offset
@tab The offset of this floating-point type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
extract this field with the @code{CTF_FP_OFFSET} macro.
@findex CTF_FP_OFFSET

@item 0--15
@tab Bit-width
@tab The width of this floating-point type in bits. You can extract this field with
the @code{CTF_FP_BITS} macro.
@findex CTF_FP_BITS
@end multitable

The purpose of the floating-point offset and bit-width is somewhat opaque, since
there are no such things as floating-point bitfields in C: the bit-width should
be filled out with the full width of the type in bits, and the offset should
always be zero. It is likely that these fields will go away in the future. As
with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen
items from its component parts.
@findex CTF_FP_DATA

The @dfn{encoding} for floats is not a bitfield but a simple value indicating
the display representation.
Many of these are unused, relate to
Solaris-specific compiler extensions, and will be recycled in future: some are
unused and will become used in future.

@multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.}
@headitem Value @tab Name @tab Description
@item 1
@tab @code{CTF_FP_SINGLE}
@tab This is a single-precision IEEE 754 @code{float}.
@tindex CTF_FP_SINGLE
@item 2
@tab @code{CTF_FP_DOUBLE}
@tab This is a double-precision IEEE 754 @code{double}.
@tindex CTF_FP_DOUBLE
@item 3
@tab @code{CTF_FP_CPLX}
@tab This is a @code{Complex float}.
@tindex CTF_FP_CPLX
@item 4
@tab @code{CTF_FP_DCPLX}
@tab This is a @code{Complex double}.
@tindex CTF_FP_DCPLX
@item 5
@tab @code{CTF_FP_LDCPLX}
@tab This is a @code{Complex long double}.
@tindex CTF_FP_LDCPLX
@item 6
@tab @code{CTF_FP_LDOUBLE}
@tab This is a @code{long double}.
@tindex CTF_FP_LDOUBLE
@item 7
@tab @code{CTF_FP_INTRVL}
@tab This is a @code{float} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_INTRVL
@cindex Unused bits
@item 8
@tab @code{CTF_FP_DINTRVL}
@tab This is a @code{double} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_DINTRVL
@cindex Unused bits
@item 9
@tab @code{CTF_FP_LDINTRVL}
@tab This is a @code{long double} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_LDINTRVL
@cindex Unused bits
@item 10
@tab @code{CTF_FP_IMAGRY}
@tab This is the imaginary part of a @code{Complex float}. Not currently
generated. May change.
@tindex CTF_FP_IMAGRY
@cindex Unused bits
@item 11
@tab @code{CTF_FP_DIMAGRY}
@tab This is the imaginary part of a @code{Complex double}. Not currently
generated. May change.
@tindex CTF_FP_DIMAGRY
@cindex Unused bits
@item 12
@tab @code{CTF_FP_LDIMAGRY}
@tab This is the imaginary part of a @code{Complex long double}. Not currently
generated. May change.
@tindex CTF_FP_LDIMAGRY
@cindex Unused bits
@end multitable

The use of the complex floating-point encodings is obscure: it is possible that
@code{CTF_FP_CPLX} is meant to be used for only the real part of complex types,
and @code{CTF_FP_IMAGRY} et al for the imaginary part, but for now, we are
emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its
constituent parts. There appear to be no uses of these encodings anywhere, so
they are quite likely to change incompatibly in future.

@node Slices
@subsection Slices
@cindex Slices
@cindex Types, slices of integral
@tindex CTF_K_SLICE

Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not
directly correspond to any C type, but are a way to model other types in a more
convenient fashion for CTF generators.

A slice is like a pointer or other reference type in that it is always
represented by @code{ctf_stype_t}: but unlike pointers and other reference
types, it populates the @code{ctt_size} field just like integral types do, and
comes with an attached encoding and transforms the encoding of the underlying
type. The underlying type is described in the variable-length data, similarly
to structure and union fields: see below. Requests for the type size should
also chase down to the referenced type.

Slices are always nameless: @code{ctt_name} is always zero for them.

(The @code{libctf} API behaviour is unusual as well, and justifies the existence
of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the
underlying type kind, so that consumers never need to know about slices: they
can tell if an apparent integer is actually a slice if they need to by calling
@code{ctf_type_reference}, which will uniquely return the underlying integral
type rather than erroring out with @code{ECTF_NOTREF} if this is actually a
slice. So slices act just like an integer with an encoding, but more closely
mirror DWARF and other debugging information formats by allowing CTF file
creators to represent a bitfield as a slice of an underlying integral type.)
@findex Slices, effect on ctf_type_kind
@findex Slices, effect on ctf_type_reference
@findex libctf, effect of slices

The vlen in the info word for a slice should be ignored and is always zero. The
variable-length data for a slice is a single @code{ctf_slice_t}:

@verbatim
typedef struct ctf_slice
{
  uint32_t cts_type;
  unsigned short cts_offset;
  unsigned short cts_bits;
} ctf_slice_t;
@end verbatim

@tindex struct ctf_slice
@tindex ctf_slice_t
@multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cts_type}
@vindex cts_type
@vindex struct ctf_slice, cts_type
@vindex ctf_slice_t, cts_type
@tab The type this slice is a slice of. Must be an integral type (or a
floating-point type, but this nonsensical option will go away in v4.)

@item 0x4
@tab @code{unsigned short cts_offset}
@vindex cts_offset
@vindex struct ctf_slice, cts_offset
@vindex ctf_slice_t, cts_offset
@tab The offset of this integral type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. Identical
semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field
is much too long, because the maximum possible offset of an integral type would
easily fit in a char: this field is bigger just for the sake of alignment. This
will change in v4.

@item 0x6
@tab @code{unsigned short cts_bits}
@vindex cts_bits
@vindex struct ctf_slice, cts_bits
@vindex ctf_slice_t, cts_bits
@tab The bit-width of this integral type. Identical semantics to the
@code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is
really too large and will shrink in v4.
@end multitable

@node Pointers typedefs and cvr-quals
@subsection Pointers, typedefs, and cvr-quals
@cindex Pointers
@cindex Typedefs
@cindex cvr-quals
@tindex typedef
@tindex const
@tindex volatile
@tindex restrict
@tindex CTF_K_POINTER
@tindex CTF_K_TYPEDEF
@tindex CTF_K_CONST
@tindex CTF_K_VOLATILE
@tindex CTF_K_RESTRICT

Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict}
qualifiers are represented identically except for their type kind (though they
may be treated differently by consuming libraries like @code{libctf}, since
pointers affect assignment-compatibility in ways cvr-quals do not, and they may
have different alignment requirements, etc).

All of these are represented by @code{ctf_stype_t}, have no variable data at
all, and populate @code{ctt_type} with the type ID of the type they point
to.
These types can stack: a @code{CTF_K_RESTRICT} can point to a
@code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc.

All of these except @code{typedef}s are unnamed: their @code{ctt_name} is 0.

The size of @code{CTF_K_POINTER} is derived from the data model
(@pxref{Data models}), i.e. in practice, from the target machine ABI, and is not
explicitly represented. The size of other kinds in this set should be determined
by following @code{ctt_type} references as necessary until a type that is not a
typedef, const, volatile, or restrict is found, and using that type's size.

@node Arrays
@subsection Arrays
@cindex Arrays

Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}.
Both @code{ctt_size} and @code{ctt_type} for arrays are zero. The variable-length
data is a @code{ctf_array_t}: @code{vlen} in the info word should be disregarded
and is always zero.

@verbatim
typedef struct ctf_array
{
  uint32_t cta_contents;
  uint32_t cta_index;
  uint32_t cta_nelems;
} ctf_array_t;
@end verbatim

@tindex struct ctf_array
@tindex ctf_array_t
@multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cta_contents}
@vindex cta_contents
@vindex struct ctf_array, cta_contents
@vindex ctf_array_t, cta_contents
@tab The type of the array elements: a type ID.

@item 0x4
@tab @code{uint32_t cta_index}
@vindex cta_index
@vindex struct ctf_array, cta_index
@vindex ctf_array_t, cta_index
@tab The type of the array index: a type ID of an integral type.
If this is a variable-length array, the index type ID will be 0
(but the actual index type of this array is probably @code{int}).
Probably redundant and may be dropped in v4.

@item 0x8
@tab @code{uint32_t cta_nelems}
@vindex cta_nelems
@vindex struct ctf_array, cta_nelems
@vindex ctf_array_t, cta_nelems
@tab The number of array elements. 0 for VLAs, and also for
the historical variety of VLA which has explicit zero dimensions (which will
have a nonzero @code{cta_index}.)
@end multitable

The size of an array can be computed by simple multiplication of the size of the
@code{cta_contents} type by the @code{cta_nelems}.

@node Function pointers
@subsection Function pointers
@cindex Function pointers
@cindex Pointers, to functions

Function pointers are explicitly represented in the CTF type section by a type
of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The
@code{ctt_type} is the function return type ID. The @code{vlen} in the info
word is the number of arguments, each of which is a type ID, a @code{uint32_t}:
if the last argument is 0, this is a varargs function and the number of
arguments is one less than indicated by the vlen.

If the number of arguments is odd, a single @code{uint32_t} of padding is
inserted to maintain alignment.

@node Enums
@subsection Enums
@cindex Enums
@tindex enum
@tindex CTF_K_ENUM

Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a
@code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the
data model (enum bitfields are implemented via slices).
The @code{vlen} is a
count of enumerations, each of which is represented by a @code{ctf_enum_t} in
the vlen:

@verbatim
typedef struct ctf_enum
{
  uint32_t cte_name;
  int32_t cte_value;
} ctf_enum_t;
@end verbatim

@tindex struct ctf_enum
@tindex ctf_enum_t
@multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cte_name}
@vindex cte_name
@vindex struct ctf_enum, cte_name
@vindex ctf_enum_t, cte_name
@tab Strtab offset of the enumeration name. Must not be 0.

@item 0x4
@tab @code{int32_t cte_value}
@vindex cte_value
@vindex struct ctf_enum, cte_value
@vindex ctf_enum_t, cte_value
@tab The enumeration value.

@end multitable

Enumeration values larger than @math{2^32} are not yet supported and are omitted
from the enumeration. (v4 will lift this restriction by encoding the value
differently.)

Forward declarations of enums are not implemented with this kind:
@pxref{Forward declarations}.

Enumerated type names, as usual in C, go into their own namespace, and do not
conflict with non-enums, structs, or unions with the same name.

@node Structs and unions
@subsection Structs and unions
@cindex Structures
@cindex Unions
@tindex struct
@tindex union
@tindex CTF_K_STRUCT
@tindex CTF_K_UNION

Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and
@code{CTF_K_UNION}: their representation is otherwise identical, and it is
perfectly allowed for ``structs'' to contain overlapping fields etc, so we will
treat them together for the rest of this section.

They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to
@code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE}
(0xfffffffe).
@tindex CTF_MAX_SIZE

The vlen for structures and unions is a count of structure fields, but the type
used to represent a structure field (and thus the size of the variable-length
array element representing the type) depends on the size of the structure: truly
huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a
different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are
vanishingly rare: in v4, this representation will change somewhat for greater
compactness. It's inherited from v1, where the limits were much lower.)
@tindex CTF_LSTRUCT_THRESH

Most structures can get away with using @code{ctf_member_t}:

@verbatim
typedef struct ctf_member_v2
{
  uint32_t ctm_name;
  uint32_t ctm_offset;
  uint32_t ctm_type;
} ctf_member_t;
@end verbatim

Huge structures that are represented by @code{ctf_type_t} rather than
@code{ctf_stype_t} have to use @code{ctf_lmember_t}, which splits the offset as
@code{ctf_type_t} splits the size:

@verbatim
typedef struct ctf_lmember_v2
{
  uint32_t ctlm_name;
  uint32_t ctlm_offsethi;
  uint32_t ctlm_type;
  uint32_t ctlm_offsetlo;
} ctf_lmember_t;
@end verbatim

Here's what the fields of @code{ctf_member_t} mean:

@tindex struct ctf_member_v2
@tindex ctf_member_t
@multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctm_name}
@vindex ctm_name
@vindex struct ctf_member_v2, ctm_name
@vindex ctf_member_t, ctm_name
@tab Strtab offset of the field name.

@item 0x04
@tab @code{uint32_t ctm_offset}
@vindex ctm_offset
@vindex struct ctf_member_v2, ctm_offset
@vindex ctf_member_t, ctm_offset
@tab The offset of this field @emph{in bits}.
(Usually, for bitfields, this is
machine-word-aligned and the individual field has an offset in bits, but
the format allows for the offset to be encoded in bits here.)

@item 0x08
@tab @code{uint32_t ctm_type}
@vindex ctm_type
@vindex struct ctf_member_v2, ctm_type
@vindex ctf_member_t, ctm_type
@tab The type ID of the type of the field.
@end multitable

Here's what the fields of the very similar @code{ctf_lmember_t} mean:

@tindex struct ctf_lmember_v2
@tindex ctf_lmember_t
@multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctlm_name}
@vindex ctlm_name
@vindex struct ctf_lmember_v2, ctlm_name
@vindex ctf_lmember_t, ctlm_name
@tab Strtab offset of the field name.

@item 0x04
@tab @code{uint32_t ctlm_offsethi}
@vindex ctlm_offsethi
@vindex struct ctf_lmember_v2, ctlm_offsethi
@vindex ctf_lmember_t, ctlm_offsethi
@tab The high 32 bits of the offset of this field in bits.

@item 0x08
@tab @code{uint32_t ctlm_type}
@vindex ctlm_type
@vindex struct ctf_lmember_v2, ctlm_type
@vindex ctf_lmember_t, ctlm_type
@tab The type ID of the type of the field.

@item 0x0c
@tab @code{uint32_t ctlm_offsetlo}
@vindex ctlm_offsetlo
@vindex struct ctf_lmember_v2, ctlm_offsetlo
@vindex ctf_lmember_t, ctlm_offsetlo
@tab The low 32 bits of the offset of this field in bits.
@end multitable

Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
@code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the
@code{ctlm_offsethi} and @code{ctlm_offsetlo} fields, much as with the split
size fields in @code{ctf_type_t}.
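
The split offset can be illustrated with a short sketch. The structure layout
is the one given above; the two @code{sketch_lmem_} functions are hypothetical
helpers that show the arithmetic involved, and are not the definitions of the
@code{CTF_LMEM_OFFSET} and @code{CTF_OFFSET_TO_LMEM} macros.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as given in this section.  */
typedef struct ctf_lmember_v2
{
  uint32_t ctlm_name;
  uint32_t ctlm_offsethi;
  uint32_t ctlm_type;
  uint32_t ctlm_offsetlo;
} ctf_lmember_t;

/* Hypothetical helper: reassemble the 64-bit offset (in bits).  */
static uint64_t
sketch_lmem_offset (const ctf_lmember_t *m)
{
  assert (m != NULL);
  return ((uint64_t) m->ctlm_offsethi << 32) | m->ctlm_offsetlo;
}

/* Hypothetical helper: split a 64-bit offset (in bits) into both halves.  */
static void
sketch_lmem_set_offset (ctf_lmember_t *m, uint64_t bit_offset)
{
  assert (m != NULL);
  m->ctlm_offsethi = (uint32_t) (bit_offset >> 32);
  m->ctlm_offsetlo = (uint32_t) (bit_offset & UINT32_C (0xffffffff));
}
```
@end verbatim

Storing the bit offset 0x123456789 this way puts 0x1 in @code{ctlm_offsethi}
and 0x23456789 in @code{ctlm_offsetlo}.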

Unnamed structure and union fields are simply implemented by collapsing the
unnamed field's members into the containing structure or union: this does mean
that a structure containing an unnamed union can end up being a ``structure''
with multiple members at the same offset. (A future format revision may
collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and
decide among them based on whether their members do in fact overlap.)

Structure and union type names, as usual in C, go into their own namespace,
just as enum type names do.

Forward declarations of structures and unions are not implemented with this
kind: @pxref{Forward declarations}.

@node Forward declarations
@subsection Forward declarations
@cindex Forwards
@tindex enum
@tindex struct
@tindex union
@tindex CTF_K_FORWARD

When the compiler encounters a forward declaration of a struct, union, or enum,
it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a
non-forward declaration of the same thing, it marks the forward as
non-root-visible: before link time, therefore, non-root-visible forwards
indicate that a non-forward is coming.

After link time, forwards are fused with their corresponding non-forwards by the
deduplicator where possible. They are kept if there is no non-forward
definition (maybe it's not visible from any TU at all) or if @emph{multiple}
conflicting structures with the same name might match them. All other
forwards are converted to structures, unions, or enums as appropriate, even
across TUs if only one structure could correspond to the forward (after all,
all types across all TUs land in the same dictionary unless they conflict,
so promoting forwards to their concrete type seems most helpful).

A forward has a rather strange representation: it is encoded with a
@code{ctf_stype_t} but the @code{ctt_type} is populated not with a type (if it's
a forward, we don't have an underlying type yet: if we did, we'd have promoted
it and this wouldn't be a forward any more) but with the @code{kind} of the
forward. This means that we can distinguish forwards to structs, enums and
unions reliably and ensure they land in the appropriate namespace even before
the actual struct, union or enum is found.

@node The symtypetab sections
@section The symtypetab sections
@cindex Symtypetab section
@cindex Sections, symtypetab
@cindex Function info section
@cindex Sections, function info
@cindex Data object section
@cindex Sections, data object
@cindex Function info index section
@cindex Sections, function info index
@cindex Data object index section
@cindex Sections, data object index
@tindex CTF_F_IDXSORTED
@tindex CTF_F_DYNSTR
@cindex Bug workarounds, CTF_F_DYNSTR

These are two very simple sections with identical formats, used by consumers to
map from ELF function and data symbols directly to their types. So they are
usually populated only in CTF sections that are embedded in ELF objects.

Their format is very simple: an array of type IDs. Which symbol each type ID
corresponds to depends on whether the optional @emph{index section} associated
with this symtypetab section has any content.

If the index section is nonempty, it is an array of @code{uint32_t} string table
offsets, each giving the name of the symbol whose type is at the same offset in
the corresponding non-index section: users can look up symbols in such a table
by name.
The index section and corresponding symtypetab section are usually
ASCIIbetically sorted (indicated by the @code{CTF_F_IDXSORTED} flag in the
header): if so, the index section can be bsearched for a symbol name rather than
having to use a slower linear search.

If the data object index section is empty, the entries in the data object and
function info sections are associated 1:1 with ELF symbols of type
@code{STT_OBJECT} (for data object) or @code{STT_FUNC} (for function info) with
a nonzero value: the linker shuffles the symtypetab sections to correspond with
the order of the symbols in the ELF file. Symbols with no name, undefined
symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped
and never appear in either section. Symbols that have no corresponding type are
represented by type ID 0. The section may have fewer entries than the symbol
table, in which case no later entries have associated types. This format is
more compact than an indexed form if most entries have types (since there is no
need to record any symbol names), but if the producer and consumer disagree even
slightly about which symbols are omitted, the types of all further symbols will
be wrong!

The compiler always emits indexed symtypetab tables, because there is no symbol
table yet. The linker will always have to read them all in and always works
through them from start to end, so there is no benefit having the compiler sort
them either. The linker (actually, @code{libctf}'s linking machinery) will
automatically sort unsorted indexed sections, and convert indexed sections that
contain a lot of pads into the more compact, unindexed form.

If child dicts are in use, only symbols that use types actually mentioned in the
child appear in the child's symtypetab: symbols that use only types in the
parent appear in the parent's symtypetab instead.
So the child's symtypetab will
almost always be very sparse, and thus will usually use the indexed form even in
fully linked objects. (It is, of course, impossible for symbols to exist that
use types from multiple child dicts at once, since it's impossible to declare a
function in C that uses types that are only visible in two different, disjoint
translation units.)

@node The variable section
@section The variable section
@cindex Variable section
@cindex Sections, variable

The variable section is a simple array mapping names (strtab entries) to type
IDs, intended to provide a replacement for the data object section in dynamic
situations in which there is no static ELF strtab but the consumer instead hands
back names. The section is sorted into ASCIIbetical order by name for rapid
lookup, like the CTF archive name table.

The section is an array of these structures:

@verbatim
typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;
@end verbatim

@tindex struct ctf_varent
@tindex ctf_varent_t
@multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctv_name}
@vindex ctv_name
@vindex struct ctf_varent, ctv_name
@vindex ctf_varent_t, ctv_name
@tab Strtab offset of the name

@item 0x04
@tab @code{uint32_t ctv_type}
@vindex ctv_type
@vindex struct ctf_varent, ctv_type
@vindex ctf_varent_t, ctv_type
@tab Type ID of the variable's type
@end multitable

There is no analogue of the function info section yet: v4 will probably drop
this section in favour of a way to put both indexed (thus, named) and nonindexed
symbols into the symtypetab sections at the same time.
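
As a rough illustration of the lookup that a name-sorted variable section
permits, the following sketch binary-searches an array of @code{ctf_varent_t}
by name. The names @code{lookup_variable}, @code{struct var_lookup} and
@code{varent_cmp} are hypothetical and are not part of @code{libctf}. The
sketch also assumes that every @code{ctv_name} refers to the internal strtab,
and it chooses to return type ID 0 when the name is absent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Entry layout as given in the variable section description.  */
typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;

/* Hypothetical search key: the wanted name plus the strtab needed to
   resolve each entry's ctv_name.  Not a libctf structure.  */
struct var_lookup
{
  const char *name;
  const char *strtab;
};

/* bsearch comparator: compare the wanted name against an entry's name.
   strcmp orders by byte value, matching an ASCIIbetical sort.  */
static int
varent_cmp (const void *key_, const void *ent_)
{
  const struct var_lookup *key = key_;
  const ctf_varent_t *ent = ent_;

  return strcmp (key->name, key->strtab + ent->ctv_name);
}

/* Return the type ID recorded for NAME, or 0 if NAME is not present.
   Assumes VARS is sorted by name and all names are internal strtab
   offsets.  */
uint32_t
lookup_variable (const ctf_varent_t *vars, size_t nvars,
                 const char *strtab, const char *name)
{
  struct var_lookup key = { name, strtab };
  const ctf_varent_t *ent;

  assert (strtab != NULL && name != NULL);
  ent = bsearch (&key, vars, nvars, sizeof (ctf_varent_t), varent_cmp);
  return ent != NULL ? ent->ctv_type : 0;
}
```

A caller would pass the start of the variable section, its entry count
(section size divided by @code{sizeof (ctf_varent_t)}) and the start of the
string section.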

@node The label section
@section The label section
@cindex Label section
@cindex Sections, label

The label section is a currently-unused facility allowing the tiling of the type
space with names taken from the strtab. The section is an array of these
structures:

@verbatim
typedef struct ctf_lblent
{
  uint32_t ctl_label;
  uint32_t ctl_type;
} ctf_lblent_t;
@end verbatim

@tindex struct ctf_lblent
@tindex ctf_lblent_t
@multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctl_label}
@vindex ctl_label
@vindex struct ctf_lblent, ctl_label
@vindex ctf_lblent_t, ctl_label
@tab Strtab offset of the label

@item 0x04
@tab @code{uint32_t ctl_type}
@vindex ctl_type
@vindex struct ctf_lblent, ctl_type
@vindex ctf_lblent_t, ctl_type
@tab Type ID of the last type covered by this label
@end multitable

Semantics will be attached to labels soon, probably in v4 (the plan is to use
them to allow multiple disjoint namespaces in a single CTF file, removing many
uses of CTF archives, in particular in the @code{.ctf} section in ELF objects).

@node The string section
@section The string section
@cindex String section
@cindex Sections, string

This section is a simple ELF-format strtab, starting with a zero byte (thus
ensuring that the string with offset 0 is the null string, as assumed elsewhere
in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve
compression efficiency.

Where the strtab is unusual is the @emph{references} to it.
CTF has two
string tables, the internal strtab and an external strtab associated
with the CTF dictionary at open time: usually, this is the ELF dynamic
strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We
distinguish between these strtabs by the most significant bit, bit 31,
of the 32-bit strtab references: if it is 0, the offset is in the
internal strtab: if 1, the offset is in the external strtab.

@tindex CTF_F_DYNSTR
@cindex Bug workarounds, CTF_F_DYNSTR
There is a bug workaround in this area: in format v3 (the first version
to have working support for external strtabs), the external strtab is
@code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the
dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a
header field that explicitly names the external strtab, making this flag
unnecessary.

@node Data models
@section Data models
@cindex Data models

The data model is a simple integer which indicates the ABI in use on this
platform. Right now, it is very simple, distinguishing only between 32- and
64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from
ABI integer to type sizes is hardwired into @code{libctf}: currently, we use
this to hardwire the size of pointers, function pointers, and enumerated types.

This is a very kludgy corner of CTF and will probably be replaced with explicit
header fields to record this sort of thing in future.
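
To make the idea of a hardwired mapping concrete, here is a minimal sketch of
how a consumer might turn the data model integer into sizes. It is not the
@code{libctf} implementation: @code{MODEL_ILP32}, @code{MODEL_LP64},
@code{struct model_sizes} and @code{model_lookup} are hypothetical names, and
the sketch assumes that enumerated types are the size of a 4-byte @code{int}
under both models.

```c
#include <assert.h>
#include <stddef.h>

/* Model numbers as given in the text; the names are hypothetical.  */
enum { MODEL_ILP32 = 1, MODEL_LP64 = 2 };

/* Hypothetical table of the sizes implied by each model, in bytes.  */
struct model_sizes
{
  int model;
  size_t pointer_size;	/* Also used for function pointers.  */
  size_t enum_size;	/* Assumed to be int-sized.  */
};

static const struct model_sizes models[] =
{
  { MODEL_ILP32, 4, 4 },
  { MODEL_LP64, 8, 4 },
};

/* Return the sizes for MODEL, or NULL if the model is not known.  */
const struct model_sizes *
model_lookup (int model)
{
  for (size_t i = 0; i < sizeof (models) / sizeof (models[0]); i++)
    if (models[i].model == model)
      return &models[i];
  return NULL;
}
```

Returning @code{NULL} for an unrecognised model lets the caller report an
error rather than silently guessing a pointer size.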

@node Limits of CTF
@section Limits of CTF
@cindex Limits

The following limits are imposed by various aspects of CTF version 3:

@table @code
@item CTF_MAX_TYPE
Maximum type identifier (maximum number of types accessible with parent and
child dictionaries in use): 0xfffffffe
@item CTF_MAX_PTYPE
Maximum type identifier in a parent dictionary: maximum number of types in any
one dictionary: 0x7fffffff
@item CTF_MAX_NAME
Maximum offset into a string table: 0x7fffffff
@item CTF_MAX_VLEN
Maximum number of members in a struct, union, or enum: maximum number of
function args: 0xffffff
@item CTF_MAX_SIZE
Maximum size of a @code{ctf_stype_t} in bytes before we fall back to
@code{ctf_type_t}: 0xfffffffe bytes
@end table

Other maxima without associated macros:
@itemize
@item
Maximum value of an enumerated type: 2^32
@item
Maximum size of an array element: 2^32
@end itemize

These maxima are generally considered to be too low, because C programs can and
do exceed them: they will be lifted in format v4.

@node Index
@unnumbered Index

@printindex cp

@bye