\input texinfo
@c Copyright (C) 1988-2024 Free Software Foundation, Inc.
@setfilename bfdint.info

@settitle BFD Internals
@iftex
@titlepage
@title{BFD Internals}
@author{Ian Lance Taylor}
@author{Cygnus Solutions}
@page
@end iftex

@copying
This file documents the internals of the BFD library.

Copyright @copyright{} 1988-2024 Free Software Foundation, Inc.
Contributed by Cygnus Support.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``GNU General Public License'' and ``Funding
Free Software'', the Front-Cover texts being (a) (see below), and with
the Back-Cover Texts being (b) (see below). A copy of the license is
included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software. Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@node Top
@top BFD Internals
@raisesections
@cindex bfd internals

This document describes some BFD internal information which may be
helpful when working on BFD. It is very incomplete.

This document is not updated regularly, and may be out of date.

The initial version of this document was written by Ian Lance Taylor
@email{ian@@cygnus.com}.

@menu
* BFD overview::                BFD overview
* BFD guidelines::              BFD programming guidelines
* BFD target vector::           BFD target vector
* BFD generated files::         BFD generated files
* BFD multiple compilations::   Files compiled multiple times in BFD
* BFD relocation handling::     BFD relocation handling
* BFD ELF support::             BFD ELF support
* BFD glossary::                Glossary
* Index::                       Index
@end menu

@node BFD overview
@section BFD overview

BFD is a library which provides a single interface to read and write
object files, executables, archive files, and core files in any format.

@menu
* BFD library interfaces::      BFD library interfaces
* BFD library users::           BFD library users
* BFD view::                    The BFD view of a file
* BFD blindness::               BFD loses information
@end menu

@node BFD library interfaces
@subsection BFD library interfaces

One way to look at the BFD library is to divide it into four parts by
type of interface.

The first interface is the set of generic functions which programs using
the BFD library will call. These generic functions normally translate
directly or indirectly into calls to routines which are specific to a
particular object file format. Many of these generic functions are
actually defined as macros in @file{bfd.h}. These functions comprise
the official BFD interface.

The second interface is the set of functions which appear in the target
vectors. This is the bulk of the code in BFD. A target vector is a set
of function pointers specific to a particular object file format. The
target vector is used to implement the generic BFD functions. These
functions are always called through the target vector, and are never
called directly. The target vector is described in detail in @ref{BFD
target vector}. The set of functions which appear in a particular
target vector is often referred to as a BFD backend.

The third interface is a set of oddball functions which are typically
specific to a particular object file format, are not generic functions,
and are called from outside of the BFD library. These are used as hooks
by the linker and the assembler when a particular object file format
requires some action which the BFD generic interface does not provide.
These functions are typically declared in @file{bfd.h}, but in many
cases they are only provided when BFD is configured with support for a
particular object file format. These functions live in a grey area, and
are not really part of the official BFD interface.

The fourth interface is the set of BFD support functions which are
called by the other BFD functions. These manage issues like memory
allocation, error handling, file access, hash tables, swapping, and the
like. These functions are never called from outside of the BFD library.

@node BFD library users
@subsection BFD library users

Another way to look at the BFD library is to divide it into three parts
by the manner in which it is used.

The first use is to read an object file. The object file readers are
programs like @samp{gdb}, @samp{nm}, @samp{objdump}, and @samp{objcopy}.
These programs use BFD to view an object file in a generic form. The
official BFD interface is normally fully adequate for these programs.

The second use is to write an object file. The object file writers are
programs like @samp{gas} and @samp{objcopy}. These programs use BFD to
create an object file. The official BFD interface is normally adequate
for these programs, but for some object file formats the assembler needs
some additional hooks in order to set particular flags or other
information. The official BFD interface includes functions to copy
private information from one object file to another, and these functions
are used by @samp{objcopy} to avoid information loss.

The third use is to link object files. There is only one object file
linker, @samp{ld}. Originally, @samp{ld} was an object file reader and
an object file writer, and it did the link operation using the generic
BFD structures. However, this turned out to be too slow and too memory
intensive.

The official BFD linker functions were written to permit specific BFD
backends to perform the link without translating through the generic
structures, in the normal case where all the input files and output file
have the same object file format. Not all of the backends currently
implement the new interface, and there are default linking functions
within BFD which use the generic structures and which work with all
backends.

For several object file formats the linker needs additional hooks which
are not provided by the official BFD interface, particularly for dynamic
linking support. These functions are typically called from the linker
emulation template.

@node BFD view
@subsection The BFD view of a file

BFD uses generic structures to manage information. It translates data
into the generic form when reading files, and out of the generic form
when writing files.

BFD describes a file as a pointer to the @samp{bfd} type. A @samp{bfd}
is composed of the following elements. The BFD information can be
displayed using the @samp{objdump} program with various options.

@table @asis
@item general information
The object file format, a few general flags, the start address.
@item architecture
The architecture, including both a general processor type (m68k, MIPS
etc.) and a specific machine number (m68000, R4000, etc.).
@item sections
A list of sections.
@item symbols
A symbol table.
@end table

BFD represents a section as a pointer to the @samp{asection} type. Each
section has a name and a size.
Most sections also have an associated
block of data, known as the section contents. Sections also have
associated flags, a virtual memory address, a load memory address, a
required alignment, a list of relocations, and other miscellaneous
information.

BFD represents a relocation as a pointer to the @samp{arelent} type. A
relocation describes an action which the linker must take to modify the
section contents. Relocations have a symbol, an address, an addend, and
a pointer to a howto structure which describes how to perform the
relocation. For more information, see @ref{BFD relocation handling}.

BFD represents a symbol as a pointer to the @samp{asymbol} type. A
symbol has a name, a pointer to a section, an offset within that
section, and some flags.

Archive files do not have any sections or symbols. Instead, BFD
represents an archive file as a file which contains a list of
@samp{bfd}s. BFD also provides access to the archive symbol map, as a
list of symbol names. BFD provides a function to return the @samp{bfd}
within the archive which corresponds to a particular entry in the
archive symbol map.

@node BFD blindness
@subsection BFD loses information

Most object file formats have information which BFD can not represent in
its generic form, at least as currently defined.

There is often explicit information which BFD can not represent. For
example, the COFF version stamp, or the ELF program segments. BFD
provides special hooks to handle this information when copying,
printing, or linking an object file. The BFD support for a particular
object file format will normally store this information in private data
and handle it using the special hooks.

In some cases there is also implicit information which BFD can not
represent.
For example, the MIPS processor distinguishes small and
large symbols, and requires that all small symbols be within 32K of the
GP register. This means that the MIPS assembler must be able to mark
variables as either small or large, and the MIPS linker must know to put
small symbols within range of the GP register. Since BFD can not
represent this information, this means that the assembler and linker
must have information that is specific to a particular object file
format which is outside of the BFD library.

This loss of information indicates areas where the BFD paradigm breaks
down. It is not actually possible to represent the myriad differences
among object file formats using a single generic interface, at least not
in the manner which BFD does it today.

Nevertheless, the BFD library does greatly simplify the task of dealing
with object files, and particular problems caused by information loss
can normally be solved using some sort of relatively constrained hook
into the library.



@node BFD guidelines
@section BFD programming guidelines
@cindex bfd programming guidelines
@cindex programming guidelines for bfd
@cindex guidelines, bfd programming

There is a lot of poorly written and confusing code in BFD. New BFD
code should be written to a higher standard. Merely because some BFD
code is written in a particular manner does not mean that you should
emulate it.

Here are some general BFD programming guidelines:

@itemize @bullet
@item
Follow the GNU coding standards.

@item
Avoid global variables. We ideally want BFD to be fully reentrant, so
that it can be used in multiple threads. All uses of global or static
variables interfere with that. Initialized constant variables are OK,
and they should be explicitly marked with @samp{const}. Instead of global
variables, use data attached to a BFD or to a linker hash table.

@item
All externally visible functions should have names which start with
@samp{bfd_}. All such functions should be declared in some header file,
typically @file{bfd.h}. See, for example, the various declarations near
the end of @file{bfd-in.h}, which mostly declare functions required by
specific linker emulations.

@item
All functions which need to be visible from one file to another within
BFD, but should not be visible outside of BFD, should start with
@samp{_bfd_}. Although external names beginning with @samp{_} are
prohibited by the ANSI standard, in practice this usage will always
work, and it is required by the GNU coding standards.

@item
Always remember that people can compile using @samp{--enable-targets} to
build several, or all, targets at once. It must be possible to link
together the files for all targets.

@item
BFD code should compile with few or no warnings using @samp{gcc -Wall}.
Some warnings are OK, like the absence of certain function declarations
which may or may not be declared in system header files. Warnings about
ambiguous expressions and the like should always be fixed.
@end itemize

@node BFD target vector
@section BFD target vector
@cindex bfd target vector
@cindex target vector in bfd

BFD supports multiple object file formats by using the @dfn{target
vector}. This is simply a set of function pointers which implement
behaviour that is specific to a particular object file format.

In this section I list all of the entries in the target vector and
describe what they do.
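
As a rough illustration of the idea, the following self-contained C
sketch models a target vector as a structure of function pointers, with
generic entry points written as macros that dispatch through it. It is
not the real BFD declaration: every name beginning with @samp{toy}
(@samp{toy_target}, @samp{toy_bfd}, @samp{toy_open}, the
@samp{toyelf_} functions) is hypothetical and exists only for this
sketch, and the real @samp{bfd_target} has far more entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_bfd;

/* Hypothetical miniature target vector: a name plus two entry points.  */
struct toy_target
{
  const char *name;
  int (*check_format) (const struct toy_bfd *);
  long (*get_symtab_upper_bound) (const struct toy_bfd *);
};

/* Hypothetical miniature BFD.  Each open file points at the vector
   for its object file format.  */
struct toy_bfd
{
  const struct toy_target *xvec;
  const char *magic;		/* First bytes of the file.  */
  long symcount;		/* Number of symbols in the file.  */
};

/* Backend functions for an invented "toy-elf" format.  */
static int
toyelf_check_format (const struct toy_bfd *abfd)
{
  return strncmp (abfd->magic, "\177ELF", 4) == 0;
}

static long
toyelf_get_symtab_upper_bound (const struct toy_bfd *abfd)
{
  /* One pointer per symbol plus a trailing NULL entry.  */
  return (abfd->symcount + 1) * (long) sizeof (void *);
}

static const struct toy_target toyelf_vec =
{
  "toy-elf",
  toyelf_check_format,
  toyelf_get_symtab_upper_bound
};

/* Attach a vector and some file data to a toy BFD.  */
static void
toy_open (struct toy_bfd *abfd, const struct toy_target *vec,
	  const char *magic, long symcount)
{
  assert (abfd != NULL && vec != NULL && magic != NULL);
  abfd->xvec = vec;
  abfd->magic = magic;
  abfd->symcount = symcount;
}

/* Generic interface: callers use these macros and never name a
   backend function directly.  */
#define toy_check_format(abfd) \
  ((*(abfd)->xvec->check_format) (abfd))
#define toy_get_symtab_upper_bound(abfd) \
  ((*(abfd)->xvec->get_symtab_upper_bound) (abfd))
```

A caller would fill in a @samp{toy_bfd} with @samp{toy_open} and then
use only the two macros. Supporting another format in this sketch means
defining another @samp{toy_target} instance, without changing any
caller.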

@menu
* BFD target vector miscellaneous::     Miscellaneous constants
* BFD target vector swap::              Swapping functions
* BFD target vector format::            Format type dependent functions
* BFD_JUMP_TABLE macros::               BFD_JUMP_TABLE macros
* BFD target vector generic::           Generic functions
* BFD target vector copy::              Copy functions
* BFD target vector core::              Core file support functions
* BFD target vector archive::           Archive functions
* BFD target vector symbols::           Symbol table functions
* BFD target vector relocs::            Relocation support
* BFD target vector write::             Output functions
* BFD target vector link::              Linker functions
* BFD target vector dynamic::           Dynamic linking information functions
@end menu

@node BFD target vector miscellaneous
@subsection Miscellaneous constants

The target vector starts with a set of constants.

@table @samp
@item name
The name of the target vector. This is an arbitrary string. This is
how the target vector is named in command-line options for tools which
use BFD, such as the @samp{--oformat} linker option.

@item flavour
A general description of the type of target. The following flavours are
currently defined:

@table @samp
@item bfd_target_unknown_flavour
Undefined or unknown.
@item bfd_target_aout_flavour
a.out.
@item bfd_target_coff_flavour
COFF.
@item bfd_target_ecoff_flavour
ECOFF.
@item bfd_target_elf_flavour
ELF.
@item bfd_target_tekhex_flavour
Tektronix hex format.
@item bfd_target_srec_flavour
Motorola S-record format.
@item bfd_target_ihex_flavour
Intel hex format.
@item bfd_target_som_flavour
SOM (used on HP-UX).
@item bfd_target_verilog_flavour
Verilog memory hex dump format.
@item bfd_target_msdos_flavour
MS-DOS.
@item bfd_target_evax_flavour
OpenVMS.
@item bfd_target_mmo_flavour
Donald Knuth's MMIXware object format.
@end table

@item byteorder
The byte order of data in the object file. One of
@samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
@samp{BFD_ENDIAN_UNKNOWN}. The latter would be used for a format such
as S-records which do not record the architecture of the data.

@item header_byteorder
The byte order of header information in the object file. Normally the
same as the @samp{byteorder} field, but there are certain cases where it
may be different.

@item object_flags
Flags which may appear in the @samp{flags} field of a BFD with this
format.

@item section_flags
Flags which may appear in the @samp{flags} field of a section within a
BFD with this format.

@item symbol_leading_char
A character which the C compiler normally puts before a symbol. For
example, an a.out compiler will typically generate the symbol
@samp{_foo} for a function named @samp{foo} in the C source, in which
case this field would be @samp{_}. If there is no such character, this
field will be @samp{0}.

@item ar_pad_char
The padding character to use at the end of an archive name. Normally
@samp{/}.

@item ar_max_namelen
The maximum length of a short name in an archive. Normally @samp{14}.

@item backend_data
A pointer to constant backend data. This is used by backends to store
whatever additional information they need to distinguish similar target
vectors which use the same sets of functions.
@end table

@node BFD target vector swap
@subsection Swapping functions

Every target vector has function pointers used for swapping information
in and out of the target representation. There are two sets of
functions: one for data information, and one for header information.
Each set has three sizes: 64-bit, 32-bit, and 16-bit. Each size has
three actual functions: put, get unsigned, and get signed.
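
To make the put, get unsigned, and get signed triple concrete, here is
a minimal C sketch of the 32-bit slot for a big-endian and a
little-endian target. The names (@samp{toy_swap32}, @samp{toy_getb32},
and so on) are invented for this sketch and are not the functions BFD
itself provides. A full set would repeat the same pattern for the
16-bit and 64-bit sizes, once for data and once for headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slot holding the three 32-bit swapping functions.  */
struct toy_swap32
{
  uint32_t (*get) (const unsigned char *);
  int32_t (*get_signed) (const unsigned char *);
  void (*put) (uint32_t, unsigned char *);
};

/* Reinterpret a 32-bit pattern as two's complement without relying
   on implementation-defined conversions.  */
static int32_t
toy_sign32 (uint32_t v)
{
  return (v & 0x80000000u) ? -(int32_t) (~v) - 1 : (int32_t) v;
}

/* Big-endian: most significant byte first.  */
static uint32_t
toy_getb32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
	 | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

static int32_t
toy_getb_signed_32 (const unsigned char *p)
{
  return toy_sign32 (toy_getb32 (p));
}

static void
toy_putb32 (uint32_t v, unsigned char *p)
{
  p[0] = (v >> 24) & 0xff;
  p[1] = (v >> 16) & 0xff;
  p[2] = (v >> 8) & 0xff;
  p[3] = v & 0xff;
}

/* Little-endian: least significant byte first.  */
static uint32_t
toy_getl32 (const unsigned char *p)
{
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
	 | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

static int32_t
toy_getl_signed_32 (const unsigned char *p)
{
  return toy_sign32 (toy_getl32 (p));
}

static void
toy_putl32 (uint32_t v, unsigned char *p)
{
  p[0] = v & 0xff;
  p[1] = (v >> 8) & 0xff;
  p[2] = (v >> 16) & 0xff;
  p[3] = (v >> 24) & 0xff;
}

static const struct toy_swap32 toy_big =
  { toy_getb32, toy_getb_signed_32, toy_putb32 };
static const struct toy_swap32 toy_little =
  { toy_getl32, toy_getl_signed_32, toy_putl32 };

/* Write a host value in target order and read it back.  */
static uint32_t
toy_roundtrip32 (const struct toy_swap32 *swap, uint32_t v)
{
  unsigned char buf[4];

  assert (swap != 0);
  swap->put (v, buf);
  return swap->get (buf);
}
```

Code that only holds a pointer to a @samp{toy_swap32} can read and
write target data without knowing the byte order, which is the point of
keeping these functions in the vector.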

These 18 functions are used to convert data between the host and target
representations.

@node BFD target vector format
@subsection Format type dependent functions

Every target vector has three arrays of function pointers which are
indexed by the BFD format type. The BFD format types are as follows:

@table @samp
@item bfd_unknown
Unknown format. Not used for anything useful.
@item bfd_object
Object file.
@item bfd_archive
Archive file.
@item bfd_core
Core file.
@end table

The three arrays of function pointers are as follows:

@table @samp
@item bfd_check_format
Check whether the BFD is of a particular format (object file, archive
file, or core file) corresponding to this target vector. This is called
by the @samp{bfd_check_format} function when examining an existing BFD.
If the BFD matches the desired format, this function will initialize any
format specific information such as the @samp{tdata} field of the BFD.
This function must be called before any other BFD target vector function
on a file opened for reading.

@item bfd_set_format
Set the format of a BFD which was created for output. This is called by
the @samp{bfd_set_format} function after creating the BFD with a
function such as @samp{bfd_openw}. This function will initialize format
specific information required to write out a file of
the given format. This function must be called before any other BFD
target vector function on a file opened for writing.

@item bfd_write_contents
Write out the contents of the BFD in the given format. This is called
by the @samp{bfd_close} function for a BFD opened for writing. This really
should not be an array selected by format type, as the
@samp{bfd_set_format} function provides all the required information.
In fact, BFD will fail if a different format is used when calling
through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
arrays; fortunately, since @samp{bfd_close} gets it right, this is a
difficult error to make.
@end table

@node BFD_JUMP_TABLE macros
@subsection @samp{BFD_JUMP_TABLE} macros
@cindex @samp{BFD_JUMP_TABLE}

Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
These macros take a single argument, which is a prefix applied to a set
of functions. The macros are then used to initialize the fields in the
target vector.

For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
and @samp{_bfd_reloc_type_lookup}. A reference like
@samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc. The
@samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
functions initialize the appropriate fields in the BFD target vector.

This is done because it turns out that many different target vectors can
share certain classes of functions. For example, archives are similar
on most platforms, so most target vectors can use the same archive
functions. Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
with the same argument, calling a set of functions which is defined in
@file{archive.c}.

Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
the description of the function pointers which it defines. The function
pointers will be described using the name without the prefix which the
@samp{BFD_JUMP_TABLE} macro defines. This name is normally the same as
the name of the field in the target vector structure. Any differences
will be noted.
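
The prefixing scheme can be sketched with ISO C token pasting. The
macro below, @samp{TOY_JUMP_TABLE_RELOCS}, is a stand-in written for
this example and is not copied from @file{bfd.h}; the structure
@samp{toy_relocs}, the helper @samp{toy_reloc_name}, and the simplified
function signatures are likewise assumptions. Only the three function
name suffixes are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice of a target vector holding the three relocation
   entries.  Note that the last field name lacks the "_bfd" that the
   function name suffix carries.  */
struct toy_relocs
{
  long (*get_reloc_upper_bound) (unsigned int);
  long (*canonicalize_reloc) (unsigned int);
  const char *(*reloc_type_lookup) (int);
};

/* Expand to three initializers, each formed by pasting the prefix
   onto a fixed suffix.  The order must match the field order.  */
#define TOY_JUMP_TABLE_RELOCS(NAME) \
  NAME##_get_reloc_upper_bound, \
  NAME##_canonicalize_reloc, \
  NAME##_bfd_reloc_type_lookup

/* Backend functions for an invented format with the prefix "foo".  */
static long
foo_get_reloc_upper_bound (unsigned int count)
{
  /* One pointer per relocation plus a trailing NULL entry.  */
  return ((long) count + 1) * (long) sizeof (void *);
}

static long
foo_canonicalize_reloc (unsigned int count)
{
  /* A real backend would fill in an array; the toy returns the count.  */
  return (long) count;
}

static const char *
foo_bfd_reloc_type_lookup (int code)
{
  return code == 0 ? "TOY_NONE" : NULL;
}

/* The macro reference supplies all three initializers at once.  */
static const struct toy_relocs foo_relocs =
{
  TOY_JUMP_TABLE_RELOCS (foo)
};

/* Look up a relocation name through the table.  */
static const char *
toy_reloc_name (const struct toy_relocs *table, int code)
{
  assert (table != NULL && table->reloc_type_lookup != NULL);
  return table->reloc_type_lookup (code);
}
```

A second format that shares these functions would write
@samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own initializer, while a
format with different relocation handling would pass a different
prefix.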

@node BFD target vector generic
@subsection Generic functions
@cindex @samp{BFD_JUMP_TABLE_GENERIC}

The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch-all
functions which don't easily fit into other categories.

@table @samp
@item _close_and_cleanup
Free any target specific information associated with the BFD that
isn't freed by @samp{_bfd_free_cached_info}. This is called when any
BFD is closed (the @samp{bfd_write_contents} function mentioned
earlier is only called for a BFD opened for writing). This function
pointer is typically set to @samp{_bfd_generic_close_and_cleanup},
which simply returns true.

@item _bfd_free_cached_info
This function is designed for use by the generic archive routines, and
is also called by @samp{bfd_close}. After the archive map has been
created, archive element bfds no longer need symbols and other
structures. Many targets
use @samp{bfd_alloc} to allocate target specific information and thus
do not need to do anything special for this entry point, and just set
it to @samp{_bfd_generic_free_cached_info} which throws away objalloc
memory for the bfd. Note that this means the bfd tdata and sections
are no longer available. Targets that malloc memory, attaching it to
the bfd tdata or to section @samp{used_by_bfd}, should implement a target
version of this function to free that memory before calling
@samp{_bfd_generic_free_cached_info}.

@item _new_section_hook
This is called from @samp{bfd_make_section_anyway} whenever a new
section is created. Most targets use it to initialize section specific
information. This function is called whether or not the section
corresponds to an actual section in an actual BFD.

@item _get_section_contents
Get the contents of a section. This is called from
@samp{bfd_get_section_contents}.
Most targets set this to
@samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
based on the section's @samp{filepos} field and a @samp{bfd_read}. The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents}.

@item _get_section_contents_in_window
Set a @samp{bfd_window} to hold the contents of a section. This is
called from @samp{bfd_get_section_contents_in_window}. The
@samp{bfd_window} idea never really caught on, and I don't think this is
ever called. Pretty much all targets implement this as
@samp{bfd_generic_get_section_contents_in_window}, which uses
@samp{bfd_get_section_contents} to do the right thing. The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents_in_window}.
@end table

@node BFD target vector copy
@subsection Copy functions
@cindex @samp{BFD_JUMP_TABLE_COPY}

The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
called when copying BFDs, and for a couple of functions which deal with
internal BFD information.

@table @samp
@item _bfd_copy_private_bfd_data
This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
If the input and output BFDs have the same format, this will copy any
private information over. This is called after all the section contents
have been written to the output file. Only a few targets do anything in
this function.

@item _bfd_merge_private_bfd_data
This is called when linking, via @samp{bfd_merge_private_bfd_data}. It
gives the backend linker code a chance to set any special flags in the
output file based on the contents of the input file. Only a few targets
do anything in this function.

@item _bfd_copy_private_section_data
This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
for each section, via @samp{bfd_copy_private_section_data}.
This
function is called before any section contents have been written. Only
a few targets do anything in this function.

@item _bfd_copy_private_symbol_data
This is called via @samp{bfd_copy_private_symbol_data}, but I don't
think anything actually calls it. If it were defined, it could be used
to copy private symbol data from one BFD to another. However, most BFDs
store extra symbol information by allocating space which is larger than
the @samp{asymbol} structure and storing private information in the
extra space. Since @samp{objcopy} and other programs copy symbol
information by copying pointers to @samp{asymbol} structures, the
private symbol information is automatically copied as well. Most
targets do not do anything in this function.

@item _bfd_set_private_flags
This is called via @samp{bfd_set_private_flags}. It is basically a hook
for the assembler to set magic information. For example, the PowerPC
ELF assembler uses it to set flags which appear in the e_flags field of
the ELF header. Most targets do not do anything in this function.

@item _bfd_print_private_bfd_data
This is called by @samp{objdump} when the @samp{-p} option is used. It
is called via @samp{bfd_print_private_data}. It prints any interesting
information about the BFD which can not be otherwise represented by BFD
and thus can not be printed by @samp{objdump}. Most targets do not do
anything in this function.
@end table

@node BFD target vector core
@subsection Core file support functions
@cindex @samp{BFD_JUMP_TABLE_CORE}

The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
with core files. Obviously, these functions only do something
interesting for targets which have core file support.

@table @samp
@item _core_file_failing_command
Given a core file, this returns the command which was run to produce the
core file.

@item _core_file_failing_signal
Given a core file, this returns the signal number which produced the
core file.

@item _core_file_matches_executable_p
Given a core file and a BFD for an executable, this returns whether the
core file was generated by the executable.
@end table

@node BFD target vector archive
@subsection Archive functions
@cindex @samp{BFD_JUMP_TABLE_ARCHIVE}

The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
with archive files. Most targets use COFF style archive files
(including ELF targets), and these use @samp{_bfd_archive_coff} as the
argument to @samp{BFD_JUMP_TABLE_ARCHIVE}. Some targets use BSD/a.out
style archives, and these use @samp{_bfd_archive_bsd}. (The main
difference between BSD and COFF archives is the format of the archive
symbol table). Targets with no archive support use
@samp{_bfd_noarchive}. Finally, a few targets have unusual archive
handling.

@table @samp
@item _slurp_armap
Read in the archive symbol table, storing it in private BFD data. This
is normally called from the archive @samp{check_format} routine. The
corresponding field in the target vector is named
@samp{_bfd_slurp_armap}.

@item _slurp_extended_name_table
Read in the extended name table from the archive, if there is one,
storing it in private BFD data. This is normally called from the
archive @samp{check_format} routine. The corresponding field in the
target vector is named @samp{_bfd_slurp_extended_name_table}.

@item construct_extended_name_table
Build and return an extended name table if one is needed to write out
the archive. This also adjusts the archive headers to refer to the
extended name table appropriately. This is normally called from the
archive @samp{write_contents} routine. The corresponding field in the
target vector is named @samp{_bfd_construct_extended_name_table}.

@item _truncate_arname
This copies a file name into an archive header, truncating it as
required. It is normally called from the archive @samp{write_contents}
routine. This function is more interesting in targets which do not
support extended name tables, but I think the GNU @samp{ar} program
always uses extended name tables anyhow. The corresponding field in the
target vector is named @samp{_bfd_truncate_arname}.

@item _write_armap
Write out the archive symbol table using calls to @samp{bfd_write}.
This is normally called from the archive @samp{write_contents} routine.
The corresponding field in the target vector is named @samp{write_armap}
(no leading underscore).

@item _read_ar_hdr
Read and parse an archive header. This handles expanding the archive
header name into the real file name using the extended name table. This
is called by routines which read the archive symbol table or the archive
itself. The corresponding field in the target vector is named
@samp{_bfd_read_ar_hdr_fn}.

@item _openr_next_archived_file
Given an archive and a BFD representing a file stored within the
archive, return a BFD for the next file in the archive. This is called
via @samp{bfd_openr_next_archived_file}. The corresponding field in the
target vector is named @samp{openr_next_archived_file} (no leading
underscore).

@item _get_elt_at_index
Given an archive and an index, return a BFD for the file in the archive
corresponding to that entry in the archive symbol table. This is called
via @samp{bfd_get_elt_at_index}. The corresponding field in the target
vector is named @samp{_bfd_get_elt_at_index}.

@item _generic_stat_arch_elt
Do a stat on an element of an archive, returning information read from
the archive header (modification time, uid, gid, file mode, size). This
is called via @samp{bfd_stat_arch_elt}.
The corresponding field in the
target vector is named @samp{_bfd_stat_arch_elt}.

@item _update_armap_timestamp
After the entire contents of an archive have been written out, update
the timestamp of the archive symbol table to be newer than that of the
file. This is required for a.out style archives. This is normally
called by the archive @samp{write_contents} routine. The corresponding
field in the target vector is named @samp{_bfd_update_armap_timestamp}.
@end table

@node BFD target vector symbols
@subsection Symbol table functions
@cindex @samp{BFD_JUMP_TABLE_SYMBOLS}

The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
with symbols.

@table @samp
@item _get_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the symbol table. In practice most targets return the
amount of memory required to hold @samp{asymbol} pointers for all the
symbols plus a trailing @samp{NULL} entry, and store the actual symbol
information in BFD private data. This is called via
@samp{bfd_get_symtab_upper_bound}. The corresponding field in the
target vector is named @samp{_bfd_get_symtab_upper_bound}.

@item _canonicalize_symtab
Read in the symbol table. This is called via
@samp{bfd_canonicalize_symtab}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_symtab}.

@item _make_empty_symbol
Create an empty symbol for the BFD. This is needed because most targets
store extra information with each symbol by allocating a structure
larger than an @samp{asymbol} and storing the extra information at the
end. This function will allocate the right amount of memory, and return
what looks like a pointer to an empty @samp{asymbol}. This is called
via @samp{bfd_make_empty_symbol}. The corresponding field in the target
vector is named @samp{_bfd_make_empty_symbol}.

@item _print_symbol
Print information about the symbol. This is called via
@samp{bfd_print_symbol}. One of the arguments indicates what sort of
information should be printed:

@table @samp
@item bfd_print_symbol_name
Just print the symbol name.
@item bfd_print_symbol_more
Print the symbol name and some interesting flags. I don't think
anything actually uses this.
@item bfd_print_symbol_all
Print all information about the symbol. This is used by @samp{objdump}
when run with the @samp{-t} option.
@end table
The corresponding field in the target vector is named
@samp{_bfd_print_symbol}.

@item _get_symbol_info
Return a standard set of information about the symbol. This is called
via @samp{bfd_get_symbol_info}. The corresponding field in the target
vector is named @samp{_bfd_get_symbol_info}.

@item _bfd_is_local_label_name
Return whether the given string would normally represent the name of a
local label. This is called via @samp{bfd_is_local_label} and
@samp{bfd_is_local_label_name}. Local labels are normally discarded by
the assembler. In the linker, this defines the difference between the
@samp{-x} and @samp{-X} options.

@item _get_lineno
Return line number information for a symbol. This is only meaningful
for a COFF target. This is called when writing out COFF line numbers.

@item _find_nearest_line
Given an address within a section, use the debugging information to find
the matching file name, function name, and line number, if any. This is
called via @samp{bfd_find_nearest_line}. The corresponding field in the
target vector is named @samp{_bfd_find_nearest_line}.

@item _bfd_make_debug_symbol
Make a debugging symbol. This is only meaningful for a COFF target,
where it simply returns a symbol which will be placed in the
@samp{N_DEBUG} section when it is written out. This is called via
@samp{bfd_make_debug_symbol}.

@item _read_minisymbols
Minisymbols are used to reduce the memory requirements of programs like
@samp{nm}. A minisymbol is a cookie pointing to internal symbol
information which the caller can use to extract complete symbol
information. This permits BFD to not convert all the symbols into
generic form, but to instead convert them one at a time. This is called
via @samp{bfd_read_minisymbols}. Most targets do not implement this,
and just use generic support which is based on using standard
@samp{asymbol} structures.

@item _minisymbol_to_symbol
Convert a minisymbol to a standard @samp{asymbol}. This is called via
@samp{bfd_minisymbol_to_symbol}.
@end table

@node BFD target vector relocs
@subsection Relocation support
@cindex @samp{BFD_JUMP_TABLE_RELOCS}

The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
with relocations.

@table @samp
@item _get_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the relocations for a section. In practice most
targets return the amount of memory required to hold @samp{arelent}
pointers for all the relocations plus a trailing @samp{NULL} entry, and
store the actual relocation information in BFD private data. This is
called via @samp{bfd_get_reloc_upper_bound}.

@item _canonicalize_reloc
Return the relocation information for a section. This is called via
@samp{bfd_canonicalize_reloc}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_reloc}.

@item _bfd_reloc_type_lookup
Given a relocation code, return the corresponding howto structure
(@pxref{BFD relocation codes}). This is called via
@samp{bfd_reloc_type_lookup}. The corresponding field in the target
vector is named @samp{reloc_type_lookup}.
@end table

@node BFD target vector write
@subsection Output functions
@cindex @samp{BFD_JUMP_TABLE_WRITE}

The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
with writing out a BFD.

@table @samp
@item _set_arch_mach
Set the architecture and machine number for a BFD. This is called via
@samp{bfd_set_arch_mach}. Most targets implement this by calling
@samp{bfd_default_set_arch_mach}. The corresponding field in the target
vector is named @samp{_bfd_set_arch_mach}.

@item _set_section_contents
Write out the contents of a section. This is called via
@samp{bfd_set_section_contents}. The corresponding field in the target
vector is named @samp{_bfd_set_section_contents}.
@end table

@node BFD target vector link
@subsection Linker functions
@cindex @samp{BFD_JUMP_TABLE_LINK}

The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
linker.

@table @samp
@item _sizeof_headers
Return the size of the header information required for a BFD. This is
used to implement the @samp{SIZEOF_HEADERS} linker script function. It
is normally used to align the first section at an efficient position on
the page. This is called via @samp{bfd_sizeof_headers}. The
corresponding field in the target vector is named
@samp{_bfd_sizeof_headers}.

@item _bfd_get_relocated_section_contents
Read the contents of a section and apply the relocation information.
This handles both a final link and a relocatable link; in the latter
case, it adjusts the relocation information as well. This is called via
@samp{bfd_get_relocated_section_contents}. Most targets implement it by
calling @samp{bfd_generic_get_relocated_section_contents}.

@item _bfd_relax_section
Try to use relaxation to shrink the size of a section. This is called
by the linker when the @samp{--relax} option is used. This is called via
@samp{bfd_relax_section}.
Most targets do not support any sort of
relaxation.

@item _bfd_link_hash_table_create
Create the symbol hash table to use for the linker. This linker hook
permits the backend to control the size and information of the elements
in the linker symbol hash table. This is called via
@samp{bfd_link_hash_table_create}.

@item _bfd_link_add_symbols
Given an object file or an archive, add all symbols into the linker
symbol hash table. Use callbacks to the linker to include archive
elements in the link. This is called via @samp{bfd_link_add_symbols}.

@item _bfd_final_link
Finish the linking process. The linker calls this hook after all of the
input files have been read, when it is ready to finish the link and
generate the output file. This is called via @samp{bfd_final_link}.

@item _bfd_link_split_section
I don't know what this is for. Nothing seems to call it. The only
non-trivial definition is in @file{som.c}.
@end table

@node BFD target vector dynamic
@subsection Dynamic linking information functions
@cindex @samp{BFD_JUMP_TABLE_DYNAMIC}

The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
dynamic linking information.

@table @samp
@item _get_dynamic_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic symbol table. In practice most targets
return the amount of memory required to hold @samp{asymbol} pointers for
all the symbols plus a trailing @samp{NULL} entry, and store the actual
symbol information in BFD private data. This is called via
@samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.

@item _canonicalize_dynamic_symtab
Read the dynamic symbol table. This is called via
@samp{bfd_canonicalize_dynamic_symtab}.
The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.

@item _get_dynamic_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic relocations. In practice most targets
return the amount of memory required to hold @samp{arelent} pointers for
all the relocations plus a trailing @samp{NULL} entry, and store the
actual relocation information in BFD private data. This is called via
@samp{bfd_get_dynamic_reloc_upper_bound}. The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.

@item _canonicalize_dynamic_reloc
Read the dynamic relocations. This is called via
@samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
@end table

@node BFD generated files
@section BFD generated files
@cindex generated files in bfd
@cindex bfd generated files

BFD contains several automatically generated files. This section
describes them. Some files are created at configure time, when you
configure BFD. Some files are created at make time, when you build
BFD. Some files are automatically rebuilt at make time, but only if
you configure with the @samp{--enable-maintainer-mode} option. Some
files live in the object directory---the directory from which you run
configure---and some live in the source directory. All files that live
in the source directory are checked into the git repository.

@table @file
@item bfd.h
@cindex @file{bfd.h}
@cindex @file{bfd-in3.h}
Lives in the object directory. Created at make time from
@file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
configure time from @file{bfd-in2.h}.
There are automatic dependencies
to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
changes, so you can normally ignore @file{bfd-in3.h}, and just think
about @file{bfd-in2.h} and @file{bfd.h}.

@file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
control whether BFD is built for a 32 bit target or a 64 bit target.

@item bfd-in2.h
@cindex @file{bfd-in2.h}
Lives in the source directory. Created from @file{bfd-in.h} and several
other BFD source files. If you configure with the
@samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
automatically when a source file changes.

@item elf32-target.h
@itemx elf64-target.h
@cindex @file{elf32-target.h}
@cindex @file{elf64-target.h}
Live in the object directory. Created from @file{elfxx-target.h}.
These files are versions of @file{elfxx-target.h} customized for either
a 32 bit ELF target or a 64 bit ELF target.

@item libbfd.h
@cindex @file{libbfd.h}
Lives in the source directory. Created from @file{libbfd-in.h} and
several other BFD source files. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
automatically when a source file changes.

@item libcoff.h
@cindex @file{libcoff.h}
Lives in the source directory. Created from @file{libcoff-in.h} and
@file{coffcode.h}. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
automatically when a source file changes.

@item targmatch.h
@cindex @file{targmatch.h}
Lives in the object directory. Created at make time from
@file{config.bfd}. This file is used to map configuration triplets into
BFD target vector variable names at run time.
@end table

@node BFD multiple compilations
@section Files compiled multiple times in BFD
Several files in BFD are compiled multiple times. By this I mean that
there are header files which contain function definitions. These header
files are included by other files, and thus the functions are compiled
once per file which includes them.

Preprocessor macros are used to control the compilation, so that each
time the files are compiled the resulting functions are slightly
different. Naturally, if they weren't different, there would be no
reason to compile them multiple times.

This is not a particularly good programming technique, and future BFD
work should avoid it.

@itemize @bullet
@item
Since this technique is rarely used, even experienced C programmers find
it confusing.

@item
It is difficult to debug programs which use BFD, since there is no way
to describe which version of a particular function you are looking at.

@item
Programs which use BFD wind up incorporating two or more slightly
different versions of the same function, which wastes space in the
executable.

@item
This technique is never required, nor is it especially efficient. It is
always possible to use statically initialized structures holding
function pointers and magic constants instead.
@end itemize

The following is a list of the files which are compiled multiple times.

@table @file
@item aout-target.h
@cindex @file{aout-target.h}
Describes a few functions and the target vector for a.out targets. This
is used by individual a.out targets with different definitions of
@samp{N_TXTADDR} and similar a.out macros.

@item aoutf1.h
@cindex @file{aoutf1.h}
Implements standard SunOS a.out files.
In principle it supports 64 bit
a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
since all known a.out targets are 32 bits, this code may or may not
work. This file is only included by a few other files, and it is
difficult to justify its existence.

@item aoutx.h
@cindex @file{aoutx.h}
Implements basic a.out support routines. This file can be compiled for
either 32 or 64 bit support. Since all known a.out targets are 32 bits,
the 64 bit support may or may not work. I believe the original
intention was that this file would only be included by @samp{aout32.c}
and @samp{aout64.c}, and that other a.out targets would simply refer to
the functions it defined. Unfortunately, some other a.out targets
started including it directly, leading to a somewhat confused state of
affairs.

@item coffcode.h
@cindex @file{coffcode.h}
Implements basic COFF support routines. This file is included by every
COFF target. It implements code which handles COFF magic numbers as
well as various hook functions called by the generic COFF functions in
@file{coffgen.c}. This file is controlled by a number of different
macros, and more are added regularly.

@item coffswap.h
@cindex @file{coffswap.h}
Implements COFF swapping routines. This file is included by
@file{coffcode.h}, and thus by every COFF target. It implements the
routines which swap COFF structures between internal and external
format. The main control for this file is the external structure
definitions in the files in the @file{include/coff} directory. A COFF
target file will include one of those files before including
@file{coffcode.h} and thus @file{coffswap.h}. There are a few other
macros which affect @file{coffswap.h} as well, mostly describing whether
certain fields are present in the external structures.

@item ecoffswap.h
@cindex @file{ecoffswap.h}
Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
for ECOFF. It is included by the ECOFF target files (of which there are
only two). The control is the preprocessor macro @samp{ECOFF_32} or
@samp{ECOFF_64}.

@item elfcode.h
@cindex @file{elfcode.h}
Implements ELF functions that use external structure definitions. This
file is included by two other files: @file{elf32.c} and @file{elf64.c}.
It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
@samp{32} or @samp{64} before including it. The @samp{NAME} macro is
used internally to give the functions different names for the two target
sizes.

@item elfcore.h
@cindex @file{elfcore.h}
Like @file{elfcode.h}, but for functions that are specific to ELF core
files. This is included only by @file{elfcode.h}.

@item elfxx-target.h
@cindex @file{elfxx-target.h}
This file is the source for the generated files @file{elf32-target.h}
and @file{elf64-target.h}, one of which is included by every ELF target.
It defines the ELF target vector.

@item netbsd.h
@cindex @file{netbsd.h}
Used by all NetBSD a.out targets. Several other files include it.

@item peicode.h
@cindex @file{peicode.h}
Provides swapping routines and other hooks for PE targets.
@file{coffcode.h} will include this rather than @file{coffswap.h} for a
PE target. This defines PE specific versions of the COFF swapping
routines, and also defines some macros which control @file{coffcode.h}
itself.
@end table

@node BFD relocation handling
@section BFD relocation handling
@cindex bfd relocation handling
@cindex relocations in bfd

The handling of relocations is one of the more confusing aspects of BFD.
Relocation handling has been implemented in various different ways, all
somewhat incompatible, none perfect.

@menu
* BFD relocation concepts::     BFD relocation concepts
* BFD relocation functions::    BFD relocation functions
* BFD relocation codes::        BFD relocation codes
* BFD relocation future::       BFD relocation future
@end menu

@node BFD relocation concepts
@subsection BFD relocation concepts

A relocation is an action which the linker must take when linking. It
describes a change to the contents of a section. The change is normally
based on the final value of one or more symbols. Relocations are
created by the assembler when it creates an object file.

Most relocations are simple. A typical simple relocation is to set 32
bits at a given offset in a section to the value of a symbol. This type
of relocation would be generated for code like @code{int *p = &i;} where
@samp{p} and @samp{i} are global variables. A relocation for the symbol
@samp{i} would be generated such that the linker would initialize the
area of memory which holds the value of @samp{p} to the value of the
symbol @samp{i}.

Slightly more complex relocations may include an addend, which is a
constant to add to the symbol value before using it. In some cases a
relocation will require adding the symbol value to the existing contents
of the section in the object file. In others the relocation will simply
replace the contents of the section with the symbol value. Some
relocations are PC relative, so that the value to be stored in the
section is the difference between the value of a symbol and the final
address of the section contents.

In general, relocations can be arbitrarily complex.
For example,
relocations used in dynamic linking systems often require the linker to
allocate space in a different section and use the offset within that
section as the value to store.

When doing a relocatable link, the linker may or may not have to do
anything with a relocation, depending upon the definition of the
relocation. Simple relocations generally do not require any special
action.

@node BFD relocation functions
@subsection BFD relocation functions

In BFD, each section has an array of @samp{arelent} structures. Each
structure has a pointer to a symbol, an address within the section, an
addend, and a pointer to a @samp{reloc_howto_struct} structure. The
howto structure has a bunch of fields describing the reloc, including a
type field. The type field is specific to the object file format
backend; none of the generic code in BFD examines it.

Originally, the function @samp{bfd_perform_relocation} was supposed to
handle all relocations. In theory, many relocations would be simple
enough to be described by the fields in the howto structure. For those
that weren't, the howto structure included a @samp{special_function}
field to use as an escape.

While this seems plausible, a look at @samp{bfd_perform_relocation}
shows that it failed. The function has odd special cases. Some of the
fields in the howto structure, such as @samp{pcrel_offset}, were not
adequately documented.

The linker uses @samp{bfd_perform_relocation} to do all relocations when
the input and output file have different formats (e.g., when generating
S-records).
The generic linker code, which is used by all targets which
do not define their own special purpose linker, uses
@samp{bfd_get_relocated_section_contents}, which for most targets turns
into a call to @samp{bfd_generic_get_relocated_section_contents}, which
calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
is still widely used, which makes it difficult to change, since it is
difficult to test all possible cases.

The assembler used @samp{bfd_perform_relocation} for a while. This
turned out to be the wrong thing to do, since
@samp{bfd_perform_relocation} was written to handle relocations on an
existing object file, while the assembler needed to create relocations
in a new object file. The assembler was changed to use the new function
@samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
was created as a copy of @samp{bfd_perform_relocation}.

Unfortunately, the work did not progress any farther, so
@samp{bfd_install_relocation} remains a simple copy of
@samp{bfd_perform_relocation}, with all the odd special cases and
confusing code. This again is difficult to change, because again any
change can affect any assembler target, and so is difficult to test.

The new linker, when using the same object file format for all input
files and the output file, does not convert relocations into
@samp{arelent} structures, so it cannot use
@samp{bfd_perform_relocation} at all. Instead, users of the new linker
are expected to write a @samp{relocate_section} function which will
handle relocations in a target specific fashion.

There are two helper functions for target specific relocation:
@samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
These functions use a howto structure, but they @emph{do not} use the
@samp{special_function} field.
Since the functions are normally called
from target specific code, the @samp{special_function} field adds
little; any relocations which require special handling can be handled
without calling those functions.

So, if you want to add a new target, or add a new relocation to an
existing target, you need to do the following:

@itemize @bullet
@item
Make sure you clearly understand what the contents of the section should
look like after assembly, after a relocatable link, and after a final
link. Make sure you clearly understand the operations the linker must
perform during a relocatable link and during a final link.

@item
Write a howto structure for the relocation. The howto structure is
flexible enough to represent any relocation which should be handled by
setting a contiguous bitfield in the destination to the value of a
symbol, possibly with an addend, possibly adding the symbol value to the
value already present in the destination.

@item
Change the assembler to generate your relocation. The assembler will
call @samp{bfd_install_relocation}, so your howto structure has to be
able to handle that. You may need to set the @samp{special_function}
field to handle assembly correctly. Be careful to ensure that any code
you write to handle the assembler will also work correctly when doing a
relocatable link. For example, see @samp{bfd_elf_generic_reloc}.

@item
Test the assembler. Consider the cases of relocation against an
undefined symbol, a common symbol, a symbol defined in the object file
in the same section, and a symbol defined in the object file in a
different section. These cases may not all be applicable for your
reloc.

@item
If your target uses the new linker, which is recommended, add any
required handling to the target specific relocation function.
In simple
cases this will just involve a call to @samp{_bfd_final_link_relocate}
or @samp{_bfd_relocate_contents}, depending upon the definition of the
relocation and whether the link is relocatable or not.

@item
Test the linker. Test the case of a final link. If the relocation can
overflow, use a linker script to force an overflow and make sure the
error is reported correctly. Test a relocatable link, both when the
symbol is defined and when it is undefined in the relocatable output.
For both the
final and relocatable link, test the case when the symbol is a common
symbol, when the symbol looked like a common symbol but became a defined
symbol, when the symbol is defined in a different object file, and when
the symbol is defined in the same object file.

@item
In order for linking to another object file format, such as S-records,
to work correctly, @samp{bfd_perform_relocation} has to do the right
thing for the relocation. You may need to set the
@samp{special_function} field to handle this correctly. Test this by
doing a link in which the output object file format is S-records.

@item
Using the linker to generate relocatable output in a different object
file format is impossible in the general case, so you generally don't
have to worry about that. The GNU linker makes sure to stop that from
happening when an input file in a different format has relocations.

Linking input files of different object file formats together is quite
unusual, but if you're really dedicated you may want to consider testing
this case, both when the output object file format is the same as your
format, and when it is different.
@end itemize

@node BFD relocation codes
@subsection BFD relocation codes

BFD has another way of describing relocations besides the howto
structures described above: the enum @samp{bfd_reloc_code_real_type}.
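
As a rough sketch of how a format independent relocation code relates to
format specific howto structures, consider the following self-contained
C fragment. It does not use BFD at all: every name in it
(@samp{toy_reloc_code}, @samp{toy_howto}, @samp{toy_target},
@samp{toy_reloc_type_lookup}) is invented for illustration, the
relocation numbers are made up, and the real structures carry far more
information.

@verbatim
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic relocation codes; a tiny stand-in for an
   enumeration shared by every object file format.  */
enum toy_reloc_code
{
  TOY_RELOC_32,        /* plain 32 bit absolute relocation */
  TOY_RELOC_32_PCREL   /* 32 bit PC relative relocation */
};

/* Hypothetical, much reduced howto: only the format specific details
   needed for this illustration.  */
struct toy_howto
{
  unsigned int type;       /* relocation number used in the file */
  unsigned int bitsize;    /* width of the relocated field */
  int pc_relative;         /* nonzero for PC relative relocations */
  int addend_in_contents;  /* nonzero if the addend is stored in the
                              section contents, not in the reloc */
};

/* Hypothetical target vector holding only the lookup entry point.  */
struct toy_target
{
  const char *name;
  const struct toy_howto *(*reloc_type_lookup) (enum toy_reloc_code);
};

/* Format A keeps addends in the section contents.  */
static const struct toy_howto format_a_howtos[] =
{
  { 2, 32, 0, 1 },
  { 4, 32, 1, 1 }
};

const struct toy_howto *
format_a_lookup (enum toy_reloc_code code)
{
  switch (code)
    {
    case TOY_RELOC_32:       return &format_a_howtos[0];
    case TOY_RELOC_32_PCREL: return &format_a_howtos[1];
    }
  return NULL;
}

/* Format B keeps addends in the relocation entry and has no PC
   relative relocation at all.  */
static const struct toy_howto format_b_howtos[] =
{
  { 1, 32, 0, 0 }
};

const struct toy_howto *
format_b_lookup (enum toy_reloc_code code)
{
  return code == TOY_RELOC_32 ? &format_b_howtos[0] : NULL;
}

const struct toy_target toy_format_a = { "format-a", format_a_lookup };
const struct toy_target toy_format_b = { "format-b", format_b_lookup };

/* Generic entry point: just dispatch through the target vector.  */
const struct toy_howto *
toy_reloc_type_lookup (const struct toy_target *target,
                       enum toy_reloc_code code)
{
  assert (target != NULL && target->reloc_type_lookup != NULL);
  return target->reloc_type_lookup (code);
}
@end verbatim

In this sketch, looking up @samp{TOY_RELOC_32} through
@samp{toy_format_a} and through @samp{toy_format_b} yields two different
howto entries with different addend handling, and the PC relative code
has no howto at all in the second format, so a caller has to be prepared
for a @samp{NULL} result.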

Every known relocation type can be described as a value in this
enumeration. The enumeration contains many target specific relocations,
but where two or more targets have the same relocation, a single code is
used. For example, the single value @samp{BFD_RELOC_32} is used for all
simple 32 bit relocation types.

The main purpose of this relocation code is to give the assembler some
mechanism to create @samp{arelent} structures. In order for the
assembler to create an @samp{arelent} structure, it has to be able to
obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
which simply calls the target vector entry point
@samp{reloc_type_lookup}, takes a relocation code and returns a howto
structure.

The function @samp{bfd_get_reloc_code_name} returns the name of a
relocation code. This is mainly used in error messages.

Using both howto structures and relocation codes can be somewhat
confusing. There are many processor specific relocation codes.
However, the relocation is only fully defined by the howto structure.
The same relocation code will map to different howto structures in
different object file formats. For example, the addend handling may be
different.

Most of the relocation codes are not really general. The assembler
cannot use them without already understanding what sorts of relocations can
be used for a particular target. It might be possible to replace the
relocation codes with something simpler.

@node BFD relocation future
@subsection BFD relocation future

Clearly the current BFD relocation support is in bad shape. A
wholesale rewrite would be very difficult, because it would require
thorough testing of every BFD target. So some sort of incremental
change is required.

My vague thoughts on this would involve introducing a new, clearly
defined howto structure.
Some mechanism would be used to determine which type
of howto structure was being used by a particular format.

The new howto structure would clearly define the relocation behaviour in
the case of an assembly, a relocatable link, and a final link. At
least one special function would be defined as an escape, and it might
make sense to define more.

One or more generic functions similar to @samp{bfd_perform_relocation}
would be written to handle the new howto structure.

This should make it possible to write a generic version of the relocate
section functions used by the new linker. The target specific code
would provide some mechanism (a function pointer or an initial
conversion) to convert target specific relocations into howto
structures.

Ideally it would be possible to use this generic relocate section
function for the generic linker as well. That is, it would replace the
@samp{bfd_generic_get_relocated_section_contents} function which is
currently normally used.

For the special case of ELF dynamic linking, more consideration needs to
be given to writing ELF specific but ELF target generic code to handle
special relocation types such as GOT and PLT.

@node BFD ELF support
@section BFD ELF support
@cindex elf support in bfd
@cindex bfd elf support

The ELF object file format is defined in two parts: a generic ABI and a
processor specific supplement. The ELF support in BFD is split in a
similar fashion. The processor specific support is largely kept within
a single file. The generic support is provided by several other files.
The processor specific support provides a set of function pointers and
constants used by the generic support.
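
The division of labour can be pictured with a small self-contained C
sketch. It is not BFD code: the names @samp{toy_elf_backend},
@samp{toy32_store_word}, @samp{toy32_backend} and
@samp{toy_generic_relocate}, as well as the machine number, are
hypothetical and chosen only to show generic code working through
constants and function pointers supplied by a processor specific file.

@verbatim
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend description: the constants and function
   pointers a processor specific file hands to the generic code.
   None of these names are taken from BFD.  */
struct toy_elf_backend
{
  unsigned int machine;  /* constant: machine number for the header */
  size_t word_size;      /* constant: bytes patched by a relocation */
  /* Hook: store VALUE at WHERE; return 0 on success, -1 on overflow.  */
  int (*store_word) (uint8_t *where, uint64_t value);
};

/* Processor specific part for an imaginary 32 bit little endian CPU.  */
int
toy32_store_word (uint8_t *where, uint64_t value)
{
  int i;

  if (value > UINT32_MAX)
    return -1;
  for (i = 0; i < 4; i++)
    where[i] = (uint8_t) (value >> (8 * i));
  return 0;
}

const struct toy_elf_backend toy32_backend =
{
  0x1234, 4, toy32_store_word
};

/* Generic part: it never mentions the CPU, only the backend.  */
int
toy_generic_relocate (const struct toy_elf_backend *backend,
                      uint8_t *contents, size_t size,
                      size_t offset, uint64_t value)
{
  assert (backend != NULL && backend->store_word != NULL);
  if (offset > size || size - offset < backend->word_size)
    return -1;  /* the relocation would run past the section */
  return backend->store_word (contents + offset, value);
}
@end verbatim

Supporting another imaginary processor in this sketch would mean writing
a second @samp{store_word} function and a second
@samp{toy_elf_backend} initializer, without touching
@samp{toy_generic_relocate}.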

@menu
* BFD ELF sections and segments::       ELF sections and segments
* BFD ELF generic support::             BFD ELF generic support
* BFD ELF processor specific support::  BFD ELF processor specific support
* BFD ELF core files::                  BFD ELF core files
* BFD ELF future::                      BFD ELF future
@end menu

@node BFD ELF sections and segments
@subsection ELF sections and segments

The ELF ABI permits a file to have either sections or segments or both.
Relocatable object files conventionally have only sections.
Executables conventionally have both. Core files conventionally have
only program segments.

ELF sections are similar to sections in other object file formats: they
have a name, a VMA, file contents, flags, and other miscellaneous
information. ELF relocations are stored in sections of a particular
type; BFD automatically converts these sections into internal relocation
information.

ELF program segments are intended for fast interpretation by a system
loader. They have a type, a VMA, an LMA, file contents, and a couple of
other fields. When an ELF executable is run on a Unix system, the
system loader will examine the program segments to decide how to load
it. The loader will ignore the section information. Loadable program
segments (type @samp{PT_LOAD}) are directly loaded into memory. Other
program segments are interpreted by the loader, and generally provide
dynamic linking information.

When an ELF file has both program segments and sections, an ELF program
segment may encompass one or more ELF sections, in the sense that the
portion of the file which corresponds to the program segment may include
the portions of the file corresponding to one or more sections.
When
there is more than one section in a loadable program segment, the
relative positions of the section contents in the file must correspond
to the relative positions they should hold when the program segment is
loaded. This requirement should be obvious if you consider that the
system loader will load an entire program segment at a time.

On a system which supports demand paging, such as any native Unix
system, the contents of a loadable program segment must be at the same
offset in the file as in memory, modulo the memory page size used on the
system. This is because the system loader will map the file into memory
starting at the start of a page. The system loader can easily remap
entire pages to the correct load address. However, if the contents of
the file were not correctly aligned within the page, the system loader
would have to shift the contents around within the page, which is too
expensive. For example, if the LMA of a loadable program segment is
@samp{0x40080} and the page size is @samp{0x1000}, then the position of
the segment contents within the file must equal @samp{0x80} modulo
@samp{0x1000}.

BFD has only a single set of sections. It does not provide any generic
way to examine both sections and segments. When BFD is used to open an
object file or executable, the BFD sections will represent ELF sections.
When BFD is used to open a core file, the BFD sections will represent
ELF program segments.

When BFD is used to examine an object file or executable, any program
segments will be read to set the LMA of the sections. This is because
ELF sections only have a VMA, while ELF program segments have both a VMA
and an LMA. Any program segments will be copied by the
@samp{copy_private} entry points. They will be printed by the
@samp{print_private} entry point. Otherwise, the program segments are
ignored.
In particular, programs which use BFD currently have no direct
access to the program segments.

When BFD is used to create an executable, the program segments will be
created automatically based on the section information. This is done in
the function @samp{assign_file_positions_for_segments} in @file{elf.c}.
This function has been tweaked many times, and probably still has
problems that arise in particular cases.

There is a hook which may be used to explicitly define the program
segments when creating an executable: the @samp{bfd_record_phdr}
function in @file{bfd.c}. If this function is called, BFD will not
create program segments itself, but will only create the program
segments specified by the caller. The linker uses this function to
implement the @samp{PHDRS} linker script command.

@node BFD ELF generic support
@subsection BFD ELF generic support

In general, functions which do not read external data from the ELF file
are found in @file{elf.c}. They operate on the internal forms of the
ELF structures, which are defined in @file{include/elf/internal.h}. The
internal structures are defined in terms of @samp{bfd_vma}, and so may
be used for both 32 bit and 64 bit ELF targets.

The file @file{elfcode.h} contains functions which operate on the
external data. @file{elfcode.h} is compiled twice, once via
@file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
@file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
@file{elfcode.h} includes functions to swap the ELF structures in and
out of external form, as well as a few more complex functions.

Linker support is found in @file{elflink.c}. The
linker support is only used if the processor specific file defines
@samp{elf_backend_relocate_section}, which is required to relocate the
section contents.
If that macro is not defined, the generic linker code
is used, and relocations are handled via @samp{bfd_perform_relocation}.

The core file support is in @file{elfcore.h}, which is compiled twice,
for both 32 and 64 bit support. The more interesting cases of core file
support only work on a native system which has the @file{sys/procfs.h}
header file. Without that file, the core file support does little more
than read the ELF program segments as BFD sections.

The BFD internal header file @file{elf-bfd.h} is used for communication
among these files and the processor specific files.

The default entries for the BFD ELF target vector are found mainly in
@file{elf.c}. Some functions are found in @file{elfcode.h}.

The processor specific files may override particular entries in the
target vector, but most do not, with one exception: the
@samp{bfd_reloc_type_lookup} entry point is always processor specific.

@node BFD ELF processor specific support
@subsection BFD ELF processor specific support

By convention, the processor specific support for a particular processor
will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
either 32 or 64, and @var{cpu} is the name of the processor.

@menu
* BFD ELF processor required::  Required processor specific support
* BFD ELF processor linker::    Processor specific linker support
* BFD ELF processor other::     Other processor specific support options
@end menu

@node BFD ELF processor required
@subsubsection Required processor specific support

When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
following:

@itemize @bullet
@item
Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
both, to a unique C name to use for the target vector.
This name should
appear in the list of target vectors in @file{targets.c}, and will also
have to appear in @file{config.bfd} and @file{configure.ac}. Define
@samp{TARGET_BIG_SYM} for a big-endian processor,
@samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
for a bi-endian processor.
@item
Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
both, to a string used as the name of the target vector. This is the
name which a user of the BFD tool would use to specify the object file
format. It would normally appear in a linker emulation parameters
file.
@item
Define @samp{ELF_ARCH} to the BFD architecture (an element of the
@samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
@item
Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
in the @samp{e_machine} field of the ELF header. As of this writing,
these magic numbers are assigned by Caldera; if you want to get a magic
number for a particular processor, try sending a note to
@email{registry@@caldera.com}. In the BFD sources, the magic numbers are
found in @file{include/elf/common.h}; they have names beginning with
@samp{EM_}.
@item
Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
memory. This can normally be found at the start of chapter 5 in the
processor specific supplement. For a processor which will only be used
in an embedded system, or which has no memory management hardware, this
can simply be @samp{1}.
@item
If the format should use @samp{Rel} rather than @samp{Rela} relocations,
define @samp{USE_REL}. This is normally defined in chapter 4 of the
processor specific supplement.

In the absence of a supplement, it's easier to work with @samp{Rela}
relocations.
@samp{Rela} relocations will require more space in object
files (but not in executables, except when using dynamic linking).
However, this is outweighed by the simplicity of addend handling when
using @samp{Rela} relocations. With @samp{Rel} relocations, the addend
must be stored in the section contents, which makes relocatable links
more complex.

For example, consider C code like @code{i = a[1000];} where @samp{a} is
a global array. The instructions which load the value of @samp{a[1000]}
will most likely use a relocation which refers to the symbol
representing @samp{a}, with an addend that gives the offset from the
start of @samp{a} to element @samp{1000}. When using @samp{Rel}
relocations, that addend must be stored in the instructions themselves.
If you are adding support for a RISC chip which uses two or more
instructions to load an address, then the addend may not fit in a single
instruction, and will have to be somehow split among the instructions.
This makes linking awkward, particularly when doing a relocatable link
in which the addend may have to be updated. It can be done---the MIPS
ELF support does it---but it should be avoided when possible.

It is possible, though somewhat awkward, to support both @samp{Rel} and
@samp{Rela} relocations for a single target; @file{elf64-mips.c} does it
by overriding the relocation reading and writing routines.
@item
Define howto structures for all the relocation types.
@item
Define a @samp{bfd_reloc_type_lookup} routine. This must be named
@samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
function or a macro. It must translate a BFD relocation code into a
howto structure. This is normally a table lookup or a simple switch.
@item
If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
Either way, this is a macro defined as the name of a function which
takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
sets the @samp{howto} field of the @samp{arelent} based on the
@samp{Rel} or @samp{Rela} structure. This normally uses
@samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
an index into a table of howto structures.
@end itemize

You must also add the magic number for this processor to the
@samp{prep_headers} function in @file{elf.c}.

You must also create a header file in the @file{include/elf} directory
called @file{@var{cpu}.h}. This file should define any target specific
information which may be needed outside of the BFD code. In particular
it should use the @samp{START_RELOC_NUMBERS}, @samp{RELOC_NUMBER},
@samp{FAKE_RELOC}, @samp{EMPTY_RELOC} and @samp{END_RELOC_NUMBERS}
macros to create a table mapping the number used to identify a
relocation to a name describing that relocation.

While not a BFD component, you probably also want to make the binutils
program @samp{readelf} parse your ELF objects. For this, you need to add
code for @code{EM_@var{cpu}} as appropriate in @file{binutils/readelf.c}.

@node BFD ELF processor linker
@subsubsection Processor specific linker support

The linker will be much more efficient if you define a relocate section
function. This will permit BFD to use the ELF specific linker support.

If you do not define a relocate section function, BFD must use the
generic linker support, which requires converting all symbols and
relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
this case, relocations will be handled by calling
@samp{bfd_perform_relocation}, which will use the howto structures you
have defined. @xref{BFD relocation handling}.
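
To make the role of the howto structures more concrete, here is a
simplified C model of how a howto-like descriptor, found by indexing a
table with the relocation type, can drive the application of one
relocation. It is only a sketch under stated assumptions:
@samp{demo_howto} and the @samp{R_DEMO_} relocation types are invented
for illustration, the real @samp{reloc_howto_type} has more fields, and
overflow checking and special functions are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of a howto entry (not the BFD type).  */
struct demo_howto
{
  unsigned type;        /* relocation type number */
  unsigned rightshift;  /* bits to shift the value right before storing */
  int pc_relative;      /* nonzero: subtract the address being patched */
  uint32_t dst_mask;    /* bits of the 32 bit word which are replaced */
};

/* Table indexed by the (invented) relocation type.  */
const struct demo_howto demo_howto_table[] =
{
  { 0, 0, 0, 0x00000000u },  /* R_DEMO_NONE: changes nothing */
  { 1, 0, 0, 0xffffffffu },  /* R_DEMO_32: full word, absolute */
  { 2, 2, 1, 0x00ffffffu },  /* R_DEMO_PC24: word offset, pc-relative */
};

/* Analogue of the info_to_howto step: use the relocation type as an
   index into the table.  Returns NULL for an unknown type.  */
const struct demo_howto *
demo_info_to_howto (unsigned r_type)
{
  size_t n = sizeof demo_howto_table / sizeof demo_howto_table[0];
  return r_type < n ? &demo_howto_table[r_type] : NULL;
}

/* Apply one relocation to WORD and return the patched word.  SYMBOL is
   the symbol value, ADDEND the explicit (Rela style) addend, and PLACE
   the address of the word being patched.  No overflow checking.  */
uint32_t
demo_apply_reloc (const struct demo_howto *howto, uint32_t word,
                  uint32_t symbol, uint32_t addend, uint32_t place)
{
  uint32_t value = symbol + addend;

  if (howto->pc_relative)
    value -= place;
  value >>= howto->rightshift;
  return (word & ~howto->dst_mask) | (value & howto->dst_mask);
}
```

With these definitions, applying @samp{R_DEMO_PC24} to the word
@samp{0xeb000000} for a symbol at @samp{0x2000}, with no addend, at
place @samp{0x1000} yields @samp{0xeb000400}.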

In order to support linking into a different object file format, such as
S-records, @samp{bfd_perform_relocation} must work correctly with your
howto structures, so you can't skip that step. However, if you define
the relocate section function, then in the normal case of linking into
an ELF file the linker will not need to convert symbols and relocations,
and will be much more efficient.

To use a relocate section function, define the macro
@samp{elf_backend_relocate_section} as the name of a function which will
take the contents of a section, as well as relocation, symbol, and other
information, and modify the section contents according to the relocation
information. In simple cases, this is little more than a loop over the
relocations which computes the value of each relocation and calls
@samp{_bfd_final_link_relocate}. The function must check for a
relocatable link, and in that case normally needs to do nothing other
than adjust the addend for relocations against a section symbol.

The complex cases generally have to do with dynamic linker support. GOT
and PLT relocations must be handled specially, and the linker normally
arranges to set up the GOT and PLT sections while handling relocations.
When generating a shared library, random relocations must normally be
copied into the shared library, or converted to RELATIVE relocations
when possible.

@node BFD ELF processor other
@subsubsection Other processor specific support options

There are many other macros which may be defined in
@file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
@file{elfxx-target.h}.

Macros may be used to override some of the generic ELF target vector
functions.

There are several processor specific hook functions which may be defined
as macros.
These functions are found as function pointers in the
@samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
general, a hook function is set by defining a macro
@samp{elf_backend_@var{name}}.

There are a few processor specific constants which may also be defined.
These are again found in the @samp{elf_backend_data} structure.

I will not define the various functions and constants here; see the
comments in @file{elf-bfd.h}.

Normally any odd characteristic of a particular ELF processor is handled
via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
section number found in MIPS ELF is handled via the hooks
@samp{section_from_bfd_section}, @samp{symbol_processing},
@samp{add_symbol_hook}, and @samp{output_symbol_hook}.

Dynamic linking support, which involves processor specific relocations
requiring special handling, is also implemented via hook functions.

@node BFD ELF core files
@subsection BFD ELF core files
@cindex elf core files

On native ELF Unix systems, core files are generated without any
sections. Instead, they only have program segments.

When BFD is used to read an ELF core file, the BFD sections will
actually represent program segments. Since ELF program segments do not
have names, BFD will invent names like @samp{segment@var{n}} where
@var{n} is a number.

A single ELF program segment may include both an initialized part and an
uninitialized part. The size of the initialized part is given by the
@samp{p_filesz} field. The total size of the segment is given by the
@samp{p_memsz} field. If @samp{p_memsz} is larger than @samp{p_filesz},
then the extra space is uninitialized, or, more precisely, initialized
to zero.

BFD will represent such a program segment as two different sections.
The first, named @samp{segment@var{n}a}, will represent the initialized
part of the program segment. The second, named @samp{segment@var{n}b},
will represent the uninitialized part.

ELF core files store special information such as register values in
program segments with the type @samp{PT_NOTE}. BFD will attempt to
interpret the information in these segments, and will create additional
sections holding the information. Some of this interpretation requires
information found in the host header file @file{sys/procfs.h}, and so
will only work when BFD is built on a native system.

BFD does not currently provide any way to create an ELF core file. In
general, BFD does not provide a way to create core files. The way to
implement this would be to write @samp{bfd_set_format} and
@samp{bfd_write_contents} routines for the @samp{bfd_core} type; see
@ref{BFD target vector format}.

@node BFD ELF future
@subsection BFD ELF future

The current dynamic linking support has too much code duplication.
While each processor has particular differences, much of the dynamic
linking support is quite similar for each processor. The GOT and PLT
are handled in fairly similar ways, the details of -Bsymbolic linking
are generally similar, etc. This code should be reworked to use more
generic functions, eliminating the duplication.

Similarly, the relocation handling has too much duplication. Many of
the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
quite similar. The relocate section functions are also often quite
similar, both in the standard linker handling and the dynamic linker
handling. Many of the COFF processor specific backends share a single
relocate section function (@samp{_bfd_coff_generic_relocate_section}),
and it should be possible to do something like this for the ELF targets
as well.

The appearance of the processor specific magic number in
@samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
possible to add support for a new processor without changing the generic
support.

The processor function hooks and constants are ad hoc and need better
documentation.

@node BFD glossary
@section BFD glossary
@cindex glossary for bfd
@cindex bfd glossary

This is a short glossary of some BFD terms.

@table @asis
@item a.out
The a.out object file format. The original Unix object file format.
Still used on SunOS, though not Solaris. Supports only three sections.

@item archive
A collection of object files produced and manipulated by the @samp{ar}
program.

@item backend
The implementation within BFD of a particular object file format. The
set of functions which appear in a particular target vector.

@item BFD
The BFD library itself. Also, each object file, archive, or executable
opened by the BFD library has the type @samp{bfd *}, and is sometimes
referred to as a bfd.

@item COFF
The Common Object File Format. Used on Unix SVR3. Used by some
embedded targets, although ELF is normally better.

@item DLL
A shared library on Windows.

@item dynamic linker
When a program linked against a shared library is run, the dynamic
linker will locate the appropriate shared library and arrange to somehow
include it in the running image.

@item dynamic object
Another name for an ELF shared library.

@item ECOFF
The Extended Common Object File Format. Used on Alpha Digital Unix
(formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.

@item ELF
The Executable and Linking Format. The object file format used on most
modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
used on many embedded systems.

@item executable
A program, with instructions and symbols, and perhaps dynamic linking
information. Normally produced by a linker.

@item LMA
Load Memory Address. This is the address at which a section will be
loaded. Compare with VMA, below.

@item object file
A binary file including machine instructions, symbols, and relocation
information. Normally produced by an assembler.

@item object file format
The format of an object file. Typically object files and executables
for a particular system are in the same format, although executables
will not contain any relocation information.

@item PE
The Portable Executable format. This is the object file format used for
Windows (specifically, Win32) object files. It is based closely on
COFF, but has a few significant differences.

@item PEI
The Portable Executable Image format. This is the object file format
used for Windows (specifically, Win32) executables. It is very similar
to PE, but includes some additional header information.

@item relocations
Information used by the linker to adjust section contents. Also called
relocs.

@item section
Object files and executables are composed of sections. Sections have
optional data and optional relocation information.

@item shared library
A library of functions which may be used by many executables without
actually being linked into each executable. There are several different
implementations of shared libraries, each having slightly different
features.

@item symbol
Each object file and executable may have a list of symbols, often
referred to as the symbol table. A symbol is basically a name and an
address.
There may also be some additional information like the type of
symbol, although the type of a symbol is normally something simple like
function or object, and should not be confused with the more complex C
notion of type. Typically every global function and variable in a C
program will have an associated symbol.

@item target vector
A set of functions which implement support for a particular object file
format. The @samp{bfd_target} structure.

@item Win32
The current Windows API, implemented by Windows 95 and later and Windows
NT 3.51 and later, but not by Windows 3.1.

@item XCOFF
The eXtended Common Object File Format. Used on AIX. A variant of
COFF, with a completely different symbol table implementation.

@item VMA
Virtual Memory Address. This is the address a section will have when
an executable is run. Compare with LMA, above.
@end table

@node Index
@unnumberedsec Index
@printindex cp

@contents
@bye