design.rst revision 280461
1.. _design: 2 3Linker Design 4============= 5 6Introduction 7------------ 8 9lld is a new generation of linker. It is not "section" based like traditional 10linkers which mostly just interlace sections from multiple object files into the 11output file. Instead, lld is based on "Atoms". Traditional section based 12linking work well for simple linking, but their model makes advanced linking 13features difficult to implement. Features like dead code stripping, reordering 14functions for locality, and C++ coalescing require the linker to work at a finer 15grain. 16 17An atom is an indivisible chunk of code or data. An atom has a set of 18attributes, such as: name, scope, content-type, alignment, etc. An atom also 19has a list of References. A Reference contains: a kind, an optional offset, an 20optional addend, and an optional target atom. 21 22The Atom model allows the linker to use standard graph theory models for linking 23data structures. Each atom is a node, and each Reference is an edge. The 24feature of dead code stripping is implemented by following edges to mark all 25live atoms, and then delete the non-live atoms. 26 27 28Atom Model 29---------- 30 31An atom is an indivisible chunk of code or data. Typically each user written 32function or global variable is an atom. In addition, the compiler may emit 33other atoms, such as for literal c-strings or floating point constants, or for 34runtime data structures like dwarf unwind info or pointers to initializers. 35 36A simple "hello world" object file would be modeled like this: 37 38.. image:: hello.png 39 40There are three atoms: main, a proxy for printf, and an anonymous atom 41containing the c-string literal "hello world". The Atom "main" has two 42references. One is the call site for the call to printf, and the other is a 43reference for the instruction that loads the address of the c-string literal. 44 45There are only four different types of atoms: 46 47 * DefinedAtom 48 95% of all atoms. This is a chunk of code or data 49 50 * UndefinedAtom 51 This is a place holder in object files for a reference to some atom 52 outside the translation unit.During core linking it is usually replaced 53 by (coalesced into) another Atom. 54 55 * SharedLibraryAtom 56 If a required symbol name turns out to be defined in a dynamic shared 57 library (and not some object file). A SharedLibraryAtom is the 58 placeholder Atom used to represent that fact. 59 60 It is similar to an UndefinedAtom, but it also tracks information 61 about the associated shared library. 62 63 * AbsoluteAtom 64 This is for embedded support where some stuff is implemented in ROM at 65 some fixed address. This atom has no content. It is just an address 66 that the Writer needs to fix up any references to point to. 67 68 69File Model 70---------- 71 72The linker views the input files as basically containers of Atoms and 73References, and just a few attributes of their own. The linker works with three 74kinds of files: object files, static libraries, and dynamic shared libraries. 75Each kind of file has reader object which presents the file in the model 76expected by the linker. 77 78Object File 79~~~~~~~~~~~ 80 81An object file is just a container of atoms. When linking an object file, a 82reader is instantiated which parses the object file and instantiates a set of 83atoms representing all content in the .o file. The linker adds all those atoms 84to a master graph. 85 86Static Library (Archive) 87~~~~~~~~~~~~~~~~~~~~~~~~ 88 89This is the traditional unix static archive which is just a collection of object 90files with a "table of contents". When linking with a static library, by default 91nothing is added to the master graph of atoms. Instead, if after merging all 92atoms from object files into a master graph, if any "undefined" atoms are left 93remaining in the master graph, the linker reads the table of contents for each 94static library to see if any have the needed definitions. If so, the set of 95atoms from the specified object file in the static library is added to the 96master graph of atoms. 97 98Dynamic Library (Shared Object) 99~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 100 101Dynamic libraries are different than object files and static libraries in that 102they don't directly add any content. Their purpose is to check at build time 103that the remaining undefined references can be resolved at runtime, and provide 104a list of dynamic libraries (SO_NEEDED) that will be needed at runtime. The way 105this is modeled in the linker is that a dynamic library contributes no atoms to 106the initial graph of atoms. Instead, (like static libraries) if there are 107"undefined" atoms in the master graph of all atoms, then each dynamic library is 108checked to see if exports the required symbol. If so, a "shared library" atom is 109instantiated by the by the reader which the linker uses to replace the 110"undefined" atom. 111 112Linking Steps 113------------- 114 115Through the use of abstract Atoms, the core of linking is architecture 116independent and file format independent. All command line parsing is factored 117out into a separate "options" abstraction which enables the linker to be driven 118with different command line sets. 119 120The overall steps in linking are: 121 122 #. Command line processing 123 124 #. Parsing input files 125 126 #. Resolving 127 128 #. Passes/Optimizations 129 130 #. Generate output file 131 132The Resolving and Passes steps are done purely on the master graph of atoms, so 133they have no notion of file formats such as mach-o or ELF. 134 135 136Input Files 137~~~~~~~~~~~ 138 139Existing developer tools using different file formats for object files. 140A goal of lld is to be file format independent. This is done 141through a plug-in model for reading object files. The lld::Reader is the base 142class for all object file readers. A Reader follows the factory method pattern. 143A Reader instantiates an lld::File object (which is a graph of Atoms) from a 144given object file (on disk or in-memory). 145 146Every Reader subclass defines its own "options" class (for instance the mach-o 147Reader defines the class ReaderOptionsMachO). This options class is the 148one-and-only way to control how the Reader operates when parsing an input file 149into an Atom graph. For instance, you may want the Reader to only accept 150certain architectures. The options class can be instantiated from command 151line options, or it can be subclassed and the ivars programmatically set. 152 153ELF Section Groups 154~~~~~~~~~~~~~~~~~~ 155Reference : `ELF Section Groups <http://mentorembedded.github.io/cxx-abi/abi/prop-72-comdat.html>`_ 156 157C++ has many situations where the compiler may need to emit code or data, 158but may not be able to identify a unique compilation unit where it should be 159emitted. The approach chosen by the C++ ABI group to deal with this problem, is 160to allow the compiler to emit the required information in multiple compilation 161units, in a form which allows the linker to remove all but one copy. This is 162essentially the feature called COMDAT in several existing implementations. 163 164The COMDAT sections in ELF are modeled by using '.group' sections in the input 165files. Each '.group' section is associated with a signature. The '.group' 166section has a list of members that are part of the the '.group' which the linker 167selects to appear in the input file(Whichever .group section appeared first 168in the link). References to any of the '.group' members can also appear from 169outside the '.group'. 170 171In lld the the '.group' sections with COMDAT are identified by contentType( 172typeGroupComdat). The '.group' members are identified by using 173**kindGroupChild** references. 174 175The point to be noted here is the 'group child' members would need to be emitted 176in the output file **iff** the group was selected by the resolver. 177 178This is modeled in lld by removing the 'group child' members from the 179definedAtom List. 180 181Any reference to the group-child from **outside the group** is referenced using 182a 'undefined' atom. 183 184Resolving 185~~~~~~~~~ 186 187The resolving step takes all the atoms' graphs from each object file and 188combines them into one master object graph. Unfortunately, it is not as simple 189as appending the atom list from each file into one big list. There are many 190cases where atoms need to be coalesced. That is, two or more atoms need to be 191coalesced into one atom. This is necessary to support: C language "tentative 192definitions", C++ weak symbols for templates and inlines defined in headers, 193replacing undefined atoms with actual definition atoms, and for merging copies 194of constants like c-strings and floating point constants. 195 196The linker support coalescing by-name and by-content. By-name is used for 197tentative definitions and weak symbols. By-content is used for constant data 198that can be merged. 199 200The resolving process maintains some global linking "state", including a "symbol 201table" which is a map from llvm::StringRef to lld::Atom*. With these data 202structures, the linker iterates all atoms in all input files. For each atom, it 203checks if the atom is named and has a global or hidden scope. If so, the atom 204is added to the symbol table map. If there already is a matching atom in that 205table, that means the current atom needs to be coalesced with the found atom, or 206it is a multiple definition error. 207 208When all initial input file atoms have been processed by the resolver, a scan is 209made to see if there are any undefined atoms in the graph. If there are, the 210linker scans all libraries (both static and dynamic) looking for definitions to 211replace the undefined atoms. It is an error if any undefined atoms are left 212remaining. 213 214Dead code stripping (if requested) is done at the end of resolving. The linker 215does a simple mark-and-sweep. It starts with "root" atoms (like "main" in a main 216executable) and follows each references and marks each Atom that it visits as 217"live". When done, all atoms not marked "live" are removed. 218 219The result of the Resolving phase is the creation of an lld::File object. The 220goal is that the lld::File model is **the** internal representation 221throughout the linker. The file readers parse (mach-o, ELF, COFF) into an 222lld::File. The file writers (mach-o, ELF, COFF) taken an lld::File and produce 223their file kind, and every Pass only operates on an lld::File. This is not only 224a simpler, consistent model, but it enables the state of the linker to be dumped 225at any point in the link for testing purposes. 226 227 228Passes 229~~~~~~ 230 231The Passes step is an open ended set of routines that each get a change to 232modify or enhance the current lld::File object. Some example Passes are: 233 234 * stub (PLT) generation 235 236 * GOT instantiation 237 238 * order_file optimization 239 240 * branch island generation 241 242 * branch shim generation 243 244 * Objective-C optimizations (Darwin specific) 245 246 * TLV instantiation (Darwin specific) 247 248 * DTrace probe processing (Darwin specific) 249 250 * compact unwind encoding (Darwin specific) 251 252 253Some of these passes are specific to Darwin's runtime environments. But many of 254the passes are applicable to any OS (such as generating branch island for out of 255range branch instructions). 256 257The general structure of a pass is to iterate through the atoms in the current 258lld::File object, inspecting each atom and doing something. For instance, the 259stub pass, looks for call sites to shared library atoms (e.g. call to printf). 260It then instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for 261each proxy atom needed, and these new atoms are added to the current lld::File 262object. Next, all the noted call sites to shared library atoms have their 263References altered to point to the stub atom instead of the shared library atom. 264 265 266Generate Output File 267~~~~~~~~~~~~~~~~~~~~ 268 269Once the passes are done, the output file writer is given current lld::File 270object. The writer's job is to create the executable content file wrapper and 271place the content of the atoms into it. 272 273lld uses a plug-in model for writing output files. All concrete writers (e.g. 274ELF, mach-o, etc) are subclasses of the lld::Writer class. 275 276Unlike the Reader class which has just one method to instantiate an lld::File, 277the Writer class has multiple methods. The crucial method is to generate the 278output file, but there are also methods which allow the Writer to contribute 279Atoms to the resolver and specify passes to run. 280 281An example of contributing 282atoms is that if the Writer knows a main executable is being linked and such 283an executable requires a specially named entry point (e.g. "_main"), the Writer 284can add an UndefinedAtom with that special name to the resolver. This will 285cause the resolver to issue an error if that symbol is not defined. 286 287Sometimes a Writer supports lazily created symbols, such as names for the start 288of sections. To support this, the Writer can create a File object which vends 289no initial atoms, but does lazily supply atoms by name as needed. 290 291Every Writer subclass defines its own "options" class (for instance the mach-o 292Writer defines the class WriterOptionsMachO). This options class is the 293one-and-only way to control how the Writer operates when producing an output 294file from an Atom graph. For instance, you may want the Writer to optimize 295the output for certain OS versions, or strip local symbols, etc. The options 296class can be instantiated from command line options, or it can be subclassed 297and the ivars programmatically set. 298 299 300lld::File representations 301------------------------- 302 303Just as LLVM has three representations of its IR model, lld has three 304representations of its File/Atom/Reference model: 305 306 * In memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File). 307 308 * textual (in YAML) 309 310 * binary format ("native") 311 312Binary File Format 313~~~~~~~~~~~~~~~~~~ 314 315In theory, lld::File objects could be written to disk in an existing Object File 316format standard (e.g. ELF). Instead we choose to define a new binary file 317format. There are two main reasons for this: fidelity and performance. In order 318for lld to work as a linker on all platforms, its internal model must be rich 319enough to model all CPU and OS linking features. But if we choose an existing 320Object File format as the lld binary format, that means an on going need to 321retrofit each platform specific feature needed from alternate platforms into the 322existing Object File format. Having our own "native" binary format side steps 323that issue. We still need to be able to binary encode all the features, but 324once the in-memory model can represent the feature, it is straight forward to 325binary encode it. 326 327The reason to use a binary file format at all, instead of a textual file format, 328is speed. You want the binary format to be as fast as possible to read into the 329in-memory model. Given that we control the in-memory model and the binary 330format, the obvious way to make reading super fast it to make the file format be 331basically just an array of atoms. The reader just mmaps in the file and looks 332at the header to see how many atoms there are and instantiate that many atom 333objects with the atom attribute information coming from that array. The trick 334is designing this in a way that can be extended as the Atom mode evolves and new 335attributes are added. 336 337The native object file format starts with a header that lists how many "chunks" 338are in the file. A chunk is an array of "ivar data". The native file reader 339instantiates an array of Atom objects (with one large malloc call). Each atom 340contains just a pointer to its vtable and a pointer to its ivar data. All 341methods on lld::Atom are virtual, so all the method implementations return 342values based on the ivar data to which it has a pointer. If a new linking 343features is added which requires a change to the lld::Atom model, a new native 344reader class (e.g. version 2) is defined which knows how to read the new feature 345information from the new ivar data. The old reader class (e.g. version 1) is 346updated to do its best to model (the lack of the new feature) given the old ivar 347data in existing native object files. 348 349With this model for the native file format, files can be read and turned 350into the in-memory graph of lld::Atoms with just a few memory allocations. 351And the format can easily adapt over time to new features. 352 353The binary file format follows the ReaderWriter patterns used in lld. The lld 354library comes with the classes: ReaderNative and WriterNative. So, switching 355between file formats is as easy as switching which Reader subclass is used. 356 357 358Textual representations in YAML 359~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 360 361In designing a textual format we want something easy for humans to read and easy 362for the linker to parse. Since an atom has lots of attributes most of which are 363usually just the default, we should define default values for every attribute so 364that those can be omitted from the text representation. Here is the atoms for a 365simple hello world program expressed in YAML:: 366 367 target-triple: x86_64-apple-darwin11 368 369 atoms: 370 - name: _main 371 scope: global 372 type: code 373 content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00, 374 00, 00, 31, c0, 5d, c3 ] 375 fixups: 376 - offset: 07 377 kind: pcrel32 378 target: 2 379 - offset: 0E 380 kind: call32 381 target: _fprintf 382 383 - type: c-string 384 content: [ 73, 5A, 00 ] 385 386 ... 387 388The biggest use for the textual format will be writing test cases. Writing test 389cases in C is problematic because the compiler may vary its output over time for 390its own optimization reasons which my inadvertently disable or break the linker 391feature trying to be tested. By writing test cases in the linkers own textual 392format, we can exactly specify every attribute of every atom and thus target 393specific linker logic. 394 395The textual/YAML format follows the ReaderWriter patterns used in lld. The lld 396library comes with the classes: ReaderYAML and WriterYAML. 397 398 399Testing 400------- 401 402The lld project contains a test suite which is being built up as new code is 403added to lld. All new lld functionality should have a tests added to the test 404suite. The test suite is `lit <http://llvm.org/cmds/lit.html/>`_ driven. Each 405test is a text file with comments telling lit how to run the test and check the 406result To facilitate testing, the lld project builds a tool called lld-core. 407This tool reads a YAML file (default from stdin), parses it into one or more 408lld::File objects in memory and then feeds those lld::File objects to the 409resolver phase. The output of the resolver is written as a native object file. 410It is then read back in using the native object file reader and then pass to the 411YAML writer. This round-about path means that all three representations 412(in-memory, binary, and text) are exercised, and any new feature has to work in 413all the representations to pass the test. 414 415 416Resolver testing 417~~~~~~~~~~~~~~~~ 418 419Basic testing is the "core linking" or resolving phase. That is where the 420linker merges object files. All test cases are written in YAML. One feature of 421YAML is that it allows multiple "documents" to be encoding in one YAML stream. 422That means one text file can appear to the linker as multiple .o files - the 423normal case for the linker. 424 425Here is a simple example of a core linking test case. It checks that an 426undefined atom from one file will be replaced by a definition from another 427file:: 428 429 # RUN: lld-core %s | FileCheck %s 430 431 # 432 # Test that undefined atoms are replaced with defined atoms. 433 # 434 435 --- 436 atoms: 437 - name: foo 438 definition: undefined 439 --- 440 atoms: 441 - name: foo 442 scope: global 443 type: code 444 ... 445 446 # CHECK: name: foo 447 # CHECK: scope: global 448 # CHECK: type: code 449 # CHECK-NOT: name: foo 450 # CHECK: ... 451 452 453Passes testing 454~~~~~~~~~~~~~~ 455 456Since Passes just operate on an lld::File object, the lld-core tool has the 457option to run a particular pass (after resolving). Thus, you can write a YAML 458test case with carefully crafted input to exercise areas of a Pass and the check 459the resulting lld::File object as represented in YAML. 460 461 462Design Issues 463------------- 464 465There are a number of open issues in the design of lld. The plan is to wait and 466make these design decisions when we need to. 467 468 469Debug Info 470~~~~~~~~~~ 471 472Currently, the lld model says nothing about debug info. But the most popular 473debug format is DWARF and there is some impedance mismatch with the lld model 474and DWARF. In lld there are just Atoms and only Atoms that need to be in a 475special section at runtime have an associated section. Also, Atoms do not have 476addresses. The way DWARF is spec'ed different parts of DWARF are supposed to go 477into specially named sections and the DWARF references function code by address. 478 479CPU and OS specific functionality 480~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 481 482Currently, lld has an abstract "Platform" that deals with any CPU or OS specific 483differences in linking. We just keep adding virtual methods to the base 484Platform class as we find linking areas that might need customization. At some 485point we'll need to structure this better. 486 487 488File Attributes 489~~~~~~~~~~~~~~~ 490 491Currently, lld::File just has a path and a way to iterate its atoms. We will 492need to add more attributes on a File. For example, some equivalent to the 493target triple. There is also a number of cached or computed attributes that 494could make various Passes more efficient. For instance, on Darwin there are a 495number of Objective-C optimizations that can be done by a Pass. But it would 496improve the plain C case if the Objective-C optimization Pass did not have to 497scan all atoms looking for any Objective-C data structures. This could be done 498if the lld::File object had an attribute that said if the file had any 499Objective-C data in it. The Resolving phase would then be required to "merge" 500that attribute as object files are added. 501