.. _exception_handling:

==========================
Exception Handling in LLVM
==========================

.. contents::
   :local:

Introduction
============

This document is the central repository for all information pertaining to
exception handling in LLVM. It describes the format that LLVM exception
handling information takes, which is useful for those interested in creating
front-ends or dealing directly with the information. Further, this document
provides specific examples of what exception handling information is used for in
C and C++.

Itanium ABI Zero-cost Exception Handling
----------------------------------------

Exception handling for most programming languages is designed to recover from
conditions that rarely occur during general use of an application. To that end,
exception handling should not interfere with the main flow of an application's
algorithm by performing checkpointing tasks, such as saving the current PC or
register state.

The Itanium ABI Exception Handling Specification defines a methodology for
providing outlying data in the form of exception tables without inlining
speculative exception handling code in the flow of an application's main
algorithm. Thus, the specification is said to add "zero-cost" to the normal
execution of an application.

A more complete description of the Itanium ABI exception handling runtime
support can be found at `Itanium C++ ABI: Exception Handling
<http://www.codesourcery.com/cxx-abi/abi-eh.html>`_. A description of the
exception frame format can be found at `Exception Frames
<http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_,
with details of the DWARF 4 specification at `DWARF 4 Standard
<http://dwarfstd.org/Dwarf4Std.php>`_. A description of the C++ exception
table formats can be found at `Exception Handling Tables
<http://www.codesourcery.com/cxx-abi/exceptions.pdf>`_.

Setjmp/Longjmp Exception Handling
---------------------------------

Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics
`llvm.eh.sjlj.setjmp`_ and `llvm.eh.sjlj.longjmp`_ to handle control flow for
exception handling.

Each function that does exception processing, whether through
``try``/``catch`` blocks or cleanups, registers itself on a global frame
list. When an exception is unwinding, the runtime uses this list to identify
which functions need processing.

Landing pad selection is encoded in the call site entry of the function
context. The runtime returns to the function via `llvm.eh.sjlj.longjmp`_, where
a switch table transfers control to the appropriate landing pad based on the
index stored in the function context.

In contrast to DWARF exception handling, which encodes exception regions and
frame information in out-of-line tables, SJLJ exception handling builds and
removes the unwind frame context at runtime. This results in faster exception
handling at the expense of slower execution when no exceptions are thrown. As
exceptions are, by their nature, intended for uncommon code paths, DWARF
exception handling is generally preferred to SJLJ.

Overview
--------

When an exception is thrown in LLVM code, the runtime does its best to find a
handler suited to processing the circumstance.

The runtime first attempts to find an *exception frame* corresponding to the
function where the exception was thrown. If the programming language supports
exception handling (e.g. C++), the exception frame contains a reference to an
exception table describing how to process the exception. If the language does
not support exception handling (e.g. C), or if the exception needs to be
forwarded to a prior activation, the exception frame contains information about
how to unwind the current activation and restore the state of the prior
activation.
This process is repeated until the exception is handled. If the
exception is not handled and no activations remain, then the application is
terminated with an appropriate error message.

Because different programming languages have different behaviors when handling
exceptions, the exception handling ABI provides a mechanism for
supplying *personalities*. An exception handling personality is defined by
way of a *personality function* (e.g. ``__gxx_personality_v0`` in C++),
which receives the context of the exception, an *exception structure*
containing the exception object type and value, and a reference to the exception
table for the current function. The personality function for the current
compile unit is specified in a *common exception frame*.

The organization of an exception table is language dependent. For C++, an
exception table is organized as a series of code ranges defining what to do if
an exception occurs in that range. Typically, the information associated with a
range defines which types of exception objects (using C++ *type info*) are
handled in that range, and an associated action that should take place. Actions
typically pass control to a *landing pad*.

A landing pad corresponds roughly to the code found in the ``catch`` portion of
a ``try``/``catch`` sequence. When execution resumes at a landing pad, it
receives an *exception structure* and a *selector value* corresponding to the
*type* of exception thrown. The selector is then used to determine which *catch*
should actually process the exception.

LLVM Code Generation
====================

From a C++ developer's perspective, exceptions are defined in terms of the
``throw`` and ``try``/``catch`` statements. In this section we will describe the
implementation of LLVM exception handling in terms of C++ examples.
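
As a point of reference for the sections that follow, here is a minimal C++
sketch containing each construct they discuss: a ``throw``, a ``try``/``catch``
with two handlers, and a local object whose destructor acts as a cleanup. The
names ``Guard``, ``checked_divide``, ``describe``, and ``cleanups_run`` are
hypothetical and chosen only for illustration. The comments describe the
lowering in the general terms used by this document, not the exact IR that any
particular compiler version emits.

.. code-block:: c++

  #include <stdexcept>
  #include <string>

  // Counts how many Guard destructors have run (hypothetical demo helper).
  static int cleanups_run = 0;

  // A type with a non-trivial destructor. Destroying it while a frame is
  // being unwound is the kind of work a cleanup performs.
  struct Guard {
    ~Guard() { ++cleanups_run; }
  };

  // Throw site: conceptually, the exception structure is allocated and the
  // runtime is then asked to raise it.
  int checked_divide(int num, int den) {
    Guard g;  // destroyed on normal return and when a throw unwinds this frame
    if (den == 0)
      throw std::invalid_argument("division by zero");
    if (num < 0)
      throw std::out_of_range("negative numerator");
    return num / den;
  }

  // Try/catch site: the call below may unwind, so it needs a second
  // continuation point (a landing pad) that selects one of the two handlers.
  std::string describe(int num, int den) {
    try {
      return std::to_string(checked_divide(num, den));
    } catch (const std::invalid_argument &e) {
      return std::string("invalid_argument: ") + e.what();
    } catch (const std::out_of_range &e) {
      return std::string("out_of_range: ") + e.what();
    }
  }

In this sketch, ``describe(1, 0)`` reaches the first handler and
``describe(-1, 2)`` the second. In both cases the destructor of ``Guard`` runs
while the frame of ``checked_divide`` is unwound.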

Throw
-----

Languages that support exception handling typically provide a ``throw``
operation to initiate the exception process. Internally, a ``throw`` operation
breaks down into two steps.

#. A request is made to allocate exception space for an exception structure.
   This structure needs to survive beyond the current activation. This structure
   will contain the type and value of the object being thrown.

#. A call is made to the runtime to raise the exception, passing the exception
   structure as an argument.

In C++, the allocation of the exception structure is done by the
``__cxa_allocate_exception`` runtime function. The exception raising is handled
by ``__cxa_throw``. The type of the exception is represented using a C++ RTTI
structure.

Try/Catch
---------

A call within the scope of a *try* statement can potentially raise an
exception. In those circumstances, the LLVM C++ front-end replaces the call with
an ``invoke`` instruction. Unlike a call, the ``invoke`` has two potential
continuation points:

#. where to continue when the call succeeds as per normal, and

#. where to continue if the call raises an exception, either by a throw or the
   unwinding of a throw

The place where an ``invoke`` continues after an exception is called a
*landing pad*. LLVM landing pads are conceptually
alternative function entry points where an exception structure reference and a
type info index are passed in as arguments. The landing pad saves the exception
structure reference and then proceeds to select the catch block that corresponds
to the type info of the exception object.

The LLVM `landingpad instruction <LangRef.html#i_landingpad>`_ is used to convey
information about the landing pad to the back end.
For C++, the ``landingpad``
instruction returns a pointer and integer pair corresponding to the pointer to
the *exception structure* and the *selector value* respectively.

The ``landingpad`` instruction takes a reference to the personality function to
be used for this ``try``/``catch`` sequence. The remainder of the instruction is
a list of *cleanup*, *catch*, and *filter* clauses. The exception is tested
against the clauses sequentially from first to last. The selector value is a
positive number if the exception matched a type info, a negative number if it
matched a filter, and zero if it matched a cleanup. If nothing is matched, the
behavior of the program is `undefined`_. If a type info matched, then the
selector value is the index of the type info in the exception table, which can
be obtained using the `llvm.eh.typeid.for`_ intrinsic.

Once the landing pad has the type info selector, the code branches to the code
for the first catch. The catch then checks the value of the type info selector
against the index of type info for that catch. Since the type info index is not
known until all the type infos have been gathered in the backend, the catch code
must call the `llvm.eh.typeid.for`_ intrinsic to determine the index for a given
type info. If the catch fails to match the selector then control is passed on to
the next catch.

Finally, the entry and exit of catch code are bracketed with calls to
``__cxa_begin_catch`` and ``__cxa_end_catch``.

* ``__cxa_begin_catch`` takes an exception structure reference as an argument
  and returns the value of the exception object.

* ``__cxa_end_catch`` takes no arguments. This function:

  #. Locates the most recently caught exception and decrements its handler
     count,

  #. Removes the exception from the *caught* stack if the handler count goes to
     zero, and

  #. Destroys the exception if the handler count goes to zero and the exception
     was not re-thrown by throw.

  .. note::

    A rethrow from within the catch may replace this call with a
    ``__cxa_rethrow``.

Cleanups
--------

A cleanup is extra code which needs to be run as part of unwinding a scope. C++
destructors are a typical example, but other languages and language extensions
provide a variety of different kinds of cleanups. In general, a landing pad may
need to run arbitrary amounts of cleanup code before actually entering a catch
block. To indicate the presence of cleanups, a `landingpad
instruction <LangRef.html#i_landingpad>`_ should have a *cleanup*
clause. Otherwise, the unwinder will not stop at the landing pad if there are no
catches or filters that require it to.

.. note::

  Do not allow a new exception to propagate out of the execution of a
  cleanup. This can corrupt the internal state of the unwinder. Different
  languages describe different high-level semantics for these situations: for
  example, C++ requires that the process be terminated, whereas Ada cancels both
  exceptions and throws a third.

When all cleanups are finished, if the exception is not handled by the current
function, resume unwinding by executing the `resume
instruction <LangRef.html#i_resume>`_, passing in the result of the
``landingpad`` instruction for the original landing pad.

Throw Filters
-------------

C++ allows the specification of which exception types may be thrown from a
function. To represent this, a top level landing pad may exist to filter out
invalid types. To express this in LLVM code the `landingpad
instruction <LangRef.html#i_landingpad>`_ will have a filter clause. The clause
consists of an array of type infos. ``landingpad`` will return a negative value
if the exception does not match any of the type infos.
If no match is found, then
a call to ``__cxa_call_unexpected`` should be made; otherwise,
``_Unwind_Resume`` should be called. Each of these functions requires a
reference to the
exception structure. Note that the most general form of a ``landingpad``
instruction can have any number of catch, cleanup, and filter clauses (though
having more than one cleanup is pointless). The LLVM C++ front-end can generate
such ``landingpad`` instructions due to inlining creating nested exception
handling scopes.

.. _undefined:

Restrictions
------------

The unwinder delegates the decision of whether to stop in a call frame to that
call frame's language-specific personality function. Not all unwinders guarantee
that they will stop to perform cleanups. For example, the GNU C++ unwinder
doesn't do so unless the exception is actually caught somewhere further up the
stack.

In order for inlining to behave correctly, landing pads must be prepared to
handle selector results that they did not originally advertise. Suppose that a
function catches exceptions of type ``A``, and it's inlined into a function that
catches exceptions of type ``B``. The inliner will update the ``landingpad``
instruction for the inlined landing pad to include the fact that ``B`` is also
caught. If that landing pad assumes that it will only be entered to catch an
``A``, it's in for a rude awakening. Consequently, landing pads must test for
the selector results they understand and then resume exception propagation with
the `resume instruction <LangRef.html#i_resume>`_ if none of the conditions
match.

Exception Handling Intrinsics
=============================

In addition to the ``landingpad`` and ``resume`` instructions, LLVM uses several
intrinsic functions (names prefixed with ``llvm.eh``) to provide exception
handling information at various points in generated code.

.. _llvm.eh.typeid.for:

llvm.eh.typeid.for
------------------

.. code-block:: llvm

  i32 @llvm.eh.typeid.for(i8* %type_info)


This intrinsic returns the type info index in the exception table of the current
function. This value can be used to compare against the result of the
``landingpad`` instruction. The single argument is a reference to a type info.

.. _llvm.eh.sjlj.setjmp:

llvm.eh.sjlj.setjmp
-------------------

.. code-block:: llvm

  i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf)

For SJLJ based exception handling, this intrinsic forces register saving for the
current function and stores the address of the following instruction for use as
a destination address by `llvm.eh.sjlj.longjmp`_. The buffer format and the
overall functioning of this intrinsic are compatible with the GCC
``__builtin_setjmp`` implementation, allowing code built with Clang and GCC
to interoperate.

The single parameter is a pointer to a five word buffer in which the calling
context is saved. The front end places the frame pointer in the first word, and
the target implementation of this intrinsic should place the destination address
for a `llvm.eh.sjlj.longjmp`_ in the second word. The following three words are
available for use in a target-specific manner.

.. _llvm.eh.sjlj.longjmp:

llvm.eh.sjlj.longjmp
--------------------

.. code-block:: llvm

  void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf)

For SJLJ based exception handling, the ``llvm.eh.sjlj.longjmp`` intrinsic is
used to implement ``__builtin_longjmp()``. The single parameter is a pointer to
a buffer populated by `llvm.eh.sjlj.setjmp`_. The frame pointer and stack
pointer are restored from the buffer, then control is transferred to the
destination address.

llvm.eh.sjlj.lsda
-----------------

.. code-block:: llvm

  i8* @llvm.eh.sjlj.lsda()

For SJLJ based exception handling, the ``llvm.eh.sjlj.lsda`` intrinsic returns
the address of the Language Specific Data Area (LSDA) for the current
function. The SJLJ front-end code stores this address in the exception handling
function context for use by the runtime.

llvm.eh.sjlj.callsite
---------------------

.. code-block:: llvm

  void @llvm.eh.sjlj.callsite(i32 %call_site_num)

For SJLJ based exception handling, the ``llvm.eh.sjlj.callsite`` intrinsic
identifies the callsite value associated with the following ``invoke``
instruction. This is used to ensure that landing pad entries in the LSDA are
generated in matching order.

Asm Table Formats
=================

There are two tables that are used by the exception handling runtime to
determine which actions should be taken when an exception is thrown.

Exception Handling Frame
------------------------

An exception handling frame ``eh_frame`` is very similar to the unwind frame
used by DWARF debug info. The frame contains all the information necessary to
tear down the current frame and restore the state of the prior frame. There is
an exception handling frame for each function in a compile unit, plus a common
exception handling frame that defines information common to all functions in the
unit.

Exception Tables
----------------

An exception table contains information about what actions to take when an
exception is thrown in a particular part of a function's code. There is one
exception table per function, except for leaf functions and functions that only
call non-throwing functions, which do not need an exception table.