README revision 109998
Notes: 2001-09-24
-----------------

This "description" (if one chooses to call it that) needed some major updating
so here goes. This update addresses a change being made at the same time to
OpenSSL, and it pretty much completely restructures the underlying mechanics of
the "ENGINE" code. So it serves a double purpose of being an "ENGINE internals
for masochists" document *and* a rather extensive commit log message. (I'd get
lynched for sticking all this in CHANGES or the commit mails :-).

ENGINE_TABLE underlies this restructuring, as described in the internal header
"eng_int.h", implemented in eng_table.c, and used in each of the "class" files;
tb_rsa.c, tb_dsa.c, etc.

However, "EVP_CIPHER" underlies the motivation and design of ENGINE_TABLE so
I'll mention a bit about that first. EVP_CIPHER (and most of this applies
equally to EVP_MD for digests) is both a "method" and an algorithm/mode
identifier that, in the current API, "lingers". These cipher description +
implementation structures can be defined or obtained directly by applications,
or can be loaded "en masse" into EVP storage so that they can be catalogued and
searched in various ways, ie. two ways of encrypting with the "des_cbc"
algorithm/mode pair are;

(i) directly;
     const EVP_CIPHER *cipher = EVP_des_cbc();
     EVP_EncryptInit(&ctx, cipher, key, iv);
     [ ... use EVP_EncryptUpdate() and EVP_EncryptFinal() ...]

(ii) indirectly;
     OpenSSL_add_all_ciphers();
     cipher = EVP_get_cipherbyname("des_cbc");
     EVP_EncryptInit(&ctx, cipher, key, iv);
     [ ... etc ... ]

The latter is more generally used because it also allows ciphers/digests to be
looked up based on other identifiers which can be useful for automatic cipher
selection, eg.
in SSL/TLS, or by user-controllable configuration.

The important point about this is that EVP_CIPHER definitions and structures are
passed around with impunity and there is no safe way, without requiring massive
rewrites of many applications, to assume that EVP_CIPHERs can be reference
counted. Once an EVP_CIPHER is exposed to the caller, neither it nor anything it
comes from can "safely" be destroyed. Unless of course the way of getting to
such ciphers is via entirely distinct API calls that didn't exist before.
However, existing API usage cannot be made to understand when an EVP_CIPHER
pointer, that has been passed to the caller, is no longer being used.

The other problem with the existing API w.r.t. hooking EVP_CIPHER support
into ENGINE is storage - the OBJ_NAME-based storage used by EVP to register
ciphers simultaneously registers cipher *types* and cipher *implementations* -
they are effectively the same thing, an "EVP_CIPHER" pointer. The problem with
hooking in ENGINEs is that multiple ENGINEs may implement the same ciphers. The
solution is necessarily that ENGINE-provided ciphers simply are not registered,
stored, or exposed to the caller in the same manner as existing ciphers. This is
especially necessary considering the fact ENGINE uses reference counts to allow
for cleanup, modularity, and DSO support - yet EVP_CIPHERs, as exposed to
callers in the current API, support no such controls.

Another sticking point for integrating cipher support into ENGINE is linkage.
Already there is a problem with the way ENGINE supports RSA, DSA, etc whereby
they are available *because* they're part of a giant ENGINE called "openssl".
Ie. all implementations *have* to come from an ENGINE, but we get round that by
having a giant ENGINE with all the software support encapsulated.
This creates
linker hassles if nothing else - linking a 1-line application that calls 2 basic
RSA functions (eg. "RSA_free(RSA_new());") will result in large quantities of
ENGINE code being linked in *and* because of that DSA, DH, and RAND also. If we
continue with this approach for EVP_CIPHER support (even if it *was* possible)
we would lose our ability to link selectively by selectively loading certain
implementations of certain functionality. Touching any part of any kind of
crypto would result in massive static linkage of everything else. So the
solution is to change the way ENGINE feeds existing "classes", ie. how the
hooking to ENGINE works from RSA, DSA, DH, RAND, as well as adding new hooking
for EVP_CIPHER, and EVP_MD.

The way this is now being done is by mostly reverting to how things used to
work prior to ENGINE :-). Ie. RSA now has an "RSA_METHOD" pointer again - this
was previously replaced by an "ENGINE" pointer and all RSA code that required
the RSA_METHOD would call ENGINE_get_RSA() each time on its ENGINE handle to
temporarily get and use the ENGINE's RSA implementation. Apart from being more
efficient, switching back to each RSA having an RSA_METHOD pointer also allows
us to conceivably operate with *no* ENGINE. As we'll see, this removes any need
for a fallback ENGINE that encapsulates default implementations - we can simply
have our RSA structure pointing its RSA_METHOD pointer to the software
implementation and have its ENGINE pointer set to NULL.

A look at the EVP_CIPHER hooking is most explanatory; the RSA, DSA (etc) cases
turn out to be degenerate forms of the same thing. The EVP storage of ciphers,
and the existing EVP API functions that return "software" implementations and
descriptions remain untouched.
However, the storage takes more meaning in terms
of "cipher description" and less meaning in terms of "implementation". When an
EVP_CIPHER_CTX is actually initialised with an EVP_CIPHER method and is about to
begin en/decryption, the hooking to ENGINE comes into play. What happens is that
cipher-specific ENGINE code is asked for an ENGINE pointer (a functional
reference) for any ENGINE that is registered to perform the algo/mode that the
provided EVP_CIPHER structure represents. Under normal circumstances, that
ENGINE code will return NULL because no ENGINEs will have had any cipher
implementations *registered*. As such, a NULL ENGINE pointer is stored in the
EVP_CIPHER_CTX context, and the EVP_CIPHER structure is left hooked into the
context and so is used as the implementation. Pretty much how things work now
except we'd have a redundant ENGINE pointer set to NULL and doing nothing.

Conversely, if an ENGINE *has* been registered to perform the algorithm/mode
combination represented by the provided EVP_CIPHER, then a functional reference
to that ENGINE will be returned to the EVP_CIPHER_CTX during initialisation.
That functional reference will be stored in the context (and released on
cleanup) - and having that reference provides a *safe* way to use an EVP_CIPHER
definition that is private to the ENGINE. Ie. the EVP_CIPHER provided by the
application will actually be replaced by an EVP_CIPHER from the registered
ENGINE - it will support the same algorithm/mode as the original but will be a
completely different implementation. Because this EVP_CIPHER isn't stored in the
EVP storage, nor is it returned to applications from traditional API functions,
there is no associated problem with it not having reference counts.
And of
course, when one of these "private" cipher implementations is hooked into
EVP_CIPHER_CTX, it is done whilst the EVP_CIPHER_CTX holds a functional
reference to the ENGINE that owns it, thus the use of the ENGINE's EVP_CIPHER is
safe.

The "cipher-specific ENGINE code" I mentioned is implemented in tb_cipher.c but
in essence it is simply an instantiation of "ENGINE_TABLE" code for use by
EVP_CIPHER code. tb_digest.c is virtually identical but, of course, it is for
use by EVP_MD code. Ditto for tb_rsa.c, tb_dsa.c, etc. These instantiations of
ENGINE_TABLE essentially provide linker-separation of the classes so that even
if ENGINEs implement *all* possible algorithms, an application using only
EVP_CIPHER code will link at most code relating to EVP_CIPHER, tb_cipher.c, core
ENGINE code that is independent of class, and of course the ENGINE
implementation that the application loaded. It will *not* however link any
class-specific ENGINE code for digests, RSA, etc nor will it bleed over into
other APIs, such as the RSA/DSA/etc library code.

ENGINE_TABLE is a little more complicated than may seem necessary but this is
mostly to avoid a lot of "init()"-thrashing on ENGINEs (that may have to load
DSOs, and other expensive setup that shouldn't be thrashed unnecessarily) *and*
to duplicate "default" behaviour. Basically an ENGINE_TABLE instantiation, for
example tb_cipher.c, implements a hash-table keyed by integer "nid" values.
These nids provide the uniqueness of an algorithm/mode - and each nid will hash
to a potentially NULL "ENGINE_PILE". An ENGINE_PILE is essentially a list of
pointers to ENGINEs that implement that particular 'nid'. Each "pile" uses some
caching tricks such that requests on that 'nid' will be cached and all future
requests will return immediately (well, at least with minimal operation) unless
a change is made to the pile, eg. perhaps an ENGINE was unloaded.
The reason is
that an application could have support for 10 ENGINEs statically linked
in, and the machine in question may not have any of the hardware those 10
ENGINEs support. If each of those ENGINEs has a "des_cbc" implementation, we
want to avoid having every EVP_CIPHER_CTX setup try (and fail) to initialise
each of those 10 ENGINEs. Instead, the first such request will try to do that
and will either return (and cache) a NULL ENGINE pointer or will return a
functional reference to the first that successfully initialised. In the latter
case it will also cache an extra functional reference to the ENGINE as a
"default" for that 'nid'. The caching is acknowledged by an 'uptodate' variable
that is unset only if un/registration takes place on that pile. Ie. if
implementations of "des_cbc" are added or removed. This behaviour can be
tweaked; the ENGINE_TABLE_FLAG_NOINIT value can be passed to
ENGINE_set_table_flags(), in which case the only ENGINEs that tb_cipher.c will
try to initialise from the "pile" will be those that are already initialised
(ie. it's simply an increment of the functional reference count, and no real
"initialisation" will take place).

RSA, DSA, DH, and RAND all have their own ENGINE_TABLE code as well, and the
difference is that they all use an implicit 'nid' of 1. Whereas EVP_CIPHERs are
actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is
not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are
necessarily interoperable and don't have different flavours, only different
implementations. In other words, the ENGINE_TABLE for RSA will either be empty,
or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile
represents ENGINEs that implement the single "type" of RSA there is.
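
For anyone who finds the table/pile/cache description easier to follow in code,
here is a rough standalone sketch of the idea. It is *not* the code in
eng_table.c; every name in it (toy_engine, toy_pile, toy_table, toy_find,
toy_register, toy_init, toy_select, toy_demo) is made up for illustration, the
"hash-table" is a small fixed array with linear probing, and unregistration and
the NOINIT flag are left out. It only tries to show the nid-to-pile lookup, the
cached "default" with its extra functional reference, and the 'uptodate' flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENGINES 10
#define TOY_BUCKETS 16

/* Hypothetical stand-in for an ENGINE (not the OpenSSL structure). */
typedef struct toy_engine {
    int hw_present;     /* whether "initialisation" can succeed */
    int funct_ref;      /* functional reference count */
    int init_attempts;  /* counts init calls, to show the thrashing avoided */
} toy_engine;

/* Hypothetical stand-in for an ENGINE_PILE: the ENGINEs registered for a nid. */
typedef struct toy_pile {
    int used, nid, count, uptodate;
    toy_engine *engines[TOY_MAX_ENGINES];
    toy_engine *cached; /* cached "default" for this nid, possibly NULL */
} toy_pile;

/* A zeroed table is the empty (initialised) state. */
typedef struct toy_table { toy_pile buckets[TOY_BUCKETS]; } toy_table;

/* Find the pile for a nid by linear probing; optionally create it. */
static toy_pile *toy_find(toy_table *t, int nid, int create)
{
    for (int i = 0; i < TOY_BUCKETS; i++) {
        toy_pile *p = &t->buckets[((unsigned)nid + (unsigned)i) % TOY_BUCKETS];
        if (p->used && p->nid == nid)
            return p;
        if (!p->used) {
            if (create) {
                p->used = 1;
                p->nid = nid;
            }
            return create ? p : NULL;
        }
    }
    return NULL; /* table full */
}

/* Registration adds an ENGINE to the pile and invalidates its cache. */
static int toy_register(toy_table *t, int nid, toy_engine *e)
{
    toy_pile *p = toy_find(t, nid, 1);
    if (p == NULL || p->count == TOY_MAX_ENGINES)
        return 0;
    p->engines[p->count++] = e;
    p->uptodate = 0;
    return 1;
}

/* Try to initialise; success yields one functional reference. */
static int toy_init(toy_engine *e)
{
    e->init_attempts++;
    if (!e->hw_present)
        return 0;
    e->funct_ref++;
    return 1;
}

/* Return a functional reference for the nid, or NULL to mean "use software". */
static toy_engine *toy_select(toy_table *t, int nid)
{
    toy_pile *p = toy_find(t, nid, 0);
    if (p == NULL)
        return NULL;
    if (!p->uptodate) {
        if (p->cached != NULL)
            p->cached->funct_ref--;  /* drop the stale cached reference */
        p->cached = NULL;
        for (int i = 0; i < p->count && p->cached == NULL; i++)
            if (toy_init(p->engines[i]))
                p->cached = p->engines[i];  /* the pile keeps this reference */
        p->uptodate = 1;  /* a NULL result is cached too */
    }
    if (p->cached != NULL)
        p->cached->funct_ref++;  /* extra reference handed to the caller */
    return p->cached;
}

/* Two ENGINEs claim the same nid; only the second has hardware behind it. */
static int toy_demo(void)
{
    toy_table ciphers;
    toy_engine absent = { 0, 0, 0 };
    toy_engine present = { 1, 0, 0 };
    const int nid = 31;  /* illustrative nid value */

    memset(&ciphers, 0, sizeof(ciphers));
    assert(toy_select(&ciphers, nid) == NULL);  /* nothing registered yet */
    toy_register(&ciphers, nid, &absent);
    toy_register(&ciphers, nid, &present);
    for (int i = 0; i < 3; i++)
        assert(toy_select(&ciphers, nid) == &present);
    return absent.init_attempts;  /* the failed init is not retried */
}
```

In this sketch the RSA-style case is just the same table used with the single
nid 1, so it holds at most one pile. Calling toy_demo() shows the point of the
cache: the ENGINE without hardware has its initialisation attempted once, on the
first request, and later requests only bump the reference count of the cached
default.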

Cleanup - the registration and unregistration may pose questions about how
cleanup works with the ENGINE_PILE doing all this caching nonsense (ie. when the
application or EVP_CIPHER code releases its last reference to an ENGINE, the
ENGINE_PILE code may still have references and thus those ENGINEs will stay
hooked in forever). The way this is handled is via "unregistration". With these
new ENGINE changes, an abstract ENGINE can be loaded and initialised, but that
is an algorithm-agnostic process. Even if initialised, it will not have
registered any of its implementations (to do so would link all class "table"
code despite the fact the application may use only ciphers, for example). This
is deliberately a distinct step. Moreover, registration and unregistration have
nothing to do with whether an ENGINE is *functional* or not (ie. you can even
register an ENGINE and its implementations without it being operational, you may
not even have the drivers to make it operate). What actually happens with
respect to cleanup is managed inside eng_lib.c with the "engine_cleanup_***"
functions. These functions are internal-only and each part of ENGINE code that
could require cleanup will, upon performing its first allocation, register a
callback with the "engine_cleanup" code. The other part of this that makes it
tick is that the ENGINE_TABLE instantiations (tb_***.c) use NULL as their
initialised state. So if RSA code asks for an ENGINE and no ENGINE has
registered an implementation, the code will simply return NULL and the tb_rsa.c
state will be unchanged. Thus, no cleanup is required unless registration takes
place. ENGINE_cleanup() will simply iterate across a list of registered cleanup
callbacks calling each in turn, and will then internally delete its own storage
(a STACK). When a cleanup callback is next registered (eg.
if the cleanup() is
part of a graceful restart and the application wants to clean up all state then
start again), the internal STACK storage will be freshly allocated. This is much
the same as the situation in the ENGINE_TABLE instantiations ... NULL is the
initialised state, so only modification operations (not queries) will cause that
code to have to register a cleanup.

What else? The bignum callbacks and associated ENGINE functions have been
removed for two obvious reasons; (i) there was no way to generalise them to the
mechanism now used by RSA/DSA/..., because there's no such thing as a BIGNUM
method, and (ii) because of (i), there was no meaningful way for library or
application code to automatically hook and use ENGINE supplied bignum functions
anyway. Also, ENGINE_cpy() has been removed (although an internal-only version
exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good
one and now certainly doesn't make sense in any generalised way. Some of the
RSA, DSA, DH, and RAND functions that were fiddled with during the original
ENGINE changes have now, as a consequence, been reverted. This is because the
hooking of ENGINE is now automatic (and passive, it can internally use a NULL
ENGINE pointer to simply ignore ENGINE from then on).

Hell, that should be enough for now ... comments welcome: geoff@openssl.org