
OpenSSL ASN1 Revision
=====================

This document describes some of the issues relating to the new ASN1 code.

Previous OpenSSL ASN1 problems
==============================

OK, why did the OpenSSL ASN1 code need revising in the first place? Well,
there are lots of reasons, some of which are included below...

1. The code is difficult to read and write. For every single ASN1 structure
(e.g. SEQUENCE) four functions need to be written for new, free, encode and
decode operations. This is a very painful and error prone operation. Very few
people have ever written any OpenSSL ASN1 and those that have usually wish
they hadn't.

2. Partly because of 1. the code is bloated and takes up a disproportionate
amount of space. The SEQUENCE encoder is particularly bad: it essentially
contains two copies of the same operation, one to compute the SEQUENCE length
and the other to encode it.

3. The code is memory based: that is, it expects to be able to read the whole
structure from memory. This is fine for small structures but if you have a
(say) 1GB PKCS#7 signedData structure it isn't such a good idea...

4. The code for the ASN1 IMPLICIT tag is evil. It is handled by temporarily
changing the tag to the expected one, attempting to read it, then changing it
back again. This means that decode buffers have to be writable even though they
are ultimately unchanged. This gets in the way of constification.

5. The handling of EXPLICIT isn't much better. It adds a chunk of code into
the decoder and encoder for every EXPLICIT tag.

6. APPLICATION and PRIVATE tags aren't supported at all.

7. Even IMPLICIT isn't complete: there is no support for implicitly tagged
types that are not OPTIONAL.

8. Much of the code assumes that a tag will fit in a single octet. This is
only true if the tag is 30 or less (mercifully tags over 30 are rare).

9. The ASN1 CHOICE type has to be largely handled manually: there aren't any
macros that properly support it.

10. Encoders have no concept of OPTIONAL and have no error checking. If the
passed structure contains a NULL in a mandatory field it will not be encoded,
resulting in an invalid structure.

11. It is tricky to add ASN1 encoders and decoders to external applications.

Template model
==============

One of the major problems with revision is the sheer volume of the ASN1 code.
Attempts to change (for example) the IMPLICIT behaviour would result in a
modification of *every* single decode function.

I decided to adopt a template based approach. I'm using the term 'template'
in a manner similar to SNACC templates: it has nothing to do with C++
templates.

A template is a description of an ASN1 module as several constant C structures.
It describes in a machine readable way exactly how the ASN1 structure should
behave. If this template contains enough detail then it is possible to write
versions of new, free, encode, decode (and possibly other operations) that
operate on templates.

Instead of having to write code to handle each operation only a single
template needs to be written. If new operations are needed (such as a 'print'
operation) only a single new template based function needs to be written
which will then automatically handle all existing templates.

Plans for revision
==================

The revision will consist of the following steps. Other than the first two
these can be handled in any order.

o Design and write template new, free, encode and decode operations, initially
memory based. *DONE*

o Convert existing ASN1 code to template form. *IN PROGRESS*

o Convert an existing ASN1 compiler (probably SNACC) to output templates
in OpenSSL form.

o Add support for BIO based ASN1 encoders and decoders to handle large
structures, initially blocking I/O.

o Add support for non blocking I/O: this is quite a bit harder than blocking
I/O.

o Add new ASN1 structures, such as OCSP, CRMF, S/MIME v3 (CMS), attribute
certificates etc etc.

Description of major changes
============================

The BOOLEAN type now takes three values. 0xff is TRUE, 0 is FALSE and -1 is
absent. The meaning of absent depends on the context. If for example the
boolean type is DEFAULT FALSE (as in the case of the critical flag for
certificate extensions) then -1 is FALSE; if DEFAULT TRUE then -1 is TRUE.
Usually the value will only ever be read via an API which will hide this from
an application.

There is an evil bug in the old ASN1 code that mishandles OPTIONAL with
SEQUENCE OF or SET OF. These are both implemented as a STACK structure. The
old code would omit the structure if the STACK was NULL (which is fine) or if
it had zero elements (which is NOT OK). This causes problems because an empty
SEQUENCE OF or SET OF will result in an empty STACK when it is decoded but when
it is encoded it will be omitted, resulting in different encodings. The new code
only omits the encoding if the STACK is NULL; if it contains zero elements it
is encoded as empty. There is an additional problem though: because an empty
STACK was omitted, sometimes the corresponding *_new() function would
initialize the STACK to empty so an application could immediately use it. If
an application does this with the new code, where the field starts out as
NULL, it won't work. Therefore a new STACK should be allocated first. One
instance of this is the X509_CRL list of revoked certificates: a helper
function X509_CRL_add0_revoked() has been added for this purpose.

The X509_ATTRIBUTE structure used to have an element called 'set' which took
the value 1 if the attribute value was a SET OF or 0 if it was a single value.
Due to the behaviour of CHOICE in the new code this has been changed to a field
called 'single' which is 0 for a SET OF and 1 for a single value. The old field
has been deleted to deliberately break source compatibility. Since this
structure is normally accessed via higher level functions this shouldn't break
too much.

The X509_REQ_INFO certificate request info structure no longer has a field
called 'req_kludge'. This used to be set to 1 if the attributes field was
(incorrectly) omitted. You can check to see if the field is omitted now by
checking if the attributes field is NULL. Similarly if you need to omit
the field then free attributes and set it to NULL.

The top level 'detached' field in the PKCS7 structure is no longer set when
a PKCS#7 structure is read in. PKCS7_is_detached() should be called instead.
The behaviour of PKCS7_get_detached() is unaffected.

The values of 'type' in the GENERAL_NAME structure have changed. This is
because the old code used the ASN1 initial octet as the selector. The new
code uses the index in the ASN1_CHOICE template.

The DIST_POINT_NAME structure has changed to be a true CHOICE type.

typedef struct DIST_POINT_NAME_st {
int type;
union {
        STACK_OF(GENERAL_NAME) *fullname;
        STACK_OF(X509_NAME_ENTRY) *relativename;
} name;
} DIST_POINT_NAME;

This means that name.fullname or name.relativename should be set
and type reflects the option. That is, if name.fullname is set then
type is 0 and if name.relativename is set type is 1.

With the old code using the i2d functions would typically involve:

unsigned char *buf, *p;
int len;
/* Find length of encoding */
len = i2d_SOMETHING(x, NULL);
/* Allocate buffer */
buf = OPENSSL_malloc(len);
if(buf == NULL) {
        /* Malloc error */
}
/* Use temp variable because p gets updated (via &p) to point to end of
 * encoding.
 */
p = buf;
i2d_SOMETHING(x, &p);

Using the new i2d you can also do:

unsigned char *buf = NULL;
int len;
len = i2d_SOMETHING(x, &buf);
if(len < 0) {
        /* Error, for example malloc failure */
}

and it will automatically allocate and populate a buffer with the
encoding. After this call 'buf' will point to the start of the
encoding which is len bytes long.
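
The two i2d calling styles described above can be tried out without OpenSSL
using a toy encoder. The sketch below is only an illustration of the calling
convention: the TOY type, i2d_TOY() and toy_demo() are hypothetical names made
up for this example and are not part of OpenSSL. It uses plain malloc()/free()
where the real code would use OPENSSL_malloc()/OPENSSL_free(), and it assumes
contents of at most 16 bytes so that the DER length fits in one octet.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type (not an OpenSSL structure): a short byte string
 * that is encoded as a DER OCTET STRING. */
typedef struct {
    unsigned char data[16];
    int len;                    /* 0..16, so one length octet is enough */
} TOY;

/* i2d style encoder following the convention described above:
 *   out == NULL   return the encoded length only
 *   *out != NULL  write at *out, then advance *out past the encoding
 *   *out == NULL  allocate a buffer and leave *out at its start
 * Returns the encoded length, or -1 on error. */
int i2d_TOY(const TOY *a, unsigned char **out)
{
    unsigned char *p;
    int total;

    if (a == NULL || a->len < 0 || a->len > 16)
        return -1;
    total = 2 + a->len;         /* tag octet + length octet + contents */
    if (out == NULL)
        return total;
    if (*out == NULL) {
        p = malloc((size_t)total);
        if (p == NULL)
            return -1;
        *out = p;               /* caller keeps the start of the buffer */
    } else {
        p = *out;
        *out += total;          /* caller's pointer moves to the end */
    }
    p[0] = 0x04;                /* universal tag for OCTET STRING */
    p[1] = (unsigned char)a->len;
    memcpy(p + 2, a->data, (size_t)a->len);
    return total;
}

/* Exercise both styles and check that they give the same encoding.
 * Returns the encoded length. */
int toy_demo(void)
{
    TOY t = { { 0xde, 0xad, 0xbe }, 3 };
    unsigned char fixed[18], *p = fixed;
    unsigned char *buf = NULL;
    int len, n;

    /* Old style: find the length, then encode through a temp pointer. */
    len = i2d_TOY(&t, NULL);
    n = i2d_TOY(&t, &p);
    assert(n == len);
    assert(p == fixed + len);   /* p now points past the encoding */

    /* New style: pass a pointer to NULL and let the encoder allocate. */
    n = i2d_TOY(&t, &buf);
    assert(n == len && buf != NULL);
    assert(memcmp(buf, fixed, (size_t)len) == 0);
    free(buf);
    return len;
}
```

Calling toy_demo() from a test program runs both paths. The point to notice
is the difference in what happens to the caller's pointer: in the old style
it is advanced to the end of the encoding, which is why a temp variable is
needed, while in the allocating style it is left at the start of the buffer.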