<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="GENERATOR" content="Mozilla/4.76 [en] (X11; U; FreeBSD 4.3-RELEASE i386) [Netscape]">
</head>
<body>

<center>
<h1>
Security Interface for Berkeley DB</h1></center>

<center><i>Susan LoVerso</i>
<br><i>Rev 1.6</i>
<br><i>2002 Feb 26</i></center>

<p>We provide an interface allowing secure access to Berkeley DB.
Our goal is to allow users to have encrypted secure databases. In
this document, the term <i>ciphering</i> means the act of encryption or
decryption. They are equal but opposite actions and the same issues
apply to both, just in the opposite direction.
<h3>
Requirements</h3>
The overriding requirement is to provide a simple mechanism to allow users
to have a secure database. A secure database means that all of the
pages of a database will be encrypted, and all of the log files will be
encrypted.
<p>Falling out from this work will be a simple mechanism to allow users
to request that we checksum their data for additional error detection (without
encryption/decryption).
<p>We expect that data in process memory or stored in shared memory, potentially
backed by disk, is not encrypted or secure.
<h2>
<a NAME="DB Modifications"></a>DB Method Interface Modifications</h2>
With a logging environment, all database changes are recorded in the log
files. Therefore, users requiring secure databases in such environments
also require secure log files.
<p>A prior thought had been to allow different passwords on the environment
and the databases within. However, such a scheme then requires that
the password be logged in order for recovery to be able to restore the
database. Therefore, any application having the password for the
log could get the password for any database by reading the log.
So having a different password on a database does not gain any additional
security and it makes certain things harder and more complex. Some
of those more complex things include the need to handle database and env
passwords differently since they'd need to be stored and accessed from
different places. Also, the issue of how <i>db_checkpoint</i>
or <i>db_sync</i>, which flush database pages to disk, would find the passwords
of various databases without any dbps was unsolved. The feature didn't
gain anything and caused significant pain. Therefore the decision
is that there will be a single password protecting an environment and all
the logs and some databases within that environment. We do allow
users to have a secure environment and clear databases. Users that
want secure databases within a secure environment must set a flag.
<p>Users wishing to enable encryption on a database in a secure environment
or enable just checksumming on their database pages will use new flags
to <a href="/docs/api_c/db_set_flags.html">DB->set_flags()</a>.
Providing ciphering over an entire environment is accomplished by adding
a single environment method: <a href="/docs/api_c/env_set_encrypt.html">DBENV->set_encrypt()</a>.
Providing encryption for a database (not part of an environment) is accomplished
by adding a new database method: <a href="/docs/api_c/db_set_encrypt.html">DB->set_encrypt()</a>.
<p>Both of the <i>set_encrypt</i> methods must be called before their respective
<i>open</i> calls. The environment method must be called before the environment
open because we must know about security before there is any possibility
of writing any log records out. The database method must be called before
the database open in order to read the root page.
The planned interfaces
for these methods are:
<pre>DBENV->set_encrypt(DBENV *dbenv,     /* DB_ENV structure */
                   char *passwd,     /* Password */
                   u_int32_t flags); /* Flags */</pre>

<pre>DB->set_encrypt(DB *dbp,          /* DB structure */
                char *passwd,     /* Password */
                u_int32_t flags); /* Flags */</pre>
The flags accepted by these functions are:
<pre>#define DB_ENCRYPT_AES  0x00000001  /* Use the AES encryption algorithm */</pre>
Passwords are NUL-terminated strings. NULL or zero-length strings
are illegal. These flags enable the checksumming and encryption using
the particular algorithms we have chosen for this implementation.
The flags are named such that there is a logical naming pattern if additional
checksum or encryption algorithms are used. If a user gives a flag of zero,
it will behave in a manner similar to DB_UNKNOWN. It will be illegal if
they are creating the environment or database, as an algorithm must be
specified. If they are joining an existing environment or opening an existing
database, they will use whatever algorithm is in force at the time.
Using DB_ENCRYPT_AES automatically implies SHA1 checksumming.
<p>These functions will perform several initialization steps. We
will allocate a crypto_handle for our env handle and set up our function
pointers. We will allocate space and copy the password into our env
handle password area. Similar to <i>DB->set_cachesize</i>, calling
<i>DB->set_encrypt</i>
will actually reflect back into the local environment created by DB.
<p>Lastly, we will add a new flag, DB_OVERWRITE, to the <a href="/docs/api_c/env_remove.html">DBENV->remove</a>
method. The purpose of this flag is to force all of the memory used
by the shared regions to be overwritten before removal. We will use
<i>rm_overwrite</i>,
a function that overwrites and syncs a file 3 times with varying bit patterns
to really remove a file.
Additionally, this flag will force a sync
of the overwritten regions to disk, if the regions are backed by the file
system. That way there is no residual information left in the clear
in memory or freed disk blocks. Although we expect this flag
to be used primarily by customers using security, its action does not
depend on passwords or a secure setup, and so it can be used by anyone.
<h4>
Initialization of the Environment</h4>
The setup of the security subsystem will be similar to replication initialization
since it is a sort of subsystem, but it does not have its own region.
When the environment handle is created via <i>db_env_create</i>, we initialize
our <i>set_encrypt</i> method to be the RPC or local version. Therefore
the <i>DB_ENV</i> structure needs a new pointer:
<pre>    void *crypto_handle;    /* Security handle */</pre>
The crypto handle will really point to a new <i>__db_cipher</i> structure
that will contain a set of functions and a pointer to the in-memory information
needed by the specific encryption algorithm. It will look like:
<pre>typedef struct __db_cipher {
    int (*init)__P((...));      /* Alg-specific initialization function */
    int (*encrypt)__P((...));   /* Alg-specific encryption function */
    int (*decrypt)__P((...));   /* Alg-specific decryption function */
    void *data;                 /* Pointer to alg-specific information (AES_CIPHER) */
    u_int32_t flags;            /* Cipher flags */
} DB_CIPHER;</pre>

<pre>#define DB_MAC_KEY  20              /* Size of the MAC key */
typedef struct __aes_cipher {
    keyInstance encrypt_ki;         /* Encrypt keyInstance temp. */
    keyInstance decrypt_ki;         /* Decrypt keyInstance temp. */
    u_int8_t mac_key[DB_MAC_KEY];   /* MAC key */
    u_int32_t flags;                /* AES-specific flags */
} AES_CIPHER;</pre>
It should be noted that none of these structures have their own mutex.
We hold the environment region locked while we are creating this, but once
this is set up, it is read-only forever.
<p>During <a href="/docs/api_c/env_set_encrypt.html">dbenv->set_encrypt</a>,
we set the encryption, decryption and checksumming methods to the appropriate
functions based on the flags. This function will allocate us a crypto
handle that we store in the <i>DB_ENV</i> structure just like all the
other subsystems. For now, only AES ciphering functions and SHA1
checksumming functions are supported. Also we will copy the password
into the <i>DB_ENV</i> structure. We ultimately need to keep the
password in the environment's shared memory region or compare this one
against the one that is there, if we are joining an existing environment,
but we do not have it yet because open has not yet been called. We
will allocate a structure that will be used in initialization and set up
the function pointers to point to the algorithm-specific functions.
<p>In the <i>__env_open</i> path, in <i>__db_e_attach</i>, if we
are creating the region and the <i>dbenv->passwd</i> field is set, we need
to use the length of the password in the initial computation of the environment's
size. This guarantees sufficient space for storing the password in
shared memory. Then we will call a new function to initialize the
security region, <i>__crypto_region_init</i>, in <i>__env_open</i>.
If we are the creator, we will allocate space in the shared region to store
the password and copy the password into that space. Or, if we are
not the creator, we will compare the password stored in the dbenv with the
one in shared memory. Additionally, we will compare the ciphering
algorithm to the one stored in the shared region. We'll then smash the dbenv
password and free it. If the password or algorithm does not match, we return an error.
If we are the creator we store the offset into the REGENV structure.
Then <i>__crypto_region_init</i> will call the initialization function
set up earlier based on the ciphering algorithm specified. For now
we will call <i>__aes_init</i>. Additionally this function will allocate
and set up the per-process state vector for this encryption's IVs.
See <a href="#Generating the Initialization Vector">Generating the Initialization
Vector</a> for a detailed description of the IV and state vector.
<p>In the AES-specific initialization function, <i>__aes_init</i>,
we will call
<i>__aes_derivekeys</i> in order to fill
in the keyInstance and mac_key fields of the AES_CIPHER structure. The REGENV
structure will have one additional item:
<pre>    roff_t passwd_off;    /* Offset of passwd */</pre>

<h4>
Initializing a Database</h4>
During <a href="/docs/api_c/db_set_encrypt.html">db->set_encrypt</a>,
we set the encryption, decryption and checksumming methods to the appropriate
functions based on the flags. Basically, we test that we are not
in an existing environment and we haven't called open. Then we just
call through the environment handle to set the password.
<p>Also, we will need to add a flag in the database meta-data page that
indicates that the database is encrypted and what its algorithm is.
This will be used when the meta-page is read after reopening a file. We
need this information on the meta-page in order to detect a user opening
a secure database without a password. I propose using the first unused1
byte (renaming it too) in the meta page for this purpose.
<p>The first 64 bytes of every page will not be encrypted.
Database meta-pages will be encrypted on the first 512 bytes only.
All meta-page types will have an IV and checksum added within the first
512 bytes as well as a crypto magic number. This will expand the
size of the meta-page from 256 bytes to 512 bytes.
The page in/out routines,
<i>__db_pgin</i> and <i>__db_pgout</i>, know the page type of the page and
will apply the 512-byte ciphering to meta pages. In <i>__db_pgout</i>,
if we have a crypto handle in our (private) environment, we will apply
ciphering to either the entire page, or the first 512 bytes if it is a
meta-page. In <i>__db_pgin</i>, we will decrypt the page if we have
a crypto handle.
<p>When multiple processes share a database, all must use the same password
as the database creator. Using an existing database requires several conditions
to be true. First, if the creator of the database did not create
with security, then opening later with security is an error. Second,
if the creator did create it with security, then opening later without
security is an error. Third, we need to be able to test and check
that when another process opens a secure database, the password they
provided is the same as the one in use by the creator.
<p>When reading the meta-page, in <i>__db_file_setup</i>, we do not go
through the paging functions, but directly read via <i>__os_read</i>.
It is at this point that we will determine if the user is configured correctly.
If the meta-page we read has an IV and checksum, the user must have a crypto
handle. If they have a crypto handle, then the meta-page must have
an IV and checksum. If both of those are true, we test the password.
We compare the unencrypted magic number to the newly-decrypted crypto magic
number and if they are not the same, then we report that the user gave
us a bad password.
<p>On a mostly unrelated topic, even when we go to very large pagesizes,
the meta information will still be within a disk sector. So, after
talking it over with Keith and Margo, we determined that unencrypted meta-pages
still will not need a checksum.
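<p>The configuration checks described above for <i>__db_file_setup</i> can be
sketched as a small decision function. This is only an illustration of the
three conditions: the function name, its arguments and the result codes are
hypothetical and are not part of Berkeley DB, and the decryption of the crypto
magic number is assumed to have been done by the caller.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch; not Berkeley DB error values. */
enum meta_check {
    META_OK,             /* Configuration and password are consistent */
    META_NEED_PASSWD,    /* Meta-page is secure but there is no crypto handle */
    META_NOT_ENCRYPTED,  /* Crypto handle given but meta-page is in the clear */
    META_BAD_PASSWD      /* Decrypted crypto magic does not match */
};

/*
 * page_has_iv_cksum:      nonzero if the meta-page carries an IV and checksum.
 * have_crypto_handle:     nonzero if the user supplied a password.
 * magic:                  the unencrypted magic number from the meta-page.
 * decrypted_crypto_magic: the crypto magic number after decryption; only
 *                         meaningful when both flags above are nonzero.
 */
enum meta_check
meta_check_config(int page_has_iv_cksum, int have_crypto_handle,
    uint32_t magic, uint32_t decrypted_crypto_magic)
{
    if (page_has_iv_cksum && !have_crypto_handle)
        return (META_NEED_PASSWD);
    if (!page_has_iv_cksum && have_crypto_handle)
        return (META_NOT_ENCRYPTED);
    if (!page_has_iv_cksum)
        return (META_OK);    /* Clear database opened without security. */

    /* Both sides are secure: a wrong password decrypts to a wrong magic. */
    return (magic == decrypted_crypto_magic ? META_OK : META_BAD_PASSWD);
}
```

<p>Keeping the two mismatch cases separate from the bad-password case lets the
caller report which side is misconfigured, which is the distinction the text
above draws.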
<h3>
Encryption and Checksum Routines</h3>
These routines are provided to us by Adam Stubblefield at Rice University
(astubble@rice.edu). The functional interfaces are:
<pre>__aes_derivekeys(DB_ENV *dbenv,             /* dbenv */
                 u_int8_t *passwd,          /* Password */
                 size_t passwd_len,         /* Length of passwd */
                 u_int8_t *mac_key,         /* 20 byte array to store MAC key */
                 keyInstance *encrypt_key,  /* Encryption key of passwd */
                 keyInstance *decrypt_key); /* Decryption key of passwd */</pre>
This is the only function requiring the textual user password. From
the password, this function generates a key used in the checksum function,
<i>__db_chksum</i>.
It also fills in <i>keyInstance</i> structures which are then used in the
encryption and decryption routines. The keyInstance structures must
already be allocated. These will be stored in the AES_CIPHER structure.
<pre>__db_chksum(u_int8_t *data,      /* Data to checksum */
            size_t data_len,     /* Length of data */
            u_int8_t *mac_key,   /* 20 byte array from __aes_derivekeys */
            u_int8_t *checksum); /* 20 byte array to store checksum */</pre>
This function generates a checksum on the data given. This function
will do double-duty for users that simply want error detection on their
pages. When users are using encryption, the <i>mac_key</i> will contain
the 20-byte key set up in <i>__aes_derivekeys</i>. If they just want
checksumming, then <i>mac_key</i> will be NULL. According to Adam,
we can safely use the first N bytes of the checksum. So for seeding
the generator for initialization vectors, we'll hash the time and then
send in the first 4 bytes for the seed. I believe we can probably
do the same thing for checksumming log records. We can only use 4
bytes for the checksum in the non-secure case. So when we want to
verify the log checksum we can compute the MAC but just compare the first
4 bytes to the one we read.
All locations where we generate or check
log record checksums that currently call <i>__ham_func4</i> will now call
<i>__db_chksum</i>.
I believe there are 5 such locations:
<i>__log_put</i>, <i>__log_putr</i>, <i>__log_newfile</i>,
<i>__log_rep_put</i>
and <i>__txn_force_abort</i>.
<pre>__aes_encrypt(DB_ENV *dbenv,     /* dbenv */
              keyInstance *key,  /* Password key instance from __aes_derivekeys */
              u_int8_t *iv,      /* Initialization vector */
              u_int8_t *data,    /* Data to encrypt */
              size_t data_len);  /* Length of data to encrypt - 16 byte multiple */</pre>
This is the function to encrypt data. It will be called to encrypt
pages and log records. The <i>key</i> instance is initialized in
<i>__aes_derivekeys</i>.
The initialization vector, <i>iv</i>, is the 16-byte random value set up
by the Mersenne Twister pseudo-random generator. Lastly, we pass
in a pointer to the <i>data</i> to encrypt and its length in <i>data_len</i>.
The <i>data_len</i> must be a multiple of 16 bytes. The encryption is done
in place so that when the encryption code returns our encrypted data is
in the same location as the original data.
<pre>__aes_decrypt(DB_ENV *dbenv,     /* dbenv */
              keyInstance *key,  /* Password key instance from __aes_derivekeys */
              u_int8_t *iv,      /* Initialization vector */
              u_int8_t *data,    /* Data to decrypt */
              size_t data_len);  /* Length of data to decrypt - 16 byte multiple */</pre>
This is the function to decrypt the data. It is exactly the same
as the encryption function except for the action it performs. All
of the args and issues are the same. It also decrypts in place.
<h3>
<a NAME="Generating the Initialization Vector"></a>Generating the Initialization
Vector</h3>
Internally, we need to provide a unique initialization vector (IV) of 16
bytes every time we encrypt any data with the same password.
For
the IV we are planning on using mt19937, the Mersenne Twister, a random
number generator that has a period of 2**19937-1. This package can be found
at <a href="http://www.math.keio.ac.jp/~matumoto/emt.html">http://www.math.keio.ac.jp/~matumoto/emt.html</a>.
Tests show that although it repeats a single integer every once in a while,
after several million iterations it doesn't repeat any 4 integers
that we'd be stuffing into our 16-byte IV. We plan on seeding this
generator with the time (tv_sec) hashed through SHA1 when we create the
environment. This package uses a global state vector that contains
624 unsigned long integers. We do not allow a 16-byte IV of zero.
It is simpler just to reject any 4-byte value of 0 and if we get one, just
call the generator again and get a different number. We need to detect
holes in files and if we read an IV of zero that is a simple indication
that we need to check for an entire page of zero. The IVs are stored
on the page after encryption and are not encrypted themselves so it is
not possible for an entire encrypted page to be read as all zeroes, unless
it was a hole in a file. See <a href="#Holes in Files">Holes in Files</a>
for more details.
<p>We will not be holding any locks when we need to generate our IV but
we need to protect access to the state vector and the index. Calls
to the MT code will come while encrypting some data in <i>__aes_encrypt</i>.
The MT code will assume that all necessary locks are held in the caller.
We will have per-process state vectors that are set up when a process begins.
That way we minimize the contention and only multi-threaded processes need
to acquire locks for the IV. We will have the state vector in the environment
handle in heap memory, as well as the index, and there will be a mutex protecting
it for threaded access.
This will be added to the <i>DB_ENV</i>
structure:
<pre>    DB_MUTEX *mt_mutexp;    /* Mersenne Twister mutex */
    int *mti;               /* MT index */
    u_long *mt;             /* MT state vector */</pre>
This portion of the environment will be initialized at the end of <i>__dbenv_open</i>,
right after we initialize the other mutex for the <i>dblist</i>. When we
allocate the space, we will generate our initial state vector. If we are
multi-threaded we'll allocate and initialize our mutex also.
<p>We need to make changes to the MT code to make it work in our namespace
and to take a pointer to the location of the state vector and
the index. There will be a wrapper function <i>__db_generate_iv</i>
that DB will call and it will call the appropriate MT function. I
am also going to change the default seed to use a hashed time instead of
a hard-coded value. I have looked at other implementations of the
MT code available on the web site. The C++ version does a hash on
the current time. I will modify our MT code to seed with the hashed
time as well. That way the code to seed is contained within the MT
code and we can just write the wrapper to get an IV. We will not
be changing the core computational code of MT.
<h2>
DB Internal Issues</h2>

<h4>
When do we Cipher?</h4>
All of the page ciphering is done in the <i>__db_pgin/__db_pgout</i> functions.
We will encrypt after the method-specific function on page-out and decrypt
before the method-specific function on page-in. We do not hold any
locks when entering these functions. We determine that we need to
cipher based on the existence of the encryption flag in the dbp.
<p>For ciphering log records, the encryption will be done as the first
thing (or a new wrapper) in <i>__log_put</i>. See <a href="#Log Record Encryption">Log
Record Encryption</a> for those details.
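<p>The ordering of ciphering relative to the method-specific page functions can
be illustrated with a toy round trip. Everything here is a stand-in chosen for
the sketch: <i>toy_method_convert</i> is a hypothetical byte-swapping
conversion, and <i>toy_cipher</i> is a position-dependent XOR, which is not AES
and offers no security. The point is only that page-in must undo the steps of
page-out in reverse order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical method-specific conversion: swap adjacent bytes (self-inverse). */
void
toy_method_convert(uint8_t *page, size_t len)
{
    size_t i;
    uint8_t tmp;

    assert(len % 2 == 0);
    for (i = 0; i < len; i += 2) {
        tmp = page[i];
        page[i] = page[i + 1];
        page[i + 1] = tmp;
    }
}

/* Placeholder cipher, NOT AES: XOR each byte with (key + offset). Self-inverse. */
void
toy_cipher(uint8_t *page, size_t len, uint8_t key)
{
    size_t i;

    for (i = 0; i < len; i++)
        page[i] ^= (uint8_t)(key + i);
}

/* Page-out: method-specific function first, then encrypt in place. */
void
toy_pgout(uint8_t *page, size_t len, uint8_t key)
{
    toy_method_convert(page, len);
    toy_cipher(page, len, key);
}

/* Page-in: decrypt in place first, then the method-specific function. */
void
toy_pgin(uint8_t *page, size_t len, uint8_t key)
{
    toy_cipher(page, len, key);
    toy_method_convert(page, len);
}
```

<p>Because the toy cipher depends on byte position, running the two page-in
steps in the wrong order would not restore the original page, which mirrors why
the order matters for the real routines.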
<br>
<h4>
Page Changes</h4>
The checksum and IV values will be stored prior to the first index of the
page. We have a new P_INP macro that replaces use of inp[X] in the
code. This macro takes a dbp as an argument and determines where
our first index is based on whether we have DB_AM_CHKSUM and DB_AM_ENCRYPT
set. If neither is set, then our first index is where it always was.
If just checksumming is set, then we reserve a 4-byte checksum.
If encryption is set, then we reserve 36 bytes for our checksum/IV as well
as some space to get proper alignment to encrypt on a 16-byte boundary.
<p>Since several paging macros use inp[X] in them, those macros must now
take a dbp. There are a lot of changes to make all the necessary
paging macros take a dbp, although these changes are trivial in nature.
<p>Also, there is a new function <i>__db_chk_meta</i> to perform checksumming
and decryption checking on meta pages specifically. This function
is where we check that the database algorithm matches what the user gave
(or if they set DB_CIPHER_ANY then we set it), and other encryption-related
testing for bad combinations of what is in the file versus what is in the
user structures.
<h4>
Verification</h4>
The verification code will also need to be updated to deal with secure
pages. Basically when the verification code reads in the meta page
it will call <i>__db_chk_meta</i> to perform any checksumming and decryption.
<h4>
<a NAME="Holes in Files"></a>Holes in Files</h4>
Holes in files will be dealt with rather simply. We need to be able
to distinguish reading a hole in a file from an encrypted page that happened
to encrypt to all zeros. If we read a hole in a file, we do not
want to send that empty page through the decryption routine. This
can be determined simply without incurring the performance penalty of comparing
every byte on a page on every read until we get a non-zero byte.
<br>The <i>__db_pgin</i> function is only given an invalid page P_INVALID in this
case. So, if the page type, which is always unencrypted, is
P_INVALID, then we do not perform any checksum verification or decryption.
<h4>
Errors and Recovery</h4>
Dealing with a checksum error is tricky. Ultimately, if a checksum
error occurs it is extremely likely that the user must do catastrophic
recovery. There is no failure return other than DB_RUNRECOVERY
for indicating that the user should run catastrophic recovery. We
do not want to add a new error return for applications to check because
a lot of applications already look for and deal with DB_RUNRECOVERY as
an error condition and we want to fit ourselves into that application model.
We already indicate to the user that when they get that error, they
need to run recovery. If recovery fails, then they need to run catastrophic
recovery. We need to get ourselves to the point where users will
run catastrophic recovery.
<p>If we get a checksum error, then we need to log a message stating a
checksum error occurred on page N. In <i>__db_pgin</i>, we can check
if logging is on in the environment. If so, we want to log the message.
<p>When the application gets the DB_RUNRECOVERY error, they'll have to
shut down their application and run recovery. When the recovery encounters
the record indicating checksum failure, then normal recovery will fail
and the user will have to perform catastrophic recovery. When catastrophic
recovery encounters that record, it will simply ignore it.
<h4>
<a NAME="Log Record Encryption"></a>Log Record Encryption</h4>
Log records will be ciphered. It might make sense to wrap <i>__log_put</i>
to encrypt the DBT we send down. The <i>__log_put</i> function is
where the checksum is computed before acquiring the region lock.
But also this function is where we call <i>__rep_send_message</i> to send
the DBT to the replication clients. Therefore, we need the DBT to
be encrypted before that point. We also need it encrypted before checksumming.
I think <i>__log_put</i> will become <i>__log_put_internal</i>, and the
new <i>__log_put</i> will encrypt if needed and then call <i>__log_put_internal</i>
(the
function formerly known as <i>__log_put</i>). Log records are kept
in a shared memory region buffer prior to going out to disk. Records
in the buffer will be encrypted. No locks are held at the time we
will need to encrypt.
<p>On reading the log, via log cursors, the log code stores log records
in the log buffer. Records in that buffer will be encrypted, so decryption
will occur no matter whether we are returning records from the buffer or
if we are returning log records directly from the disk. Current checksum
checking is done in
<i>__logc_get_int</i>. Decryption will be done
after the checksum is checked.
<p>There are currently two nasty issues with encrypted log records.
The first is that <i>__txn_force_abort</i> overwrites a commit record in
the log buffer with an abort record. Well, our log buffer will be
encrypted. Therefore, <i>__txn_force_abort</i> is going to need to
do encryption of its new record. This can be accomplished by sending
in the dbenv handle to the function. It is available to us in <i>__log_flush_commit</i>
and we can just pass it in. I don't like putting log encryption in
the txn code, but the layering violation is already there.
<p>The second issue is that the encryption code requires data that is a
multiple of 16 bytes and log record lengths are variable. We will
need to pad log records to meet the requirement. Since the callers
of <i>__log_put</i> set up the given DBT it is a logical place to pad if
necessary.
We will modify the gen_rec.awk script to have all of the generated
logging functions pad for us if we have a crypto handle. This padding will
also expand the size of log files. Anyone calling <i>log_put</i> and using
security from the application will have to pad on their own or it will
return an error.
<p>When ciphering the log file, we will need a different header than the
current one. The current header only has space for a 4-byte checksum.
Our secure header will need space for the 16-byte IV and 20-byte checksum.
This will blow up our log files when running securely since every single
log record header will now consume 32 additional bytes. I believe
that the log header does not need to be encrypted. It contains an
offset, a length and our IV and checksum. Our IV and checksum are
never encrypted. I don't believe there to be any risk in having the
offset and length in the clear.
<p>I would prefer not to have two types of log headers that are incompatible
with each other. It is not acceptable to increase the log headers
of all users from 12 bytes to 44 bytes. Such a change would also
make log files incompatible with earlier releases. Even worse,
the <i>cksum</i> field of the header is in between the offset and
len. It would be really convenient if we could have just made a bigger
cksum portion without affecting the location of the other fields.
Oh well. Most customers will not be using encryption and we won't
make them pay the price of the expanded header. Keith indicates that
the log file format is changing with the next release so I will move the
cksum field so it can at least be overlaid.
<p>One method around this would be to have a single internal header that
contains all the information both mechanisms need, but when we write out
the header we choose which pieces to write.
By appending the security
information to the end of the existing structure, and adding a size field,
we can modify a few places to use the size field to write out only the
current first 12 bytes, or the entire security header needed.
<h4>
Replication</h4>
Replication clients are going to need to start all of their individual
environment handles with the same password. The log records are going
to be sent to the clients decrypted and the clients will have to encrypt
them on their way to the client log files. We cannot send encrypted
log records to clients. The reason is that the checksum and IV are
stored in the log header and the master only sends the log record itself
to the client. Therefore, the client has no way to decrypt a log
record from the master. Therefore, anyone wanting to use truly secure
replication is going to have to have a secure transport mechanism.
By not encrypting records, clients can theoretically have different passwords
and DB won't care.
<p>On the master side we must copy the DBT sent in. We encrypt the
original and send the clear record to clients. On the client side,
support for encryption is added into <i>__log_rep_put</i>.
<h4>
Sharing the Environment</h4>
When multiple processes join the environment, all must use the same password
as the creator.
<p>Joining an existing environment requires several conditions to be true.
First, if the creator of the environment did not create with security,
then joining later with security is an error. Second, if the creator
did create it with security, then joining later without security is an
error. Third, we need to be able to test and check that when another
process joins a secure environment, the password they provided is the
same as the one in use by the creator.
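<p>The three join conditions above can be written down as a small check. This
is a minimal sketch under stated assumptions: the function name and result
codes are hypothetical, a NULL <i>region_passwd</i> stands for an environment
created without security, and a NULL <i>joiner_passwd</i> stands for a process
that did not call <i>set_encrypt</i>. Comparison of the ciphering algorithm is
left out for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not Berkeley DB error values. */
enum join_result {
    JOIN_OK,               /* Joiner's configuration matches the creator's */
    JOIN_ENV_NOT_SECURE,   /* Creator had no security, joiner asked for it */
    JOIN_PASSWD_REQUIRED,  /* Creator used security, joiner gave no password */
    JOIN_PASSWD_MISMATCH   /* Both secure, but the passwords differ */
};

/*
 * region_passwd: password the creator stored in the shared region, or NULL
 *                if the environment was created without security.
 * joiner_passwd: password held in the joining process's handle, or NULL.
 */
enum join_result
env_join_check(const char *region_passwd, const char *joiner_passwd)
{
    if (region_passwd == NULL)
        return (joiner_passwd == NULL ? JOIN_OK : JOIN_ENV_NOT_SECURE);
    if (joiner_passwd == NULL)
        return (JOIN_PASSWD_REQUIRED);
    return (strcmp(region_passwd, joiner_passwd) == 0 ?
        JOIN_OK : JOIN_PASSWD_MISMATCH);
}
```

<p>The sketch compares the passwords themselves because that is what the design
stores in the region; it does not attempt to compare derived keys.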
<p>The first two scenarios should be fairly trivial to determine: if we
aren't creating the environment, we can compare what is there with what
we have. In the third case, the <i>__crypto_region_init</i> function
will see that the environment region has a valid passwd_off and we'll then
compare that password to the one we have in our dbenv handle. In
any case we'll smash the dbenv handle's passwd and free that memory before
returning whether we have a password match or not.
<p>We need to store the passwords themselves in the region because multiple
calls to the <i>__aes_derivekeys</i> function with the same password yield
different keyInstance contents. Therefore we don't have any way to
check passwords other than retaining and comparing the actual passwords.
<h4>
Other APIs</h4>
All of the other APIs will need interface enhancements to support the new
security methods. The Java and C++ interfaces will likely be done
by Michael Cahill and Sue will implement the Tcl and RPC changes.
Tcl will need the changes for testing purposes but the interface should
be public, not test-only. RPC should fully support security.
The biggest risk that I can see is that the client will send the password
to the server in the clear. Anyone sniffing the wires or running
tcpdump or other packet-grabbing code could grab that. Someone really
interested in using security over RPC probably ought to add authentication
and other measures to the RPC server as well.
<h4>
<a NAME="Utilities"></a>Utilities</h4>
All should take a -P flag to specify a password for the environment or
database. Those that take an env and a database might need something
more to distinguish between env passwds and db passwds.
Here is what we
do for each utility:
<ul>
<li>
berkeley_db_svc - Needs -P after each -h specified.</li>

<li>
db_archive - Needs -P if the env is encrypted.</li>

<li>
db_checkpoint - Needs -P if the env is encrypted.</li>

<li>
db_deadlock - No changes</li>

<li>
db_dump - Needs -P if the env or database is encrypted.</li>

<li>
db_load - Needs -P if the env or database is encrypted.</li>

<li>
db_printlog - Needs -P if the env is encrypted.</li>

<li>
db_recover - Needs -P if the env is encrypted.</li>

<li>
db_stat - Needs -P if the env or database is encrypted.</li>

<li>
db_upgrade - Needs -P if the env or database is encrypted.</li>

<li>
db_verify - Needs -P if the env or database is encrypted.</li>
</ul>

<h2>
Testing</h2>
All testing should be able to be accomplished via Tcl. The following
tests (and probably others I haven't thought of yet) should be performed:
<ul>
<li>
Basic functionality - basically a test001 but encrypted without an env</li>

<li>
Basic functionality, w/ env - like the previous test but with an env.</li>

<li>
Basic functionality, multiple processes - like first test, but make sure
others can correctly join.</li>

<li>
Basic functionality, mult. processes - like above test, but initialize/close
environment/database first so that the next test processes are all joiners
of an existing env, but creator no longer exists and the shared region
must be opened.</li>

<li>
Recovery test - Run recovery over an encrypted environment.</li>

<li>
Subdb test - Run with subdbs that are encrypted.</li>

<li>
Utility test - Verify the new options to all the utilities.</li>

<li>
Error handling - Test the basic setup errors for both env's and databases
with multiple processes.
They are:</li>

<ol>
<li>
Attempt to set a NULL or zero-length passwd.</li>

<li>
Create Env w/ security and attempt to create database w/ its own password.</li>

<li>
Env/DB creates with security. Proc2 joins without - should get an
error.</li>

<li>
Env/DB creates without security. Proc2 joins with - should get an
error.</li>

<li>
Env/DB creates with security. Proc2 joins with different password
- should get an error.</li>

<li>
Env/DB creates with security. Closes. Proc2 reopens with different
password - should get an error.</li>

<li>
Env/DB creates with security. Closes. Tcl overwrites a page
of the database with garbage. Proc2 reopens with the correct password.
Code should detect checksum error.</li>

<li>
Env/DB creates with security. Open a 2nd identical DB with a different
password. Put the exact same data into both databases. Close.
Overwrite the identical page of DB1 with the one from DB2. Reopen
the database with correct DB1 password. Code should detect an encryption
error on that page.</li>
</ol>
</ul>

<h2>
Risks</h2>
There are several holes in this design. It is important to document
them clearly.
<p>The first is that all of the pages are stored in memory, and possibly
in the file system, in the clear. The password is stored in the shared
data regions in the clear. Therefore, if an attacker can read the
process memory, they can do whatever they want. If the attacker can
read system memory or swap, they can access the data as well. Since
everything in the shared data regions (with the exception of the buffered
log) will be in the clear, it is important to realize that file-backed
regions will be written in the clear, including the portion of the regions
containing passwords. We recommend that users use system
memory instead of file-backed shared memory.
</body>
</html>