<html>
<head>
<title>Challenges in Embedded Database System Administration</title>
</head>
<body bgcolor=white>
<center>
<h1>Challenges in Embedded Database System Administration</h1>
<h3>Margo Seltzer, Harvard University</h3>
<h3>Michael Olson, Sleepycat Software, Inc.</h3>
<em>{margo,mao}@sleepycat.com</em>
</center>
<p>
Database configuration and maintenance have historically been complex tasks,
often requiring expert knowledge of database design and application
behavior.
In an embedded environment, it is not feasible to require such
expertise and ongoing database maintenance.
This paper discusses the database administration
challenges posed by embedded systems and describes how the
Berkeley DB architecture addresses these challenges.

<h2>1. Introduction</h2>

Embedded systems provide a combination of opportunities and challenges
in application and system configuration and management.
As an embedded system is most often dedicated to a single application or
small set of tasks, the operating conditions of the system are
typically better understood than those of general purpose computing
environments.
Similarly, as embedded systems are dedicated to a small set of tasks,
one would expect that the software to manage them should be small
and simple.
On the other hand, once an embedded system is deployed, it must
continue to function without interruption and without administrator
intervention.
<p>
Database administration consists of two components:
initial configuration and ongoing maintenance.
Initial configuration consists of database design, instantiation,
and tuning.
The instantiation of the design includes decomposing the design
into tables, relations, or objects and designating proper indices
and their implementations (e.g., Btrees or hash tables).
Tuning a design requires selecting a location for the log and
data files, selecting appropriate database page sizes, specifying
the size of in-memory caches, and specifying the limits of
multi-threading and concurrency.
As embedded systems define a specific environment and set of tasks,
requiring expertise during the initial system
configuration process is acceptable, and we focus our efforts on
the ongoing maintenance of the system.
In this way, our emphasis differs from other projects such as
Microsoft's AutoAdmin project <a href="#Chaud982">[3]</a> and the "no-knobs"
administration that is identified as an area of important future
research by the Asilomar authors <a href="#Bern98">[1]</a>.
<p>
In this paper, we focus on what the authors
of the Asilomar report call "gizmo" databases <a href="#Bern98">[1]</a>,
databases
that reside in devices such as smart cards, toasters, or telephones.
The key characteristics of such databases are that their
functionality is completely transparent to users; no one ever
performs explicit database operations or
database maintenance; the database may crash at any time and
must recover instantly; the device may undergo a hard reset at
any time, requiring that the database return to its initial
state; and the semantic integrity of the database must be maintained
at all times.
In Section 2, we provide more detail on the sorts of tasks
typically performed by database administrators (DBAs) that must
be automated in an embedded system.
<p>
The rest of this paper is structured as follows.
In Section 2, we outline the requirements for embedded database support.
In Section 3, we discuss how Berkeley DB
is conducive to the hands-off management
required in embedded systems.
In Section 4, we discuss novel features that
enhance Berkeley
DB's suitability for embedded applications.
In Section 5, we discuss issues of footprint size.
In Section 6, we discuss related work, and we conclude
in Section 7.

<h2>2. Embedded Database Requirements</h2>
Historically, much of the commercial database industry has been driven
by the requirements of high performance online transaction
processing (OLTP), complex query processing, and the industry
standard benchmarks that have emerged (e.g., TPC-C <a href="#TPCC">[9]</a>,
TPC-D <a href="#TPCD">[10]</a>) to
allow for system comparisons.
As embedded systems typically perform fairly simple queries,
such metrics are not nearly as relevant for embedded database
systems as are ease of maintenance, robustness, and small footprint.
Of these three requirements, robustness and ease of maintenance
are the key issues.
Users must trust the data stored in their devices and must not need
to manually perform anything resembling system administration in order
to get their unit to work properly.
Fortunately, ease of use and robustness are important side
effects of simplicity and good design.
These, in turn, lead to a small size, providing the third
requirement of an embedded system.
<h3>2.1 The User Perspective</h3>
<p>
In the embedded database arena, it is the ongoing maintenance tasks
that must be automated, not necessarily the initial system configuration.
There are five tasks
that are traditionally performed by DBAs,
but must be performed automatically
in embedded database systems.
These tasks are
log archival and reclamation,
backup,
data compaction/reorganization,
automatic and rapid recovery, and
reinitialization from scratch.
<P>
Log archival and backup are tightly coupled.
Database backups are part of any
large database installation, and log archival is analogous to incremental
backup.
It is not clear what the implications of backup and archival are in
an embedded system.
Consumers do not back up their VCRs or refrigerators, yet they do
(or should) back up their personal computers or personal digital
assistants.
For the remainder of this paper, we assume that backups, in some form,
are required for gizmo databases (imagine having to reprogram, manually,
the television viewing access pattern learned by some set-top television
systems today).
Furthermore, we require that those backups are nearly instantaneous or
completely transparent,
as users should not be aware that their gizmos are being backed up
and should not have to explicitly initiate such backups.
<p>
Data compaction or reorganization has traditionally required periodic
dumping and restoration of
database tables and the recreation of indices.
In an embedded system, such reorganization must happen automatically.
<p>
Recovery issues are similar in embedded and traditional environments
with a few exceptions.
First, while a recovery time of a few seconds or even a minute is acceptable
for a large server installation, no one is willing to wait
for their telephone or television to reboot.
As with archival, recovery must be nearly instantaneous in an embedded product.
Second, it is often the case that a system will be completely
reinitialized, rather than simply rebooted.
In this case, the embedded database must be restored to its initial
state, freeing all its resources.
This is not typically a requirement of large server systems.
<h3>2.2 The Developer Perspective</h3>
<p>
In addition to the maintenance-free operation required of the
embedded systems, there are a number of requirements that fall
out of the constrained resources typically found in the "gizmos"
using gizmo databases. These requirements are:
small footprint,
short code-path,
programmatic interface for tight application coupling and
to avoid the overhead (in both time and size) of
interfaces such as SQL and ODBC,
application configurability and flexibility,
support for complete memory-resident operation (e.g., these systems
must run on gizmos without file systems), and
support for multi-threading.
<p>
A small footprint and short code-path are self-explanatory. What is
not as obvious is that the programmatic interface requirement
is the logical result of them.
Traditional interfaces such as ODBC and SQL add significant
size overhead and frequently add multiple context/thread switches
per operation, not to mention several IPC calls.
An embedded product is less likely to require the complex
query processing that SQL enables.
Instead, in the embedded space, the ability for an application
to configure the database for the specific tasks in question
is more important than a general query interface.
<p>
As some systems do not provide storage other than RAM and ROM,
it is essential that an embedded database work seamlessly
in memory-only environments.
Similarly, many of today's embedded operating systems provide a
single address space architecture, so a simple, multi-threaded
capability is essential for applications requiring any concurrency.
<p>
In general, embedded applications run on gizmos whose native
operating system support varies tremendously.
For example, the embedded OS may or may
not support user-level processing or multi-threading.
Even if it does, a particular embedded
application may or may not need it.
Not all applications need more than one thread of control.
An embedded database must provide mechanisms to developers
without deciding policy.
For example, the threading model in an application is a matter of policy,
and depends
not on the database software, but on the hardware, operating
system, and the application's feature set.
Therefore, the data manager must provide for the use of multi-threading,
but not require it.

<h2>3. Berkeley DB: A Database for Embedded Systems</h2>
Berkeley DB is the result of implementing database functionality
using the UNIX tool-based philosophy.
The current Berkeley DB package, as distributed by Sleepycat
Software, is a descendant of the hash and btree access methods
distributed with 4.4BSD and its descendants.
The original package (referred to as DB-1.85),
while intended as a public domain replacement for dbm and
its followers (e.g., ndbm and gdbm), rapidly became widely
used as an efficient, easy-to-use data store.
It was incorporated into a number of Open Source packages including
Perl, Sendmail, Kerberos, and the GNU C library.
<p>
Versions 2.X and higher are distributed by Sleepycat Software and
add functionality for concurrency, logging, transactions, and
recovery.
Each piece of additional functionality is implemented as an independent
module, which means that the subsystems can be used outside the
context of Berkeley DB.  For example, the locking subsystem can
easily be used to implement locking for a non-DB application and
the shared memory buffer pool can be used for any application
caching data in main memory.
This subsystem design allows a designer to pick and choose
the functionality necessary for the application, minimizing
memory footprint and maximizing performance.
This addresses the small footprint and short code-path criteria
mentioned in the previous section.
<p>
As Berkeley DB grew out of a replacement for dbm, its primary
implementation language has always been C and its interface has
been programmatic.  The C interface is the native interface,
unlike many database systems where the programmatic API is simply
a layer on top of an already-costly query interface (e.g., embedded
SQL).
Berkeley DB's heritage is also apparent in its data model; it has
none.
The database stores unstructured key/data pairs, specified as
variable length byte strings.
This leaves schema design and representation issues as the responsibility
of the application, which is ideal for an embedded environment.
Applications retain full control over specification of their data
types, representation, index values, and index relationships.
In other words, Berkeley DB provides a robust, high-performance,
keyed storage system, not a particular database management system.
We have designed for simplicity and performance, trading off
complex, general purpose support that is better encapsulated in
applications.
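<p>
To make the "no data model" point concrete, the following is a minimal
sketch of an application that defines its own record layout and hands the
store nothing but byte strings. It is written in Python for brevity and does
not use the Berkeley DB API; the dictionary <tt>store</tt>, the record layout,
and the helper names are all assumptions made for illustration.
<pre>
```python
import struct

# Hypothetical record layout chosen by the application, not by the data store:
# a 4-byte big-endian channel number followed by a 2-byte view count.
RECORD = struct.Struct(">IH")


def encode_key(user_id):
    # Big-endian encoding makes bytewise key order match numeric order.
    return struct.pack(">I", user_id)


def encode_value(channel, views):
    return RECORD.pack(channel, views)


def decode_value(raw):
    return RECORD.unpack(raw)


# Stand-in for a keyed byte-string store; it sees only opaque bytes.
store = {}
store[encode_key(7)] = encode_value(42, 3)
```
</pre>
<p>
The store never learns that a value holds a channel and a count; changing the
layout is purely an application decision.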
<p>
Another element of Berkeley DB's programmatic interface is its
customizability; applications can specify Btree comparison and
prefix compression functions, hash functions, error routines,
and recovery models.
This means that embedded applications can tailor the underlying
database to best suit their data demands.
Similarly, the utilities traditionally bundled with a database
manager (e.g., recovery, dump/restore, archive) are implemented
as tiny wrapper programs around library routines.  This means
that it is not necessary to run separate applications for the
utilities.  Instead, independent threads can act as utility
daemons, or regular query threads can perform utility functions.
Many of the current products built on Berkeley DB are bundled as
a single large server with independent threads that perform functions
such as checkpoint, deadlock detection, and performance monitoring.
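<p>
As an illustration of what an application-specified comparison function
buys, the sketch below orders byte-string keys with a case-insensitive
comparator instead of the default bytewise order. It is a Python analogy,
not Berkeley DB code; the function name and the sample keys are hypothetical.
<pre>
```python
from functools import cmp_to_key


def compare_case_insensitive(a, b):
    # Application-defined ordering over raw byte-string keys.
    # Returns a negative, zero, or positive integer, like C's memcmp.
    x, y = a.lower(), b.lower()
    return (x > y) - (y > x)


keys = [b"delta", b"alpha", b"Charlie", b"Bravo"]

# Default bytewise order puts every upper-case key first.
bytewise = sorted(keys)

# The application's comparator yields the order it actually wants.
ordered = sorted(keys, key=cmp_to_key(compare_case_insensitive))
```
</pre>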
<p>
As mentioned earlier, living in an embedded environment requires
flexible management of storage.
Berkeley DB does not require any preallocation of disk space
for log or data files.
While many commercial database systems take complete control
of a raw device, Berkeley DB uses a normal file system, and
can therefore safely and easily share a data space with other
programs.
All databases and log files are native files of the host environment,
so whatever utilities are provided by the environment can be used
to manage database files as well.
<p>
Berkeley DB provides three different memory models for its
management of shared information.
Applications can use the IEEE Std 1003.1b-1993 (POSIX) <tt>mmap</tt>
interface to share
data, they can use system shared memory, as frequently provided
by the <tt>shmget</tt> family of interfaces, or they can use per-process
heap memory (e.g., <tt>malloc</tt>).
Applications that require no permanent storage and run on systems that
do not provide shared memory facilities can still use Berkeley DB by requesting
strictly private memory and specifying that all databases be
memory-resident.
This provides pure-memory operation.
<p>
Lastly, Berkeley DB is designed for rapid startup -- recovery can
happen automatically as part of system initialization.
This means that Berkeley DB works correctly in environments where
gizmos are suddenly shut down and restarted.

<h2>4. Extensions for Embedded Environments</h2>
While the Berkeley DB library has been designed for use in
embedded systems, all the features described above are useful
in more conventional systems as well.
In this section, we discuss a number of features and "automatic
knobs" that are specifically geared
toward the more constrained environments found in gizmo databases.

<h3>4.1 Automatic compression</h3>
Following the programmatic interface design philosophy, we
support application-specific (or default) compression routines.
These can be geared toward the particular data types present
in the application's dataset, thus providing better compression
than a general purpose routine.
Note that the application could instead specify an encryption
function and create encrypted databases instead of compressed ones.
Alternatively, the application might specify a function that performs
both compression and encryption.
<p>
As applications are also permitted to specify comparison and hash
functions, the application can choose to organize its data based
either on uncompressed and clear-text data or compressed and encrypted
data.
If the application indicates that data should be compared in its
processed form (i.e., compressed and encrypted), then the compression
and encryption are performed on individual data items and the in-memory
representation retains these characteristics.
However, if the application indicates that data should be compared in
its original form, then entire pages are transformed upon being read
into or written out of the main memory buffer cache.
These two alternatives provide the flexibility to trade space
and security for performance.
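<p>
The difference between the two granularities can be sketched with a general
purpose compressor. The example below uses Python's <tt>zlib</tt> as a
stand-in for an application-supplied routine and an invented set of small
records; it says nothing about Berkeley DB's own routines or on-disk format,
and only illustrates how item-level and page-level transformation differ in
the space they save.
<pre>
```python
import zlib

# Invented sample data: fifty small, similar records.
items = [b"channel=%03d;rating=PG" % n for n in range(50)]

# Item-level: each item is transformed on its own, so the in-memory
# representation stays in processed form and items are compared that way.
per_item = [zlib.compress(item) for item in items]

# Page-level: a whole page of items is transformed when it moves between
# the buffer cache and storage, so items in memory keep their original form.
page = b"\n".join(items)
per_page = zlib.compress(page)

item_bytes = sum(len(c) for c in per_item)
page_bytes = len(per_page)
```
</pre>
<p>
For short, repetitive records like these, the page-level form is much
smaller, because the compressor can exploit redundancy across items; the
item-level form keeps data protected in memory at the cost of that saving.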

<h3>4.2 In-memory logging &amp; transactions</h3>
One of the four key properties of transaction systems is durability.
This means that transaction systems are designed for permanent storage
(most commonly disk).  However, as mentioned above, embedded systems
do not necessarily contain any such storage.
Nevertheless, transactions can be useful in this environment to
preserve the semantic integrity of the underlying storage.
Berkeley DB optionally provides logging functionality and
transaction support regardless of whether the database and logs
are on disk or in memory.

<h3>4.3 Remote Logs</h3>
While we do not expect users to back up their television sets and
toasters, it is conceivable that a set-top box provided by a
cable carrier should, in fact, be backed up by that cable carrier.
The ability to store logs remotely can provide "information appliance"
functionality, and can also be used in conjunction with local logs
to enhance reliability.
Furthermore, remote logs provide for catastrophic recovery, e.g., loss
of the gizmo, destruction of the gizmo, etc.

<h3>4.4 Application References to Database Buffers</h3>

Typically, when data is returned to the user, it must be copied
from the data manager's buffer cache (or data page) into the
application's memory.
However, in an embedded environment, the robustness of the
total software package is of paramount importance, not the
isolation between the application and the data manager.
As a result, it is possible for the data manager to avoid
copies by giving applications direct references to data items
in a shared memory cache.
This is a significant performance optimization that can be
allowed when the application and data manager are tightly
integrated.

<h3>4.5 Recoverable database creation/deletion</h3>

In a conventional database management system, creating
database tables (relations) and indices are heavyweight operations
that are not recoverable.
This is not acceptable in a complex embedded environment where
instantaneous recovery and robust operation in the face of
all types of database operations are essential.
While Berkeley DB files can be removed using normal file system
utilities, we provide transaction protected utilities that
allow us to recover both database creation and deletion.

<h3>4.6 Adaptive concurrency control</h3>
The Berkeley DB package uses page-level locking by default.
This trades off fine grain concurrency control for simplicity
during recovery. (Finer grain concurrency control can be
obtained by reducing the page size in the database.)
However, when multiple threads/processes perform page-locking
in the presence of writing operations, there is the
potential for deadlock.
As some environments do not need or desire the overhead of
logging and transactions, it is important to provide the
ability for concurrent access without the potential for
deadlock.
<p>
Berkeley DB provides an option to perform coarser grain,
deadlock-free locking.
Rather than locking on pages, locking is performed at the
interface to the database.
Multiple readers or a single writer are allowed to be
active in the database at any instant in time, with
conflicting requests queued automatically.
The presence of cursors, through which applications can both
read and write data, complicates this design.
If a cursor is currently being used for reading, but will later
be used to write, the system will be deadlock prone if no
special precautions are taken.
To handle this situation, we require that, when a cursor is
created, the application specify any future intention to write.
If there is an intention to write, the cursor is granted an
intention-to-write lock which does not conflict with readers,
but does conflict with other intention-to-write locks and write
locks.
The end result is that the application is limited to a single
potentially writing cursor accessing the database at any point
in time.
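<p>
The lock modes described above can be summarized as a small conflict table.
The sketch below encodes exactly the conflicts stated in the text (readers
versus writers, and intention-to-write versus other intention-to-write and
write locks); the mode names and helper functions are hypothetical and are
not Berkeley DB identifiers.
<pre>
```python
READ, IWRITE, WRITE = "read", "iwrite", "write"

# Pairs of lock modes that cannot be held at the same time.
CONFLICTS = {
    frozenset([READ, WRITE]),
    frozenset([IWRITE]),           # iwrite vs. iwrite
    frozenset([IWRITE, WRITE]),
    frozenset([WRITE]),            # write vs. write
}


def conflicts(held, requested):
    return frozenset([held, requested]) in CONFLICTS


def can_grant(held_modes, requested):
    # A request is granted only if it conflicts with no lock already held;
    # otherwise the caller would queue it.
    return not any(conflicts(h, requested) for h in held_modes)
```
</pre>
<p>
Because at most one intention-to-write lock can exist, two cursors can never
each wait for the other to give up a read position before writing, which is
the deadlock the design rules out.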
<p>
Under periods of low contention (but potentially high throughput),
the normal page-level locking provides the best overall throughput.
However, as contention rises, so does the potential for deadlock.
At some cross-over point, switching to the less concurrent, but
deadlock-free locking protocol will result in higher throughput
as operations never need to be retried.
Given the operating conditions of an embedded database manager,
it is useful to make this change automatically as the system
itself detects high contention.
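<p>
One way such an automatic switch might be structured is sketched below. The
paper does not specify how contention is measured or where the cross-over
point lies, so the contention metric (fraction of lock requests that had to
wait), the two thresholds, and the class name are all assumptions made for
illustration.
<pre>
```python
class LockingPolicy:
    """Hypothetical switch between page-level and database-level locking."""

    def __init__(self, high=0.10, low=0.02):
        self.high = high    # assumed wait ratio that triggers coarse locking
        self.low = low      # assumed wait ratio at which page locking resumes
        self.mode = "page"

    def observe(self, requests, waits):
        # Fraction of lock requests in the last interval that had to wait.
        ratio = waits / requests if requests else 0.0
        if self.mode == "page" and ratio > self.high:
            self.mode = "database"
        elif self.mode == "database" and self.low > ratio:
            self.mode = "page"
        return self.mode
```
</pre>
<p>
Using two thresholds rather than one keeps the system from flipping back and
forth when contention hovers near the cross-over point.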

<h3>4.7 Adaptive synchronization</h3>

In addition to the logical locks that protect the integrity of the
database pages, Berkeley DB must synchronize access to shared memory
data structures, such as the lock table, in-memory buffer pool, and
in-memory log buffer.
Each independent module uses a single mutex to protect its shared
data structures, under the assumption that operations that require
the mutex are very short and the potential for conflict is
low.
Unfortunately, in highly concurrent environments with multiple processors
present, this assumption is not always true.
When this assumption becomes invalid (that is, we observe significant
contention for the subsystem mutexes), we can switch over to a fine-grain
concurrency model for the mutexes.
Once again, there is a performance trade-off.  Fine-grain mutexes
impose a penalty of approximately 25% (due to the increased number
of mutexes required for each operation), but allow for higher throughput
under contention.
Using fine-grain mutexes under low contention would cause a decrease
in performance, so it is important to monitor the system carefully,
so that the change can be executed only when it will increase system
throughput without jeopardizing latency.

<h2>5. Footprint of an Embedded System</h2>
While traditional systems compete on price-performance, the
embedded players will compete on price, features, and footprint.
The earlier sections have focused on features; in this section
we focus on footprint.
<p>
Oracle reports that Oracle Lite 3.0 requires 350 KB to 750 KB
of memory and approximately 2.5 MB of hard disk space <a href="#Oracle">[7]</a>.
This includes drivers for interfaces such as ODBC and JDBC.
In contrast, Berkeley DB ranges in size from 75 KB to under 200 KB,
forgoing heavyweight interfaces such as ODBC and JDBC and
providing a variety of deployed sizes that can be used depending
on application needs.  At the low end, applications requiring
a simple single-user access method can choose among extended
linear hashing, B+ trees, or record-number based retrieval and
pay only the 75 KB space requirement.
Applications requiring all three access methods will observe the
110 KB footprint.
At the high end, a fully recoverable, high-performance system
occupies less than a quarter megabyte of memory.
This is a system you can easily incorporate in your toaster oven.
Table 1 shows the per-module breakdown of the entire Berkeley DB
library.  Note that this does not include memory used to cache database
pages.

<table border>
<tr><th colspan=4>Table 1: Object sizes in bytes</th></tr>
<tr><th align=left>Subsystem</th><th align=center>Text</th><th align=center>Data</th><th align=center>Bss</th></tr>
<tr><td>Btree-specific routines</td><td align=right>28812</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td>Recno-specific routines</td><td align=right>7211</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td>Hash-specific routines</td><td align=right>23742</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td colspan=4></td></tr>
<tr><td>Memory Pool</td><td align=right>14535</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td>Access method common code</td><td align=right>23252</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td>OS compatibility library</td><td align=right>4980</td><td align=right>52</td><td align=right>0</td></tr>
<tr><td>Support utilities</td><td align=right>6165</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td colspan=4></td></tr>
<tr><th align=left>All modules for Btree access method only</th><td align=right>77744</td><td align=right>52</td><td align=right>0</td></tr>
<tr><th align=left>All modules for Recno access method only</th><td align=right>84955</td><td align=right>52</td><td align=right>0</td></tr>
<tr><th align=left>All modules for Hash access method only</th><td align=right>72674</td><td align=right>52</td><td align=right>0</td></tr>
<tr><td colspan=4></td></tr>
<tr><th align=left>All Access Methods</th><td align=right>108697</td><td align=right>52</td><td align=right>0</td></tr>
<tr><td colspan=4><br></td></tr>
<tr><td>Locking</td><td align=right>12533</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td colspan=4></td></tr>
<tr><td>Recovery</td><td align=right>26948</td><td align=right>8</td><td align=right>4</td></tr>
<tr><td>Logging</td><td align=right>37367</td><td align=right>0</td><td align=right>0</td></tr>
<tr><td colspan=4></td></tr>
<tr><th align=left>Full Package</th><td align=right>185545</td><td align=right>60</td><td align=right>4</td></tr>
</table>
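<p>
The subtotal rows of the table can be reproduced from the per-module text
sizes, which also shows how a designer might estimate the footprint of a
custom configuration. The short module keys below are labels invented for
this sketch. Note that the Recno subtotal only adds up if the Btree routines
are included as well, which suggests (the text does not say so explicitly)
that Recno is built on the Btree code.
<pre>
```python
# Text-segment sizes in bytes, copied from Table 1.
TEXT = {
    "btree": 28812, "recno": 7211, "hash": 23742,
    "mpool": 14535, "am_common": 23252, "os": 4980, "util": 6165,
    "lock": 12533, "recovery": 26948, "log": 37367,
}

# Modules that every access-method configuration in the table includes.
COMMON = ["mpool", "am_common", "os", "util"]


def footprint(*modules):
    # Sum each selected module once, plus the common modules.
    return sum(TEXT[m] for m in set(modules) | set(COMMON))


btree_only = footprint("btree")
hash_only = footprint("hash")
recno_only = footprint("recno", "btree")   # Recno row includes Btree code
all_methods = footprint("btree", "recno", "hash")
full = footprint("btree", "recno", "hash", "lock", "recovery", "log")
```
</pre>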

<h2>6. Related Work</h2>

Every three to five years, leading researchers in the database
community convene to identify future directions in database
research.
They produce a report of this meeting, named for the year and
location of the meeting.
The most recent of these reports, the 1998 Asilomar report,
identifies the embedded database market as one of the
high growth areas in database research <a href="#Bern98">[1]</a>.
Not surprisingly, market analysts identify the embedded database
market as a high-growth area in the commercial sector as well
<a href="#Host98">[5]</a>.
<p>
The Asilomar report identifies a new class of database applications, which they
term "gizmo" databases, small databases embedded in tiny mobile
appliances, e.g., smart-cards, telephones, personal digital assistants.
Such databases must be self-managing, secure and reliable.
Thus, the idea is that gizmo databases require plug and play data
management with no database administrator (DBA), no human-settable
parameters, and the ability to adapt to changing conditions.
More specifically, the Asilomar authors claim that the goal is
self-tuning, including defining the physical DB design, the
logical DB design, and automatic reports and utilities <a href="#Bern98">[1]</a>.
To date,
few researchers have accepted this challenge, and there is a dearth
of research literature on the subject.
<p>
Our approach to embedded database administration is fundamentally
different from that described by the Asilomar authors.
We adopt their terminology, but view the challenge in supporting
gizmo databases to be that of self-sustenance <em>after</em> initial
deployment.  Therefore, we find it not only acceptable but
desirable to assume that application developers control initial
database design and configuration.  To the best of our knowledge,
none of the published work in this area addresses this approach.
<p>
As the research community has not provided guidance in this
arena, most work in embedded database administration has fallen
to the commercial vendors.
These vendors fall into two camps: companies selling databases
specifically designed for embedding or programmatic access,
and the major database vendors (e.g., Oracle, Informix, Sybase).
<p>
The embedded vendors all acknowledge the need for automatic
administration, but fail to identify precisely how their
products actually accomplish this.
A notable exception is Interbase, whose white paper
comparison with Sybase and Microsoft's SQL servers
explicitly addresses features of maintenance ease.
Interbase claims that, as they use no log files, there is
no need for log reclamation, checkpoint tuning, or other
tasks associated with log management.  However, Interbase
uses Transaction Information Pages, and it is unclear
how these are reused or reclaimed <a href="#Interbase">[6]</a>.
Additionally, with a log-free system, they must use
a FORCE policy (write all pages to disk at commit),
as defined by Haerder and Reuter <a href="#Harder">[4]</a>.  This has
serious performance consequences for disk-based systems.
The approach described in this paper does use logs and
therefore requires log reclamation,
but provides hooks so the application may reclaim logs
safely and programmatically.
While Berkeley DB does require checkpoints, the goal of
tuning the checkpoint interval is to bound recovery time.
Since the checkpoint interval in Berkeley DB can be expressed
by the amount of log data written, it requires no tuning.
The application designer sets a target recovery time,
selects the amount of log data that can be read in that interval,
and specifies the checkpoint interval accordingly.  Even as
load changes, the time to recover does not.
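<p>
The arithmetic behind this choice is simple enough to sketch. The function
name and the figures below (a half-second recovery budget and a log read rate
of 4 MB per second) are assumptions chosen for illustration, and the model
deliberately ignores any fixed startup cost of recovery.
<pre>
```python
def checkpoint_interval_bytes(target_recovery_seconds, log_read_bytes_per_second):
    # Simplifying assumption: recovery reads roughly the log written since
    # the last checkpoint, so capping that amount caps the recovery time.
    return int(target_recovery_seconds * log_read_bytes_per_second)


# Assumed figures: 0.5 second recovery budget, 4 MB/s of log reading.
interval = checkpoint_interval_bytes(0.5, 4 * 1024 * 1024)
```
</pre>
<p>
With these figures a checkpoint would be taken after every 2 MB of log
written, whether that takes a second under heavy load or an hour when the
device is idle.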
<p>
The backup approaches taken by Interbase and Berkeley DB
are similar in that they both allow online backup, but
rather different in their effect on transactions running
during backup.  As Interbase performs backups as transactions
<a href="#Interbase">[6]</a>, concurrent queries can suffer potentially long
delays.  Berkeley DB uses native operating system utilities
and recovery for backups, so there is no interference with
concurrent activity, other than potential contention on disk
arms.
<p>
There are a number of database vendors selling in
the embedded market (e.g., Raima,
Centura, Pervasive, Faircom), but none highlight
the special requirements of embedded database
applications.
On the other end of the spectrum, the major vendors,
Oracle, Sybase, Microsoft, are all becoming convinced
of the importance of the embedded market.
As mentioned earlier, Oracle has announced its
Oracle Lite server for embedded use.
Sybase has announced its UltraLite platform, an "application-optimized,
high-performance, SQL database engine for professional
application developers building solutions for mobile and embedded platforms"
<a href="#Sybase">[8]</a>.
We believe that SQL is incompatible with the
gizmo database environment or truly embedded systems for which Berkeley
DB is most suitable.
Microsoft Research is taking a different approach, developing
technology to assist in automating initial database design and
index specification <a href="#Chaud98">[2]</a><a href="#Chaud982">[3]</a>.
As mentioned earlier, we believe that such configuration is not only
acceptable in the embedded market but desirable, so that applications
can tune their database management for the target environment.
<h2>7. Conclusions</h2>
The coming wave of embedded systems poses a new set of challenges
for data management.
The traditional server-based, big footprint systems designed for
high performance on big iron are not the right approach in this
environment.
Instead, application developers need small, fast, versatile systems
that can be tailored to a specific environment.
In this paper, we have identified several of the key issues in
providing these systems and shown how Berkeley DB provides
many of the characteristics necessary for such applications.

<h2>8. References</h2>
<p>
[1] <a name="Bern98"> Bernstein, P., Brodie, M., Ceri, S., DeWitt, D., Franklin, M.,
Garcia-Molina, H., Gray, J., Held, J., Hellerstein, J.,
Jagadish, H., Lesk, M., Maier, D., Naughton, J.,
Pirahesh, H., Stonebraker, M., Ullman, J.,
"The Asilomar Report on Database Research,"
SIGMOD Record 27(4): 74-80, 1998.
</a>
<p>
[2] <a name="Chaud98"> Chaudhuri, S., Narasayya, V.,
"AutoAdmin 'What-If' Index Analysis Utility,"
<em>Proceedings of the ACM SIGMOD Conference</em>, Seattle, 1998.
</a>
<p>
[3] <a name="Chaud982"> Chaudhuri, S., Narasayya, V.,
"An Efficient, Cost-Driven Index Selection Tool for Microsoft SQL Server,"
<em>Proceedings of the 23rd VLDB Conference</em>, Athens, Greece, 1997.
</a>
<p>
[4] <a name="Harder"> Haerder, T., Reuter, A.,
"Principles of Transaction-Oriented Database Recovery,"
<em>Computing Surveys 15</em>,4 (1983), 287-317.
</a>
<p>
[5] <a name="Host98"> Hostetler, M., "Cover Is Off A New Type of Database,"
Embedded DB News,
http://www.theadvisors.com/embeddeddbnews.htm,
5/6/98.
</a>
<p>
[6] <a name="Interbase"> Interbase, "A Comparison of Borland InterBase 4.0,
Sybase SQL Server and Microsoft SQL Server,"
http://web.interbase.com/products/doc_info_f.html.
</a>
<p>
[7] <a name="Oracle"> Oracle, "Oracle Delivers New Server, Application Suite
to Power the Web for Mission-Critical Business,"
http://www.oracle.com.sg/partners/news/newserver.htm,
May 1998.
</a>
<p>
[8] <a name="Sybase"> Sybase, Sybase UltraLite, http://www.sybase.com/products/ultralite/beta.
</a>
<p>
[9] <a name="TPCC"> Transaction Processing Performance Council, "TPC-C Benchmark Specification,
Version 3.4," San Jose, CA, August 1998.
</a>
<p>
[10] <a name="TPCD"> Transaction Processing Performance Council, "TPC-D Benchmark Specification,
Version 2.1," San Jose, CA, April 1999.
</a>
</body>
</html>