TDBM: A DBM Library with Atomic Transactions

The dbm database library [1] introduced disk-based extensible hashing to UNIX. The library consists of functions to use a simple database consisting of key/value pairs. A number of work-alikes have been developed, offering additional features [5] and free source code [14,25]. Recently, a new package was developed that also offers improved performance [19]. None of these implementations, however, provide fault-tolerant behaviour. In many applications, a single high-level operation may cause many database items to be updated, created, or deleted. If the application crashes while processing the operation, the database could be left in an inconsistent state. Current versions of dbm do not handle this problem. Existing dbm implementations do not support concurrent access, even though the use of lightweight processes in a UNIX environment is growing. To address these deficiencies, tdbm was developed. Tdbm is a transaction processing database with a dbm-like interface. It provides nested atomic transactions, volatile and persistent databases, and support for very large objects and distributed operation. This paper describes the design and implementation of tdbm and examines its performance.

In the UNIX environment, the dbm database library [1] has become widely used to provide disk-based extensible hashing for a variety of applications. The library consists of functions to use a simple database consisting of items (key/value pairs). A number of work-alikes have been developed, offering additional features [5] and free source code [14,25]. Recently, a new package was developed that also offers improved performance [19] and there are plans to add a transaction mechanism to this package [20].

As an integral part of our distributed system research, an efficient and reliable database was required. In these and many other applications, a single high-level operation may result in several objects being updated, created, or deleted. If the application or host system crashes while processing the operation, the database must not be left in an inconsistent state.

Many distributed applications have a server component that can handle many client requests simultaneously. For example, in the case of the X.500 Directory Service [4], a server called the Directory System Agent is most naturally implemented as a multi-threaded application, with one or more threads servicing each client request. To maximize the level of concurrency, the database should permit simultaneous read-only and update operations.

Current versions of dbm, however, do not meet the requirements of these types of applications. Most importantly, they do not guarantee consistency in the face of crashes. Existing dbm implementations cannot be used in a multi-threaded application, even though the use of lightweight processes in a UNIX environment is growing. Also, no assistance for implementing distributed and replicated databases is given.

To meet these requirements, tdbm (dbm with transactions) was developed. Tdbm provides nested atomic transactions [13], volatile and persistent databases, and support for very large data; it stores the database within a single UNIX file and provides assistance for managing distributed databases. Tdbm can be configured to operate either as a conventional UNIX library or as part of a multi-threaded application. The EAN object store [17], used by the EAN X.500 directory service [16], is based on tdbm.

In the next section, the major design decisions associated with tdbm are examined. In Section 3, we look at the implementation of tdbm and in Section 4 an evaluation of the performance of tdbm is given. Finally, the paper concludes with some thoughts about our experiences with tdbm and possible extensions and improvements. The manual page for the library appears in the appendix.

... by a grant from OSIware, Inc.
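To make the notion of nested atomic transactions over key/value items concrete, the following is a minimal in-memory sketch. It is not tdbm's interface: the `Transaction` class and its `begin`, `get`, `put`, `delete`, `commit`, and `abort` methods are hypothetical names chosen for illustration, and the sketch assumes a plain Python dictionary as the committed database. It shows only the visibility rules (a child commits into its parent, and nothing reaches the database until the top-level commit); crash recovery, persistence, and concurrency control are deliberately left out.

```python
_DELETED = object()  # sentinel marking a key deleted inside a transaction


class Transaction:
    """Hypothetical nested transaction over a dict (illustration only)."""

    def __init__(self, data, parent=None):
        self.data = data      # committed database, shared by all transactions
        self.parent = parent  # enclosing transaction, or None at top level
        self.writes = {}      # pending updates: key -> value or _DELETED

    def begin(self):
        """Start a child transaction nested inside this one."""
        return Transaction(self.data, parent=self)

    def get(self, key):
        """Read through this transaction, its ancestors, then the database."""
        txn = self
        while txn is not None:
            if key in txn.writes:
                value = txn.writes[key]
                return None if value is _DELETED else value
            txn = txn.parent
        return self.data.get(key)

    def put(self, key, value):
        self.writes[key] = value

    def delete(self, key):
        self.writes[key] = _DELETED

    def commit(self):
        """Child: hand updates to the parent. Top level: apply to the database."""
        if self.parent is not None:
            self.parent.writes.update(self.writes)
        else:
            for key, value in self.writes.items():
                if value is _DELETED:
                    self.data.pop(key, None)
                else:
                    self.data[key] = value
        self.writes = {}

    def abort(self):
        """Discard this transaction's updates; ancestors are unaffected."""
        self.writes = {}


# One high-level operation that updates two items.
db = {"alice": 100, "bob": 50}
top = Transaction(db)
top.put("alice", 70)

attempt = top.begin()
attempt.put("bob", 80)
attempt.abort()                  # the child's update is discarded
assert top.get("bob") == 50      # the parent's pending work survives
assert top.get("alice") == 70

retry = top.begin()
retry.put("bob", 80)
retry.commit()                   # merged into the parent only
assert db == {"alice": 100, "bob": 50}

top.commit()                     # both updates become visible together
assert db == {"alice": 70, "bob": 80}
```

Buffering updates per transaction is one simple way to obtain all-or-nothing behaviour in memory; a disk-based library would additionally need a recovery mechanism so that a crash during the top-level commit cannot leave a partial update behind.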

[1]  G. Neufeld,et al.  The UBC OSI Distributed Application Programming Environment , 1991 .

[2]  David B. Lomet,et al.  Bounded index exponential hashing , 1983, TODS.

[3]  E. B. Moss,et al.  Nested Transactions: An Approach to Reliable Distributed Computing , 1985 .

[4]  Guy M. Lohman,et al.  Differential files: their application to the maintenance of large databases , 1976, TODS.

[5]  Betty Salzberg,et al.  File Structures: An Analytic Approach , 1988 .

[6]  Andreas Reuter,et al.  Principles of transaction-oriented database recovery , 1983, CSUR.

[7]  Per-Åke Larson,et al.  Linear Hashing with Partial Expansions , 1980, VLDB.

[8]  Peter D. Smith,et al.  Files & databases: an introduction , 1986 .

[9]  J. Mitchell,et al.  Issues in the design and use of a distributed file system , 1980, OPSR.

[10]  M. F.,et al.  Bibliography , 1985, Experimental Gerontology.

[11]  Witold Litwin,et al.  Linear Hashing: A new Algorithm for Files and Tables Addressing , 1980, ICOD.

[12]  Per-Åke Larson,et al.  Linear hashing with separators—a dynamic hashing scheme achieving one-access , 1988, TODS.

[13]  Margo I. Seltzer,et al.  A New Hashing Package for UNIX , 1991, USENIX Winter.

[14]  Per-Åke Larson,et al.  Dynamic hashing , 1978, BIT.

[15]  Gerald W. Neufeld,et al.  A transactional API for the EAN X.500 directory service , 1992, CASCON.

[16]  Margo I. Seltzer,et al.  LIBTP: Portable, Modular Transactions for UNIX , 1992 .

[17]  Ronald Fagin,et al.  Extendible hashing—a fast access method for dynamic files , 1979, ACM Trans. Database Syst..