Safe and Efficient Persistent Heaps.

Abstract

Persistent data continue to exist after the programs that create or manipulate them terminate. Files and database records are familiar examples, but they only allow specific datatypes to be made persistent. Object-oriented databases and persistent programming languages are examples of systems that require arbitrary persistent datatypes. Persistent heaps allow arbitrary persistent datatypes to be dynamically allocated and stored and are essential components of such systems.

Data stored in persistent heaps are valuable, and must be protected from both machine failures and programmer errors. This safety requirement may conflict with the need to provide high throughput and low latency access to the data. This conflict may lead to sacrificing safety for performance.

My thesis is that it is possible to build persistent heaps so that safety does not need to be sacrificed for performance. This dissertation demonstrates the thesis in two parts: Part I presents the design of Sidney, a safe persistent heap, along with the details of its implementation, and Part II presents a performance evaluation that demonstrates that Sidney satisfies the claim of the thesis.

Sidney's design uses transactions and garbage collection to provide safe heap management of persistent data. Good performance is achieved by combining traditional systems techniques for transactions with a novel concurrent garbage collection technique, replicating collection.

Sidney's implementation is the first to provide concurrent collection of a transactional heap. Replicating collection allows a much simpler implementation than previous (unimplemented) designs based on other concurrent collection techniques.

The performance evaluation characterizes Sidney's performance and compares it to other approaches, including a persistent malloc-and-free implementation. It shows that replicating collection allows the use of garbage collection without sacrificing the throughput and latency characteristics of explicit deallocation. In fact, not only does Sidney provide better safety than persistent malloc-and-free, it also provides better performance.
