The design and implementation of a log-structured file system

This paper presents a new technique for disk storage management called a log-structured file system. A log-structured file system writes all modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains indexing information so that files can be read back from the log efficiently. In order to maintain large free areas on disk for fast writing, we divide the log into segments and use a segment cleaner to compress the live information from heavily fragmented segments. We present a series of simulations that demonstrate the efficiency of a simple cleaning policy based on cost and benefit. We have implemented a prototype log-structured file system called Sprite LFS; it outperforms current Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Even when the overhead for cleaning is included, Sprite LFS can use 70% of the disk bandwidth for writing, whereas Unix file systems typically can use only 5–10%.
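
The abstract names a cleaning policy based on cost and benefit without stating it. As a rough illustration only, the sketch below assumes the form given in the body of the paper: a segment with live-data utilization u and age a is rated by (1 - u) * a / (1 + u), and the cleaner picks the highest-rated segments. It also assumes the paper's write-cost estimate of 2 / (1 - u) for cleaning segments at utilization u. The names `Segment`, `benefit_to_cost`, `pick_segments_to_clean`, and `write_cost` are hypothetical and are not taken from Sprite LFS.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seg_id: int
    utilization: float  # fraction of the segment still holding live data, 0.0 to 1.0
    age: float          # time since the youngest live block was modified (arbitrary units)


def benefit_to_cost(seg: Segment) -> float:
    """Assumed rating: free space reclaimed times age, over read-plus-rewrite cost."""
    u = seg.utilization
    # Benefit: (1 - u) of the segment is freed, weighted by how stable the data is.
    # Cost: read the whole segment (1) and write back the live part (u).
    return (1.0 - u) * seg.age / (1.0 + u)


def pick_segments_to_clean(segments: list[Segment], count: int) -> list[Segment]:
    """Return the `count` segments with the highest benefit-to-cost rating."""
    return sorted(segments, key=benefit_to_cost, reverse=True)[:count]


def write_cost(u: float) -> float:
    """Assumed write cost: total bytes moved per byte of new data written."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if u == 0.0:
        return 1.0  # an empty segment need not be read before reuse
    return 2.0 / (1.0 - u)


if __name__ == "__main__":
    candidates = [
        Segment(seg_id=0, utilization=0.9, age=100.0),  # cold, nearly full
        Segment(seg_id=1, utilization=0.5, age=10.0),   # warm, half full
        Segment(seg_id=2, utilization=0.2, age=1.0),    # hot, mostly empty
    ]
    for seg in pick_segments_to_clean(candidates, 2):
        print(seg.seg_id, round(benefit_to_cost(seg), 3))
```

Under these assumptions, an old segment can be worth cleaning at a higher utilization than a recently written one, because the space it frees is likely to stay free for longer.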
