FICUS: a very large scale reliable distributed file system

The dissertation presents the issues addressed in the design of Ficus, a large-scale, wide-area distributed file system currently operational on a modest scale at UCLA. Key aspects of providing such a service include tolerance of partial operation in virtually all areas; support for large-scale, optimistic data replication; and a flexible, extensible modular design. Ficus incorporates a "stackable layers" modular architecture and full support for optimistic replication. Replication is provided by a pair of layers operating in concert above a traditional filing service. A "volume" abstraction and an on-the-fly volume "grafting" mechanism are used to manage the large-scale file name space. The replication service uses a family of novel algorithms to manage the propagation of changes to the filing environment. These algorithms are fully distributed, tolerate partial operation (including nontransitive communications), and display linear storage overhead and worst-case quadratic message complexity.
