ZettaDS: A Light-weight Distributed Storage System for Cluster

We have designed and implemented the Zetta data storage system (ZettaDS), a light-weight, scalable distributed data storage system for clusters. ZettaDS shares many characteristics with modern distributed data storage systems, such as a single meta-server architecture and the use of inexpensive commodity components, but it is very light-weight and aims to handle large numbers of small files efficiently. Our design emphasizes scalability of storage capacity and manageability; throughput and performance are considered secondary. Furthermore, because ZettaDS runs on non-dedicated systems, it is designed to minimize resource consumption. This paper describes the design and implementation in detail, along with the rationale behind them. We also evaluate the system experimentally. The results demonstrate that ZettaDS uses storage space more efficiently and achieves better transfer performance when handling a large number of small files.
