Scalable network file systems with load balancing and fault tolerance for web services

Abstract: The rapid growth of the World Wide Web and the popularization of smartphones, tablets, and personal computers are quickly increasing the number of web service users. Large web services therefore need more disk space, and the required space grows with the number of users. It is thus important for large web service providers to design and implement a scalable, high-performance network file system. In this paper, we address three design issues for scalable network file systems. First, we use a variable number of objects within a bucket to reduce internal fragmentation for small files. Second, we propose a free-space and access load-balancing mechanism to balance the overall load on the bucket servers. Third, we propose a mechanism for caching frequently accessed data to reduce total disk I/O. Together, these mechanisms can effectively improve the performance of scalable network file systems for large web services.
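As a rough illustration of how free space and access load might be combined when placing a new file on a bucket server, consider the minimal sketch below. It is not the mechanism from the paper, whose details the abstract does not give. The `BucketServer` fields, the `pick_server` function, the `space_weight` parameter, and the linear scoring rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class BucketServer:
    """Hypothetical view of one bucket server's state (assumed fields)."""
    name: str
    free_bytes: int       # remaining disk space
    total_bytes: int      # disk capacity
    recent_requests: int  # requests seen in a recent window (assumed metric)


def pick_server(servers, file_size, space_weight=0.5):
    """Return the server with the best combined free-space/idleness score.

    The score is an assumed linear blend: space_weight * free-space ratio
    plus (1 - space_weight) * relative idleness. Servers that cannot hold
    the file are skipped.
    """
    candidates = [s for s in servers if s.free_bytes >= file_size]
    if not candidates:
        raise RuntimeError("no bucket server has enough free space")

    # Normalize access load against the busiest candidate (avoid divide by zero).
    max_requests = max(s.recent_requests for s in candidates) or 1

    def score(s):
        free_ratio = s.free_bytes / s.total_bytes
        idle_ratio = 1.0 - s.recent_requests / max_requests
        return space_weight * free_ratio + (1.0 - space_weight) * idle_ratio

    return max(candidates, key=score)


if __name__ == "__main__":
    servers = [
        BucketServer("a", free_bytes=900, total_bytes=1000, recent_requests=100),
        BucketServer("b", free_bytes=500, total_bytes=1000, recent_requests=10),
        BucketServer("c", free_bytes=10, total_bytes=1000, recent_requests=0),
    ]
    print(pick_server(servers, file_size=100).name)
```

With equal weighting, a lightly loaded server with moderate free space can be preferred over a nearly empty but busy one. Setting `space_weight=1.0` reduces the rule to pure free-space balancing.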
