Quota enforcement for high-performance distributed storage systems

Storage systems manage quotas to ensure that no one user can use more than their share of storage, and that each user gets the storage they need. This is difficult for large, distributed systems, especially those used for high- performance computing applications, because resource allocation occurs on many nodes concurrently. While quota management is an important problem, no robust scalable solutions have been proposed to date. We present a solution that has less than 0.2% performance overhead while the system is below saturation, compared with not enforcing quota at all. It provides byte-level accuracy at all times, in the absence of failures and cheating. If nodes fail or cheat, we recover within a bounded period. In our scheme quota is enforced asynchronously by intelligent storage servers: storage clients contact a shared management service to obtain vouchers, which the clients can spend like cash at participating storage servers to allocate storage space. Like a digital cash system, the system periodically reconciles voucher usage to ensure that clients do not cheat by spending the same voucher at multiple storage servers. We report on a simulation study that validates this approach and evaluates its performance.

[1]  John T. Kohl,et al.  The Kerberos Network Authentication Service (V5 , 2004 .

[2]  Carlos Maltzahn,et al.  Ceph: a scalable, high-performance distributed file system , 2006, OSDI '06.

[3]  Michael Walfish,et al.  Distributed Quota Enforcement for Spam Control , 2006, NSDI.

[4]  Frank B. Schmuck,et al.  GPFS: A Shared-Disk File System for Large Computing Clusters , 2002, FAST.

[5]  Richard A. Golding,et al.  Walking toward moving goalposts: agile management for evolving systems , 2006 .

[6]  Darrell D. E. Long,et al.  Swift: Using Distributed Disk Striping to Provide High I/O Data Rates , 1991, Comput. Syst..

[7]  G. Hardin,et al.  The Tragedy of the Commons , 1968, Green Planet Blues.

[8]  Robert M. Rees,et al.  IBM Storage Tank - A heterogeneous scalable SAN file system , 2003, IBM Syst. J..

[9]  David Robinson,et al.  NFS version 4 Protocol , 2000, RFC.

[10]  Thomas E. Anderson,et al.  A Comparison of File System Workloads , 2000, USENIX Annual Technical Conference, General Track.

[11]  Tao Yang,et al.  The Panasas ActiveScale Storage Cluster - Delivering Scalable High Bandwidth Storage , 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[12]  Dan S. Wallach,et al.  Enforcing Fair Sharing of Peer-to-Peer Resources , 2003, IPTPS.

[13]  Bo Hong,et al.  File System Workload Analysis For Large Scientific Computing Applications , 2004, MSST.

[14]  J. Howard Et El,et al.  Scale and performance in a distributed file system , 1988 .

[15]  Tatsuaki Okamoto,et al.  Universal Electronic Cash , 1991, CRYPTO.

[16]  Amin Vahdat,et al.  SHARP: an architecture for secure resource peering , 2003, SOSP '03.

[17]  Garth A. Gibson,et al.  Filesystems for Network-Attached Secure Disks, , 1997 .

[18]  Brian D. Noble,et al.  Samsara: honor among thieves in peer-to-peer storage , 2003, SOSP '03.

[19]  Michael K. Reiter,et al.  Fault-scalable Byzantine fault-tolerant services , 2005, SOSP '05.

[20]  G. Hardin The Tragedy of the Commons , 2009 .

[21]  Antony I. T. Rowstron,et al.  PAST: a large-scale, persistent peer-to-peer storage utility , 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.

[22]  Robbert van Renesse,et al.  Amoeba A Distributed Operating System for the 1990 s Sape , 1990 .