An Analytical Framework and Its Applications for Studying Brick Storage Reliability

The reliability of a large-scale storage system is influenced by a complex set of inter-dependent factors. This paper presents a comprehensive and extensible analytical framework that offers quantitative answers to many design tradeoffs. We apply the framework to a number of important design strategies that a designer and/or administrator must face in reality, including topology-aware replica placement, proactive replication that uses small background network bandwidth and unused disk space to create additional copies. We also quantify the impact of slow (but potentially more accurate) failure detection and lazy replacement of failed disks. We use detailed simulation to verify and refine our analytical model. These results demonstrate the versatility of the framework and serve as a solid step towards more quantitative studies of fundamental system tradeoffs between reliability, performance, and cost in large-scale distributed storage systems.

[1]  Kishor S. Trivedi,et al.  Accelerating Mean Time to Failure Computations , 1996, Perform. Evaluation.

[2]  Robbert van Renesse,et al.  Chain Replication for Supporting High Throughput and Availability , 2004, OSDI.

[3]  Scott A. Brandt,et al.  Reliability mechanisms for very large storage systems , 2003, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies, 2003. (MSST 2003). Proceedings..

[4]  Jim Zelenka,et al.  A cost-effective, high-bandwidth storage architecture , 1998, ASPLOS VIII.

[5]  Miguel Castro,et al.  Farsite: federated, available, and reliable storage for an incompletely trusted environment , 2002, OPSR.

[6]  Roy Billinton,et al.  Reliability evaluation of engineering systems : concepts and techniques , 1992 .

[7]  Andreas Haeberlen,et al.  Proactive Replication for Data Durability , 2006, IPTPS.

[8]  Walter A. Burkhard,et al.  Disk array storage system reliability , 1993, FTCS-23 The Twenty-Third International Symposium on Fault-Tolerant Computing.

[9]  Chao Jin,et al.  RepStore: a self-managing and self-tuning storage backend with smart bricks , 2004, International Conference on Autonomic Computing, 2004. Proceedings..

[10]  Jim Gray,et al.  Empirical Measurements of Disk Failure Rates and Error Rates , 2007, ArXiv.

[11]  Roy Billinton,et al.  Reliability Evaluation of Engineering Systems , 1983 .

[12]  Suman Nath,et al.  Availability of multi-object operations , 2006 .

[13]  Arif Merchant,et al.  FAB: building distributed enterprise disk arrays from commodity components , 2004, ASPLOS XI.

[14]  Srinivasan Seshan,et al.  Subtleties in Tolerating Correlated Failures in Wide-area Storage Systems , 2006, NSDI.

[15]  Joseph Pasquale,et al.  Analysis of Long-Running Replicated Systems , 2006, Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications.

[16]  GhemawatSanjay,et al.  The Google file system , 2003 .

[17]  Wei Chen,et al.  On the Impact of Replica Placement to the Reliability of Distributed Brick Storage Systems , 2005, 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05).

[18]  Roger Wattenhofer,et al.  Optimizing file availability in a secure serverless distributed file system , 2001, Proceedings 20th IEEE Symposium on Reliable Distributed Systems.

[19]  Chao Jin,et al.  RepStore: a self-managing and self-tuning storage backend with smart bricks , 2004 .

[20]  Andreas Haeberlen,et al.  Efficient Replica Maintenance for Distributed Storage Systems , 2006, NSDI.

[21]  Efficient Reliable Internet Storage ∗ , 2004 .

[22]  Roger Wattenhofer,et al.  Competitive Hill-Climbing Strategies for Replica Placement in a Distributed File System , 2001, DISC.