Ensuring Cloud Data Reliability with Minimum Replication by Proactive Replica Checking

Data reliability and storage cost are two primary concerns for current Cloud storage systems. To ensure data reliability, current Clouds widely use a multi-replica (typically three-replica) replication strategy, which consumes a large amount of extra storage and results in a high storage cost, particularly for data-intensive applications. To reduce Cloud storage consumption while still meeting the data reliability requirement, this paper presents a cost-effective data reliability management mechanism named PRCR, based on a generalized data reliability model. Using a proactive replica checking approach, PRCR ensures the reliability of massive Cloud data with the minimum replication while incurring negligible running overhead, so it can also serve as a cost-effectiveness benchmark for replication-based approaches. Our simulation indicates that, compared with the conventional three-replica strategy, PRCR can reduce Cloud storage space consumption by one-third to two-thirds, significantly lowering the storage cost in a Cloud.
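The one-third to two-thirds range can be read as simple arithmetic against the three-replica baseline. The sketch below is illustrative only and is not taken from the paper: the function name `storage_saving` is hypothetical, and it assumes every replica of a file occupies the same amount of space, so keeping two replicas or one replica instead of three saves one-third or two-thirds respectively.

```python
from fractions import Fraction


def storage_saving(replicas: int, baseline: int = 3) -> Fraction:
    """Fraction of storage saved by keeping `replicas` copies instead of `baseline`.

    Assumption (not from the paper): all replicas are the same size.
    """
    if not 1 <= replicas <= baseline:
        raise ValueError("replicas must be between 1 and baseline")
    return Fraction(baseline - replicas, baseline)


if __name__ == "__main__":
    # Compare each replica count with the conventional three-replica strategy.
    for n in (1, 2, 3):
        print(f"{n} replica(s): saves {storage_saving(n)} of baseline storage")
```

Under this assumption, the reported range corresponds to a mix of files stored with one or two replicas rather than three.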
