An efficient scheme to ensure data availability for a cloud service provider

With the emergence of information technologies, an overwhelming amount of data is generated every day. Storing and processing this volume of data is commonly referred to as big data management. Cloud storage systems improve the reliability and availability of data by introducing redundancy, i.e., data replication, which protects the data against the node failures that occur frequently in any large-scale storage system. However, determining the appropriate level of redundancy, i.e., the number of data replicas, is not a trivial task for a cloud service provider (CSP). Traditional methods, which use a fixed number of replicas for all users regardless of each user's budget, are inefficient in terms of the CSP's financial benefit. This paper presents an efficient replication scheme that allows a CSP to determine the optimal number of replicas for each user, subject to the user's budgetary constraint and the CSP's resource capacity, while maximizing the CSP's financial benefit. Numerical simulations were performed to assess the validity of our approach. The results indicate that the proposed scheme is scalable and can be applied to real systems with an arbitrary number of users.
