CHARM: A Cost-Efficient Multi-Cloud Data Hosting Scheme with High Availability

Nowadays, more and more enterprises and organizations are hosting their data in the cloud in order to reduce IT maintenance cost and enhance data reliability. However, faced with numerous cloud vendors and their heterogeneous pricing policies, customers may well be perplexed about which cloud(s) are suitable for storing their data and which hosting strategy is cheaper. The general status quo is that customers usually put their data into a single cloud (which is subject to the vendor lock-in risk) and then simply trust to luck. Based on a comprehensive analysis of various state-of-the-art cloud vendors, this paper proposes a novel data hosting scheme (named CHARM) which integrates two key desired functions. The first is selecting several suitable clouds and an appropriate redundancy strategy to store data with minimized monetary cost and guaranteed availability. The second is triggering a transition process to re-distribute data according to variations in the data access pattern and the pricing of clouds. We evaluate the performance of CHARM using both trace-driven simulations and prototype experiments. The results show that, compared with the major existing schemes, CHARM not only saves around 20 percent of monetary cost but also exhibits sound adaptability to data and price adjustments.
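
To make the first function more concrete, the following is a minimal illustrative sketch and not the algorithm used by CHARM. It assumes that cloud failures are independent, that an (m, n) erasure code stores a fragment of size S/m on each of n clouds (replication being the case m = 1), and that reads are served by the m clouds with the cheapest outbound bandwidth. All cloud names, prices, and availability figures are hypothetical, as are the function names `availability`, `cost`, and `choose`.

```python
from itertools import combinations, product

# Hypothetical clouds: (name, storage $/GB, outbound bandwidth $/GB, availability).
CLOUDS = [
    ("A", 0.030, 0.12, 0.999),
    ("B", 0.025, 0.15, 0.995),
    ("C", 0.040, 0.08, 0.999),
    ("D", 0.020, 0.20, 0.990),
]


def availability(avails, m):
    """Probability that at least m clouds are up (assumes independent failures)."""
    total = 0.0
    for states in product([True, False], repeat=len(avails)):
        if sum(states) >= m:
            p = 1.0
            for up, a in zip(states, avails):
                p *= a if up else 1.0 - a
            total += p
    return total


def cost(clouds, m, size_gb, read_gb):
    """Storage plus read cost for an (m, n) layout over the given clouds."""
    # Every cloud holds one fragment of size size_gb / m.
    storage = sum(c[1] for c in clouds) * size_gb / m
    # A read fetches m fragments; assume the m cheapest-bandwidth clouds serve it.
    cheapest = sorted(c[2] for c in clouds)[:m]
    bandwidth = sum(cheapest) * read_gb / m
    return storage + bandwidth


def choose(clouds, size_gb, read_gb, min_avail):
    """Exhaustively pick the cheapest (subset, m) that meets the availability target."""
    best = None
    for n in range(1, len(clouds) + 1):
        for subset in combinations(clouds, n):
            for m in range(1, n + 1):
                if availability([c[3] for c in subset], m) < min_avail:
                    continue
                total = cost(subset, m, size_gb, read_gb)
                if best is None or total < best[0]:
                    best = (total, [c[0] for c in subset], m)
    return best


if __name__ == "__main__":
    # 100 GB stored, 500 GB read per billing period, at least 99.99% availability.
    print(choose(CLOUDS, size_gb=100, read_gb=500, min_avail=0.9999))
```

The exhaustive search is exponential in the number of clouds and is only meant to show the trade-off between storage price, bandwidth price, and availability. Re-running `choose` when `read_gb` or the price table changes gives a rough picture of when a re-distribution of data might pay off, although a real transition decision would also have to account for the migration cost itself.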
