Cost Optimization for Dynamic Replication and Migration of Data in Cloud Data Centers

Cloud Storage Providers (CSPs) offer geographically distributed data stores that provide several storage classes at different prices. An important problem facing cloud users is how to exploit these storage classes to serve an application with a time-varying workload on its objects at minimum cost. This cost consists of residential cost (i.e., storage, Put, and Get costs) and potential migration cost (i.e., network cost). To address this problem, we first propose an optimal offline algorithm that leverages dynamic and linear programming techniques under the assumption that exact knowledge of the workload on objects is available. Because of the high time complexity of this algorithm and its requirement for a priori knowledge, we propose two online algorithms that trade off residential and migration costs and dynamically select storage classes across CSPs. The first online algorithm is deterministic, needs no knowledge of the future workload, and incurs no more than $2\gamma - 1$ times the minimum cost obtained by the optimal offline algorithm, where $\gamma$ is the ratio of the residential cost in the most expensive data store to that in the cheapest one, in either network or storage cost. The second online algorithm is randomized and leverages the "Receding Horizon Control" (RHC) technique, exploiting available future workload information for $w$ time slots. This algorithm incurs at most $1+\frac{\gamma}{w}$ times the optimal cost. The effectiveness of the proposed algorithms is demonstrated through simulations using a workload synthesized from the characteristics of the Facebook workload.
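The trade-off between residential and migration costs can be illustrated with a small dynamic program. The sketch below is a simplification and not the paper's algorithm: it assumes a single object with a single replica, ignores the linear programming component, and takes the per-slot residential costs and pairwise migration costs as given inputs. The function name `min_cost_schedule` and its parameters are hypothetical.

```python
from typing import List, Tuple


def min_cost_schedule(
    residential: List[List[float]],
    migration: List[List[float]],
) -> Tuple[float, List[int]]:
    """Choose one data store per time slot at minimum total cost.

    residential[t][d]: assumed cost (storage + Put + Get) of keeping the
                       object in data store d during slot t.
    migration[a][b]:   assumed network cost of moving the object from
                       data store a to data store b (0 when a == b).
    """
    num_slots = len(residential)
    num_stores = len(migration)

    # best[d] = cheapest cost of any schedule that ends in store d.
    best = list(residential[0])
    back: List[List[int]] = []  # back[t-1][d] = predecessor of d at slot t

    for t in range(1, num_slots):
        new_best = []
        choices = []
        for d in range(num_stores):
            # Cheapest previous store, counting the move into d.
            prev = min(range(num_stores), key=lambda p: best[p] + migration[p][d])
            new_best.append(best[prev] + migration[prev][d] + residential[t][d])
            choices.append(prev)
        best = new_best
        back.append(choices)

    # Walk the predecessor table backwards to recover the schedule.
    end = min(range(num_stores), key=lambda d: best[d])
    path = [end]
    for choices in reversed(back):
        path.append(choices[path[-1]])
    path.reverse()
    return best[end], path


if __name__ == "__main__":
    # Two stores, three slots: store 0 is cheap first, store 1 afterwards.
    residential = [[1, 5], [5, 1], [5, 1]]
    migration = [[0, 2], [2, 0]]
    print(min_cost_schedule(residential, migration))
```

In this toy input the object starts in store 0 and migrates once to store 1, because the one-off migration cost is smaller than the extra residential cost of staying. An offline formulation like this needs the full cost table in advance, which is the a priori knowledge that the online algorithms in the abstract avoid.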
