Consistency in the Cloud: When Money Does Matter!

With the emergence of cloud computing, many organizations have moved their data to the cloud in order to provide scalable, reliable and highly available services. To meet ever-growing user needs, these services rely mainly on geographically distributed data replication to guarantee good performance and high availability. Replication, however, raises the question of consistency. Service providers in the cloud are free to select the level of consistency according to the access patterns exhibited by their applications. Most optimization efforts therefore concentrate on providing adequate trade-offs between consistency guarantees and performance. However, since the monetary cost is borne entirely by the service providers, we argue in this paper that it should also be taken into consideration when evaluating or selecting a consistency level in the cloud. Accordingly, we define a new metric called consistency-cost efficiency. Based on this metric, we present a simple yet efficient economical consistency model, called Bismar, that adaptively tunes the consistency level at runtime in order to reduce the monetary cost while maintaining a low fraction of stale reads. Experimental evaluations with the Cassandra cloud storage system on the Grid'5000 testbed show the validity of the metric and demonstrate the effectiveness of the proposed consistency model.
