Adaptive Consistency Approaches for Cloud Computing Platform

Cloud computing platforms now provide convenient infrastructure on which applications can run data-intensive computations. As the number of users and the volume of data grow, distribution and concurrency are becoming the norm rather than the exception. Data is replicated across geographically distinct datacentres, and large numbers of user requests are processed concurrently within a single replica. Consequently, most recent proposals introduce new consistency models that seek to provide stronger guarantees while preserving performance, by making certain assumptions about the operations executed or the data accessed. We argue that this approach is often flawed, for three reasons. First, now that distribution and concurrency have become prevalent, it is unrealistic to expect that a unique system view could, or even should, exist at any given time; yet the literature still equates divergent views with inconsistency. Second, current distributed systems often rely on a definition of consistency that is too narrow: consistency is an application-centric property and should not be reduced to freshness. Third, there is a multiplicity of consistency definitions, each relying on subtly different assumptions. As a result, it is almost impossible to compare the guarantees provided by each system.
