G-Store: a scalable data store for transactional multi key access in the cloud

Cloud computing has emerged as a preferred platform for deploying scalable web applications. With the growing scale of these applications and the data associated with them, scalable data management systems form a crucial part of the cloud infrastructure. Key-value stores, such as Bigtable, PNUTS, Dynamo, and their open-source analogues, have been the preferred data stores for applications in the cloud. In these systems, data is represented as key-value pairs, and atomic access is provided only at the granularity of single keys. While these properties work well for current applications, they are insufficient for the next generation of web applications, such as online gaming, social networks, and collaborative editing, which emphasize collaboration. Since collaboration by definition requires consistent access to groups of keys, scalable and consistent multi-key access is critical for such applications. We propose the Key Group abstraction, which defines a relationship between a group of keys and is the granule for on-demand transactional access. This abstraction allows the Key Grouping protocol to collocate control for the keys in a group, enabling efficient access to the group. Using the Key Grouping protocol, we design and implement G-Store, which uses a key-value store as an underlying substrate to provide efficient, scalable, and transactional multi-key access. Our implementation on a standard key-value store and experiments on a cluster of commodity machines show that G-Store preserves the desired properties of key-value stores while providing multi-key access functionality at a very low overhead.
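The idea of collocating control for a group of keys can be illustrated with a small in-memory sketch. This is not G-Store's implementation or API: the names `ToyStore`, `create_group`, `transact`, and `delete_group` are hypothetical, the "nodes" are plain dictionaries in one process, and the sketch assumes that grouping simply moves each key to a single node chosen by the caller and moves it back when the group is deleted. It ignores failures, concurrency, and messaging, which a real grouping protocol has to handle.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical storage node holding the keys it currently controls."""
    name: str
    data: dict = field(default_factory=dict)


class ToyStore:
    """Toy key-value store with on-demand key groups (illustrative only)."""

    def __init__(self, node_names):
        self.nodes = {n: Node(n) for n in node_names}
        self.owner = {}   # key -> name of the node currently controlling it
        self.groups = {}  # group id -> (group node name, {key: home node name})

    def put(self, key, value, node_name):
        self.nodes[node_name].data[key] = value
        self.owner[key] = node_name

    def get(self, key):
        return self.nodes[self.owner[key]].data[key]

    def create_group(self, group_id, keys, group_node):
        # Assumption: a key may belong to at most one group at a time.
        grouped = {k for _, homes in self.groups.values() for k in homes}
        if grouped & set(keys):
            raise ValueError("key already belongs to a group")
        homes = {}
        for k in keys:
            home = self.owner[k]
            homes[k] = home
            if home != group_node:
                # Collocate control: move the key to the group's node.
                self.nodes[group_node].data[k] = self.nodes[home].data.pop(k)
                self.owner[k] = group_node
        self.groups[group_id] = (group_node, homes)

    def transact(self, group_id, update_fn):
        # All keys live on one node, so the update is computed on a copy
        # and applied in one step; an exception leaves the data untouched.
        node_name, homes = self.groups[group_id]
        node = self.nodes[node_name]
        snapshot = {k: node.data[k] for k in homes}
        new_values = update_fn(dict(snapshot))
        if set(new_values) - set(homes):
            raise KeyError("transaction touched a key outside the group")
        node.data.update(new_values)

    def delete_group(self, group_id):
        # Return control of every key to the node it came from.
        node_name, homes = self.groups.pop(group_id)
        for k, home in homes.items():
            if home != node_name:
                self.nodes[home].data[k] = self.nodes[node_name].data.pop(k)
                self.owner[k] = home


# Usage: two players' balances live on different nodes until grouped.
store = ToyStore(["n1", "n2"])
store.put("alice", 10, "n1")
store.put("bob", 5, "n2")
store.create_group("game42", ["alice", "bob"], "n1")


def transfer(values):
    values["alice"] -= 3
    values["bob"] += 3
    return values


store.transact("game42", transfer)
store.delete_group("game42")
```

In this sketch the multi-key update needs no cross-node coordination once the group exists, because every key in the group is controlled by one node for the group's lifetime; the cost is paid once at group creation and deletion.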