Calvin: fast distributed transactions for partitioned database systems

Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high scalability, however, today's systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels (including Paxos-based strong consistency across geographically distant replicas) at no cost to transactional throughput.
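
The idea of replicating transaction inputs rather than effects can be illustrated with a minimal sketch. It is not Calvin's implementation: the `TxnInput` record, the `transfer` procedure, and the `Replica` class are hypothetical names introduced here, and the agreed global order is simply assumed to exist as a Python list (in a real system it would come from an agreement protocol such as Paxos). The point shown is only that replicas which execute the same deterministic logic over the same ordered inputs reach the same state without exchanging results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass(frozen=True)
class TxnInput:
    """A transaction request: a procedure name plus arguments (hypothetical format)."""
    proc: str
    args: tuple


def transfer(state: State, src: str, dst: str, amount: int) -> None:
    # Deterministic logic: the outcome depends only on the state and the arguments,
    # never on wall-clock time, randomness, or thread scheduling.
    if state.get(src, 0) >= amount:
        state[src] = state.get(src, 0) - amount
        state[dst] = state.get(dst, 0) + amount


# Registry of stored procedures that every replica is assumed to share.
PROCEDURES: Dict[str, Callable[..., None]] = {"transfer": transfer}


class Replica:
    def __init__(self, initial: State) -> None:
        self.state = dict(initial)  # each replica keeps its own copy

    def apply_log(self, log: List[TxnInput]) -> None:
        # Execute the inputs strictly in the agreed global order.
        for txn in log:
            PROCEDURES[txn.proc](self.state, *txn.args)


initial = {"alice": 100, "bob": 50}

# The ordered input log is the only thing that is replicated.
log = [
    TxnInput("transfer", ("alice", "bob", 30)),
    TxnInput("transfer", ("bob", "carol", 70)),
    TxnInput("transfer", ("alice", "carol", 200)),  # insufficient funds: a no-op everywhere
]

replica_a = Replica(initial)
replica_b = Replica(initial)
replica_a.apply_log(log)
replica_b.apply_log(log)

print(replica_a.state == replica_b.state)  # True
```

Because only the small input records cross the network, the replication cost in this sketch is independent of how much data each transaction touches. The sketch ignores partitioning, locking, and failure handling.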
