Extracting More Concurrency from Distributed Transactions

Distributed storage systems run transactions across machines to ensure serializability. Traditional protocols for distributed transactions are based on two-phase locking (2PL) or optimistic concurrency control (OCC). 2PL serializes transactions as soon as they conflict and OCC resorts to aborts, leaving many opportunities for concurrency on the table. This paper presents ROCOCO, a novel concurrency control protocol for distributed transactions that outperforms 2PL and OCC by allowing more concurrency. ROCOCO executes a transaction as a collection of atomic pieces, each of which commonly involves only a single server. Servers first track dependencies between concurrent transactions without actually executing them. At commit time, a transaction's dependency information is sent to all servers so they can re-order conflicting pieces and execute them in a serializable order. We compare ROCOCO to OCC and 2PL using a scaled TPC-C benchmark. ROCOCO outperforms 2PL and OCC in workloads with varying degrees of contention. When the contention is high, ROCOCO's throughput is 130% and 347% higher than that of 2PL and OCC.

[1]  Patrick Valduriez,et al.  Simple rational guidance for chopping up transactions , 1992, SIGMOD '92.

[2]  Donovan A. Schneider,et al.  The Gamma Database Machine Project , 1990, IEEE Trans. Knowl. Data Eng..

[3]  Pramod Bhatotia,et al.  Large-scale Incremental Data Processing with Change Propagation , 2011, HotCloud.

[4]  Philip A. Bernstein,et al.  Concurrency Control in Distributed Database Systems , 1986, CSUR.

[5]  Michael Stonebraker,et al.  The End of an Architectural Era (It's Time for a Complete Rewrite) , 2007, VLDB.

[6]  Leslie Lamport,et al.  Paxos Made Simple , 2001 .

[7]  Robert Gruber,et al.  Efficient optimistic concurrency control using loosely synchronized clocks , 1995, SIGMOD '95.

[8]  Michael Stonebraker,et al.  H-store: a high-performance, distributed main memory transaction processing system , 2008, Proc. VLDB Endow..

[9]  Marcos K. Aguilera,et al.  Sinfonia: a new paradigm for building scalable distributed systems , 2007, SOSP.

[10]  Hans-Arno Jacobsen,et al.  PNUTS: Yahoo!'s hosted data serving platform , 2008, Proc. VLDB Endow..

[11]  Hector Garcia-Molina,et al.  Overview of multidatabase transaction management , 1992, The VLDB Journal.

[12]  Christopher Frost,et al.  Spanner: Google's Globally-Distributed Database , 2012, OSDI.

[13]  Hector Garcia-Molina,et al.  Using semantic knowledge for transaction processing in a distributed database , 1983, TODS.

[14]  Peng Li,et al.  Paxos Replicated State Machines as the Basis of a High-Performance Data Store , 2011, NSDI.

[15]  Michael J. Freedman,et al.  Don't settle for eventual: scalable causal consistency for wide-area storage with COPS , 2011, SOSP.

[16]  Marcos K. Aguilera,et al.  Transaction chains: achieving serializability with low latency in geo-distributed storage systems , 2013, SOSP.

[17]  Tim Kraska,et al.  MDCC: multi-data center consistency , 2012, EuroSys '13.

[18]  Michael Stonebraker,et al.  OLTP through the looking glass, and what we found there , 2008, SIGMOD Conference.

[19]  Leslie Lamport,et al.  Consensus on transaction commit , 2004, TODS.

[20]  Patrick Valduriez,et al.  Prototyping Bubba, A Highly Parallel Database System , 1990, IEEE Trans. Knowl. Data Eng..

[21]  Arthur J. Bernstein,et al.  Concurrency control for step-decomposed transactions , 1999, Inf. Syst..

[22]  Prashant Malik,et al.  Cassandra: a decentralized structured storage system , 2010, OPSR.

[23]  Emin Gün Sirer,et al.  HyperDex: a distributed, searchable key-value store , 2012, SIGCOMM '12.

[24]  Cheng Li,et al.  Making geo-replicated systems fast as possible, consistent when necessary , 2012, OSDI 2012.

[25]  Mahadev Konar,et al.  ZooKeeper: Wait-free Coordination for Internet-scale Systems , 2010, USENIX ATC.

[26]  Robert E. Tarjan,et al.  Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..

[27]  J. T. Robinson,et al.  On optimistic methods for concurrency control , 1979, TODS.

[28]  Arthur T. Whitney,et al.  High volume transaction processing without currency control, two phase commit, SQLor C++ , 1997 .

[29]  David G. Andersen,et al.  There is more consensus in Egalitarian parliaments , 2013, SOSP.

[30]  Maurice Herlihy,et al.  Linearizability: a correctness condition for concurrent objects , 1990, TOPL.

[31]  Divyakant Agrawal,et al.  Low-Latency Multi-Datacenter Databases using Replicated Commit , 2013, Proc. VLDB Endow..

[32]  Miron Livny,et al.  Concurrency control performance modeling: alternatives and implications , 1987, TODS.

[33]  Leslie Lamport,et al.  Generalized Consensus and Paxos , 2005 .

[34]  Daniel J. Abadi,et al.  Calvin: fast distributed transactions for partitioned database systems , 2012, SIGMOD Conference.

[35]  Richard J. Lipton,et al.  A Massive Memory Machine , 1984, IEEE Transactions on Computers.

[36]  Emin Gün Sirer,et al.  Warp: Lightweight Multi-Key Transactions for Key-Value Stores , 2015, ArXiv.

[37]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[38]  Daniel J. Abadi,et al.  Low overhead concurrency control for partitioned main memory databases , 2010, SIGMOD Conference.

[39]  Eddie Kohler,et al.  Speedy transactions in multicore in-memory databases , 2013, SOSP.

[40]  Jeffrey F. Naughton,et al.  Multiprocessor Main Memory Transaction Processing , 1988, Proceedings [1988] International Symposium on Databases in Parallel and Distributed Systems.

[41]  Barbara Liskov,et al.  Granola: Low-Overhead Distributed Transaction Coordination , 2012, USENIX Annual Technical Conference.

[42]  Yawei Li,et al.  Megastore: Providing Scalable, Highly Available Storage for Interactive Services , 2011, CIDR.

[43]  Bruce G. Lindsay,et al.  Transaction management in the R* distributed database management system , 1986, TODS.

[44]  Michael J. Freedman,et al.  Stronger Semantics for Low-Latency Geo-Replicated Storage , 2013, NSDI.

[45]  Patrick Valduriez,et al.  Transaction chopping: algorithms and performance studies , 1995, TODS.

[46]  Hector Garcia-Molina,et al.  Main Memory Database Systems: An Overview , 1992, IEEE Trans. Knowl. Data Eng..

[47]  Marcos K. Aguilera,et al.  Transactional storage for geo-replicated systems , 2011, SOSP.

[48]  Umeshwar Dayal,et al.  Proceedings of the 1987 ACM SIGMOD international conference on Management of data , 1987 .

[49]  Daniel J. Rosenkrantz,et al.  System level concurrency control for distributed database systems , 1978, TODS.