Scalable Transactions for Scalable Distributed Database Systems

With the advent of the Internet and Internet-connected devices, modern applications can experience very rapid growth in users from all parts of the world. A growing user base leads to greater usage and larger data sizes, so scalable database systems capable of handling these demands are critical for applications. With the emergence of cloud computing, a major movement in the industry, modern applications depend on distributed data stores for scalable data management. Many large-scale applications use NoSQL systems, such as distributed key-value stores, because they offer better scalability and availability than traditional relational database systems. By simplifying the design and interface, NoSQL systems can provide high scalability and performance for large data sets and high-volume workloads. However, to provide such benefits, NoSQL systems sacrifice the traditional consistency models and transaction support typically available in database systems. Without transaction semantics, it is harder for developers to reason about the correctness of their interactions with the data. Therefore, it is important to support transactions in distributed database systems without sacrificing scalability.

In this thesis, I present new techniques for scalable transactions in scalable database systems. Distributed data stores need scalable transactions to take advantage of cloud computing and to meet the demands of modern applications. Traditional transaction techniques may not be appropriate in a large, distributed environment, so I describe new techniques for distributed transactions that sacrifice neither traditional semantics nor scalability.

I discuss three facets of improving transaction scalability and support in distributed database systems. First, I describe a new transaction commit protocol that reduces the response times of distributed transactions. Second, I propose a new transaction programming model that allows developers to better deal with the unexpected behavior of distributed transactions. Lastly, I present a new scalable view maintenance algorithm for convergent join views. Together, the new techniques in this thesis contribute to providing scalable transactions for modern, distributed database systems.


[37]  Patrick Valduriez,et al.  Principles of Distributed Database Systems , 1990 .

[38]  Adam Silberstein,et al.  Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.

[39]  Frank Wm. Tompa,et al.  Efficiently updating materialized views , 1986, SIGMOD '86.

[40]  Leslie Lamport,et al.  Paxos Made Simple , 2001 .

[41]  Marcos K. Aguilera,et al.  Transactional storage for geo-replicated systems , 2011, SOSP.

[42]  Gustavo Alonso,et al.  Processing transactions over optimistic atomic broadcast protocols , 1999, Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003).

[43]  Amr El Abbadi,et al.  Posse: A Framework for Optimizing Incremental View Maintenance at Data Warehouse , 1999, DaWaK.

[44]  Prashant J. Shenoy,et al.  Resilient and coherence preserving dissemination of dynamic data using cooperating peers , 2004, IEEE Transactions on Knowledge and Data Engineering.

[45]  Eli Upfal,et al.  Performance prediction for concurrent database workloads , 2011, SIGMOD '11.

[46]  Michael I. Jordan,et al.  Characterizing, modeling, and generating workload spikes for stateful services , 2010, SoCC '10.

[47]  Marcos K. Aguilera,et al.  Transaction chains: achieving serializability with low latency in geo-distributed storage systems , 2013, SOSP.

[48]  Tim Kraska,et al.  MDCC: multi-data center consistency , 2012, EuroSys '13.

[49]  Leslie Lamport,et al.  Consensus on transaction commit , 2004, TODS.

[50]  Jorge-Arnulfo Quiané-Ruiz,et al.  Runtime measurements in the cloud , 2010, Proc. VLDB Endow..

[51]  Arie Segev,et al.  Currency-based updates to distributed materialized views , 1990, [1990] Proceedings. Sixth International Conference on Data Engineering.

[52]  Alexander Thomasian,et al.  Thrashing in two-phase locking revisited , 1992, [1992] Eighth International Conference on Data Engineering.

[53]  Elke A. Rundensteiner,et al.  A compensation-based approach for view maintenance in distributed environments , 2006, IEEE Transactions on Knowledge and Data Engineering.

[54]  Alexander Thomasian,et al.  Two-phase locking performance and its thrashing behavior , 1993, TODS.

[55]  Dan Dobre,et al.  HP: Hybrid Paxos for WANs , 2010, 2010 European Dependable Computing Conference.

[56]  Gerhard Weikum,et al.  Conflict-driven load control for the avoidance of data-contention thrashing , 1990, [1991] Proceedings. Seventh International Conference on Data Engineering.

[57]  Parag Agrawal,et al.  Asynchronous view maintenance for VLSD databases , 2009, SIGMOD Conference.

[58]  Hicham G. Elmongui,et al.  Lazy Maintenance of Materialized Views , 2007, VLDB.

[59]  Erich M. Nahum,et al.  A method for transparent admission control and request scheduling in e-commerce web sites , 2004, WWW '04.

[60]  David J. DeWitt,et al.  An Evaluation of Non-Equijoin Algorithms , 1991, VLDB.

[61]  Christopher Frost,et al.  Spanner: Google's Globally-Distributed Database , 2012, OSDI.

[62]  F. E. A Relational Model of Data Large Shared Data Banks , 2000 .

[63]  Hui Ding,et al.  TAO: Facebook's Distributed Data Store for the Social Graph , 2013, USENIX Annual Technical Conference.

[64]  Gustavo Alonso,et al.  Consistency Rationing in the Cloud: Pay only when it matters , 2009, Proc. VLDB Endow..

[65]  Jennifer Widom,et al.  Adaptive precision setting for cached approximate values , 2001, SIGMOD '01.

[66]  Jennifer Widom,et al.  On-line warehouse view maintenance , 1997, SIGMOD '97.

[67]  Marc Shapiro,et al.  Non-Monotonic Snapshot Isolation , 2013, ArXiv.

[68]  Steve Harrison,et al.  Boosting system performance with optimistic distributed protocols , 2001 .

[69]  Pat Helland,et al.  Building on Quicksand , 2009, CIDR.

[70]  Marvin Theimer,et al.  Managing update conflicts in Bayou, a weakly connected replicated storage system , 1995, SOSP.

[71]  Elke A. Rundensteiner,et al.  Parallel multisource view maintenance , 2003, The VLDB Journal.

[72]  Jennifer Widom,et al.  Deriving Production Rules for Incremental View Maintenance , 1991, VLDB.

[73]  Inderpal Singh Mumick,et al.  The Stanford Data Warehousing Project , 1995 .

[74]  Divyakant Agrawal,et al.  Serializability, not Serial: Concurrency Control and Availability in Multi-Datacenter Datastores , 2012, Proc. VLDB Endow..

[75]  Ion Stoica,et al.  Probabilistically Bounded Staleness for Practical Partial Quorums , 2012, Proc. VLDB Endow..

[76]  Randy H. Katz,et al.  A view of cloud computing , 2010, CACM.