Ensuring Serializable Executions with Snapshot Isolation DBMS

Snapshot Isolation (SI) is a multiversion concurrency control that has been implemented by open source and commercial database systems such as PostgreSQL and Oracle. The main feature of SI is that a read operation does not block a write operation and vice versa, which allows higher degree of concurrency than traditional two-phase locking. SI prevents many anomalies that appear in other isolation levels, but it still can result in non-serializable execution, in which database integrity constraints can be violated. Several techniques have been proposed to ensure serializable execution with engines running SI; these techniques are based on modifying the applications by introducing conflicting SQL statements. However, with each of these techniques the DBA has to make a difficult choice among possible transactions to modify. This thesis helps the DBA’s to choose between these different techniques and choices by understanding how the choices affect system performance. It also proposes a novel technique called ’External Lock Manager’ (ELM) which introduces conflicts in a separate lock-manager object so that every execution will be serializable. We build a prototype system for ELM and we run experiments to demonstrate the robustness of the new technique compare to the previous techniques. Experiments show that modifying the application code for some transactions has a high impact on performance for some choices, which makes it very hard for DBA’s to choose wisely. However, ELM has peak performance which is similar to SI, no matter which transactions are chosen for modification. Thus we say that ELM is a robust technique for ensure serializable execution.

[1]  Song Chun Moon Performance of concurrency control methods in distributed database management systems (timestamp ordering, two-phase locking, optimistic scheme, restart, transaction blocking) , 1984 .

[2]  Y. C. Tay,et al.  Locking performance in centralized databases , 1985, TODS.

[3]  Gregory R. Ganger,et al.  Network Support for Network-Attached Storage , 1999 .

[4]  Sameh Elnikety,et al.  Performance Comparison of Middleware Architectures for Generating Dynamic Web Content , 2003, Middleware.

[5]  Christos H. Papadimitriou,et al.  The serializability of concurrent database updates , 1979, JACM.

[6]  Peter M. Stocker,et al.  VLDB'87, Proceedings of 13th International Conference on Very Large Data Bases, September 1-4, 1987, Brighton, England , 1987 .

[7]  Philip A. Bernstein,et al.  Categories and Subject Descriptors: H.2.4 [Database Management]: Systems. , 2022 .

[8]  Philip A. Bernstein,et al.  Concurrency Control in Distributed Database Systems , 1986, CSUR.

[9]  Michael Stonebraker,et al.  The Design of the POSTGRES Storage System , 1988, VLDB.

[10]  Mendel Rosenblum,et al.  The design and implementation of a log-structured file system , 1991, SOSP '91.

[11]  Stéphane Bressan,et al.  Introduction to Database Systems , 2005 .

[12]  Kenneth Salem,et al.  Lazy database replication with snapshot isolation , 2006, VLDB.

[13]  Alexander Thomasian,et al.  Analysis of database performance with dynamic locking , 1990, JACM.

[14]  Dennis Shasha,et al.  Making snapshot isolation serializable , 2005, TODS.

[15]  Vijay Kumar,et al.  Performance of multiversion concurrency control mechanism in partitioned and partially replicated databases , 1992, CSC '92.

[16]  Miron Livny,et al.  Concurrency control performance modeling: alternatives and implications , 1987, TODS.

[17]  Alfons Kemper,et al.  Databases and transaction processing: an application-oriented approach , 2002, SGMD.

[18]  Thanasis Hadzilacos Serialization graph algorithms for multiversion concurrency control , 1988, PODS '88.

[19]  Robert Gruber,et al.  Efficient optimistic concurrency control using loosely synchronized clocks , 1995, SIGMOD '95.

[20]  Carl S. Hartzman The Delay Due to Dynamic Two-Phase Locking , 1989, IEEE Trans. Software Eng..

[21]  Gomer Thomas,et al.  Performance results on multiversion timestamp concurrency control with predeclared writesets , 1987, PODS '87.

[22]  Gustavo Alonso,et al.  Ganymed: Scalable Replication for Transactional Web Applications , 2004, Middleware.

[23]  Yi Lin,et al.  Hybrid protocols using dynamic adjustment of serialization order for real-time concurrency control , 2004, Real-Time Systems.

[24]  Alexander Thomasian,et al.  Concurrency control: methods, performance, and analysis , 1998, CSUR.

[25]  Miron Livny,et al.  Conflict detection tradeoffs for replicated data , 1991, TODS.

[26]  Alexander Thomasian,et al.  Two-phase locking performance and its thrashing behavior , 1993, TODS.

[27]  Philip A. Bernstein,et al.  Timestamp-Based Algorithms for Concurrency Control in Distributed Database Systems , 1980, VLDB.

[28]  Alan Fekete,et al.  Allocating isolation levels to transactions , 2005, PODS '05.

[29]  Toshiharu Hasegawa,et al.  MULTIVERSION CONCURRENCY CONTROL SCHEME FOR A DISTRIBUTED DATABASE SYSTEM(Software Science and Engineering) , 1985 .

[30]  Patrick E. O'Neil,et al.  A read-only transaction anomaly under snapshot isolation , 2004, SGMD.

[31]  Ricardo Jiménez-Peris,et al.  Middleware based data replication providing snapshot isolation , 2005, SIGMOD '05.

[32]  Irving L. Traiger,et al.  Granularity of Locks and Degrees of Consistency in a Shared Data Base , 1998, IFIP Working Conference on Modelling in Data Base Management Systems.

[33]  Philip S. Yu,et al.  Efficient LRU-Based Buffering in a LAN Remote Caching Architecture , 1996, IEEE Trans. Parallel Distributed Syst..

[34]  S. Sudarshan,et al.  Automating the Detection of Snapshot Isolation Anomalies , 2007, VLDB.

[35]  Seog Park,et al.  Alternative correctness criteria for multiversion concurrency control and its applications in advanced database systems , 1998, Proceedings Ninth International Workshop on Database and Expert Systems Applications (Cat. No.98EX130).

[36]  Alexander Thomasian Performance limits of two-phase locking , 1991, [1991] Proceedings. Seventh International Conference on Data Engineering.

[37]  Jerry Nolte,et al.  Basic Timestamp, Multiple Version Timestamp, and Two-Phase Locking , 1983, VLDB.

[38]  Michael Kifer,et al.  Database Systems : An Application-Oriented Approach , 2005 .

[39]  Alexander Thomasian,et al.  Performance Analysis of Two-Phase Locking , 1991, IEEE Trans. Software Eng..

[40]  Michael Kifer,et al.  Database Systems: An Application Oriented Approach, Complete Version (2nd Edition) , 2005 .

[41]  Bettina Kemme,et al.  Postgres-R(SI): combining replica control with concurrency control based on snapshot isolation , 2005, 21st International Conference on Data Engineering (ICDE'05).

[42]  Jim Gray,et al.  A critique of ANSI SQL isolation levels , 1995, SIGMOD '95.

[43]  Michael J. Carey,et al.  The performance of multiversion concurrency control algorithms , 1986, TOCS.

[44]  Abraham Silberschatz,et al.  Database Systems Concepts , 1997 .

[45]  Shiyong Lu,et al.  Semantic conditions for correctness at different isolation levels , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[46]  Alan Fekete,et al.  Serializable Executions with Snapshot Isolation: Modifying Application Code or Mixing Isolation Levels? , 2008, DASFAA.

[47]  Patrick E. O'Neil,et al.  Generalized isolation level definitions , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[48]  Fernando Pedone,et al.  Database replication using generalized snapshot isolation , 2005, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05).

[49]  J. T. Robinson,et al.  On optimistic methods for concurrency control , 1979, TODS.

[50]  Yi Lin,et al.  Concurrency control in real-time databases by dynamic adjustment of serialization order , 1990, [1990] Proceedings 11th Real-Time Systems Symposium.