Revisiting epsilon serializability to improve the database state machine

Recently, a large body of research has exploited group-communication-based techniques to improve the dependability and performance of synchronously replicated database systems [8, 7, 13, 9]. Database replication based on group communication promises to overcome the scalability and performance problems of traditional strong-consistency protocols by reducing the interactions among replicas and eliminating deadlocks. Protocols such as those presented in [8, 7, 13, 9], and in particular the Database State Machine (DBSM), allow a transaction to be executed at any site and postpone the interaction among distributed concurrent transactions until commit time, which amounts to an optimistic execution. Upon receiving the commit request, they propagate the relevant information about the transaction to all replicas. If conflicts arise among concurrent transactions, the order in which the transactions were delivered determines which of them commit and which abort. Transaction propagation relies on an atomic multicast primitive [6], which guarantees that the sequence of transactions is the same at all non-faulty replicas. Unfortunately, the optimistic execution of transactions, combined with the strictness of the serializability consistency criterion [2] adopted in the DBSM, may lead to a considerable number of aborts.

In this paper, we investigate how to relax the consistency criterion of the DBSM in a controlled manner according to the concepts of Epsilon Serializability (ESR) [16], and we evaluate the direct performance benefits. ESR rests on the assumption that some transactions can tolerate a certain degree of imprecision in exchange for better overall performance. It allows controlled inconsistencies through a framework that can be applied, in part, regardless of the application semantics. For instance, a transaction that retrieves a warehouse's total sales may accept a value that does not reflect the last millisecond but is accurate as of the last couple of seconds.
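To make the idea concrete, the following is a minimal sketch of delivery-order certification with an epsilon bound. It is an illustration under stated assumptions, not the protocol evaluated in this paper: the names `Txn`, `Certifier`, `start_seq`, `write_deltas`, and `epsilon` are hypothetical, imprecision is assumed to be the sum of absolute value changes on the items a transaction read, and only read-only transactions are assumed to be eligible for relaxation.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    """Hypothetical record propagated to all replicas at commit time."""
    tid: str
    start_seq: int                      # number of update commits seen when the txn started
    read_set: frozenset = frozenset()
    write_deltas: dict = field(default_factory=dict)  # item -> change in value
    epsilon: float = 0.0                # tolerated imprecision (0 means strict)


class Certifier:
    """Runs identically at every replica, in atomic multicast delivery order."""

    def __init__(self):
        self.seq = 0                    # sequence number of the last update commit
        self.committed = []             # (commit_seq, write_deltas) of committed updates

    def certify(self, txn):
        conflict = False
        imprecision = 0.0
        for seq, deltas in self.committed:
            if seq <= txn.start_seq:
                continue                # committed before txn started: not concurrent
            for item, delta in deltas.items():
                if item in txn.read_set:
                    conflict = True
                    imprecision += abs(delta)

        read_only = not txn.write_deltas
        # Assumption: only read-only transactions may trade precision for commit.
        tolerated = read_only and txn.epsilon > 0 and imprecision <= txn.epsilon
        if conflict and not tolerated:
            return False                # abort

        if not read_only:
            self.seq += 1
            self.committed.append((self.seq, dict(txn.write_deltas)))
        return True                     # commit


certifier = Certifier()
update = Txn("u1", 0, frozenset({"warehouse_sales"}), {"warehouse_sales": 5.0})
strict_query = Txn("q1", 0, frozenset({"warehouse_sales"}))
relaxed_query = Txn("q2", 0, frozenset({"warehouse_sales"}), epsilon=10.0)

results = [certifier.certify(t) for t in (update, strict_query, relaxed_query)]
```

In this sketch the strict query aborts because a concurrent update touched an item it read, while the relaxed query commits because the accumulated change stays within its declared bound. Since every replica sees the same delivery order, each one reaches the same decision without further coordination.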

[1] Sam Toueg, et al. A Modular Approach to Fault-Tolerant Broadcasts and Related Problems, 1994.

[2] Fernando Pedone, et al. Partial replication in the Database State Machine, 2001, Proceedings IEEE International Symposium on Network Computing and Applications. NCA 2001.

[3] José Legatheaux Martins, et al. SqlIceCube: Automatic Semantics-Based Reconciliation for Mobile Databases, 2003.

[4] Calton Pu, et al. Asynchronous consistency restoration under epsilon serializability, 1995, Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences.

[5] Gustavo Alonso, et al. A suite of database replication protocols based on group communication primitives, 1998, Proceedings. 18th International Conference on Distributed Computing Systems (Cat. No.98CB36183).

[6] Jim Gray, et al. A critique of ANSI SQL isolation levels, 1995, SIGMOD '95.

[7] A. Correia, et al. Testing the Dependability and Performance of GCS-Based Database Replication Protocols.

[8] Dennis Shasha, et al. The dangers of replication and a solution, 1996, SIGMOD '96.

[9] Dennis Shasha, et al. Making snapshot isolation serializable, 2005, TODS.

[10] José Legatheaux Martins, et al. Reservations for Conflict Avoidance in a Mobile Database System, 2003, MobiSys '03.

[11] Gustavo Alonso, et al. Improving the scalability of fault-tolerant database clusters, 2002, Proceedings 22nd International Conference on Distributed Computing Systems.

[12] Fernando Pedone. The database state machine and group communication issues, 1999.

[13] Philip S. Yu, et al. Divergence control for epsilon-serializability, 1992, Eighth International Conference on Data Engineering.