Characterization and Optimization of Commit Processing Performance in Distributed Database Systems

A significant body of literature is available on distributed transaction commit protocols. Surprisingly, however, the relative merits of these protocols have not been sufficiently studied with respect to their quantitative impact on transaction processing performance. Also, even though several optimizations have been suggested to improve the performance of the ubiquitous Two-Phase Commit (2PC) protocol, none have addressed the fact that under 2PC the data updated by a transaction during its data processing phase remains unavailable to other transactions during its commit processing phase and, worse, there is no inherent bound on the duration of this unavailability. This paper addresses both these issues. First, using a detailed simulation model of a distributed database system, we profile the transaction throughput performance of a representative set of commit protocols, including 2PC, Presumed Abort, Presumed Commit and 3PC. Second, we propose and evaluate a new commit protocol, OPT, that allows transactions to "optimistically" borrow the updated data of transactions currently in their commit phase. This borrowing is controlled to ensure that cascading aborts, usually associated with the use of dirty data, do not occur. The new protocol is easy to implement and incorporate in current systems, and can coexist, often synergistically, with many of the optimizations proposed earlier, including current industry standard protocols such as Presumed Commit and Presumed Abort. The experimental results show that distributed commit processing can have considerably more influence than distributed data processing on the throughput performance and that the choice of commit protocol clearly affects the magnitude of this influence. Among the protocols evaluated, the new optimistic commit protocol provides the best transaction throughput performance for a variety of workloads and system configurations. In fact, OPT's peak throughput is often close to the upper bound on achievable peak performance. Even more interestingly, when data contention is significant, integrating OPT with the non-blocking three-phase commit protocol provides better peak throughput performance than all of the standard two-phase blocking protocols evaluated in our study. Further, OPT makes efficient use of the borrowing approach and its performance is robust for practical workloads. In short, OPT is a portable, practical, high-performance, efficient and robust distributed commit protocol. A partial and preliminary version of the results presented here appeared earlier in Revisiting Commit Processing in
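
The abstract states only that borrowing is "controlled" so that cascading aborts cannot occur; it does not spell out the mechanism. The following is a minimal sketch of one way such control could work, under two rules that are assumptions here rather than statements from the text above: (1) only a transaction in the prepared state may lend its updated data, and (2) a borrower is held back from entering the prepared state until every lender has resolved. Together these rules would keep any abort chain to length one, since a borrower can never itself become a lender while its fate still depends on someone else. All names (`Site`, `Txn`, `access`, `prepare`, `resolve`) are hypothetical, and the model is a single-site lock table with at most one borrower per item.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    EXECUTING = auto()
    PREPARED = auto()
    COMMITTED = auto()
    ABORTED = auto()


@dataclass
class Txn:
    name: str
    state: State = State.EXECUTING
    lenders: set = field(default_factory=set)    # unresolved txns we borrowed from
    borrowers: set = field(default_factory=set)  # txns that borrowed our data


class Site:
    """Hypothetical single-site lock table with controlled borrowing."""

    def __init__(self):
        self.txns = {}
        self.holder = {}    # item -> txn holding the update lock
        self.borrower = {}  # item -> txn borrowing it (at most one in this sketch)

    def begin(self, name):
        self.txns[name] = Txn(name)

    def access(self, name, item):
        """Return True if `name` may use `item` now, False if it must wait."""
        owner = self.holder.get(item)
        if owner is None:
            self.holder[item] = name
            return True
        if name in (owner, self.borrower.get(item)):
            return True
        lender = self.txns[owner]
        # Assumed rule 1: only PREPARED transactions lend their updates.
        if lender.state is State.PREPARED and item not in self.borrower:
            self.borrower[item] = name
            lender.borrowers.add(name)
            self.txns[name].lenders.add(owner)
            return True
        return False

    def prepare(self, name):
        """Assumed rule 2: a borrower cannot prepare while a lender is unresolved."""
        txn = self.txns[name]
        if txn.lenders:
            return False
        txn.state = State.PREPARED
        return True

    def resolve(self, name, commit):
        """Apply the global decision for a PREPARED transaction."""
        txn = self.txns[name]
        txn.state = State.COMMITTED if commit else State.ABORTED
        for b in txn.borrowers:
            self.txns[b].lenders.discard(name)
        if commit:
            # The borrowed data is now clean: hand each item to its borrower.
            for item in [i for i, h in self.holder.items() if h == name]:
                if item in self.borrower:
                    self.holder[item] = self.borrower.pop(item)
                else:
                    del self.holder[item]
        else:
            self._release(name)
            # Borrowers read dirty data and must abort. They are never
            # PREPARED (rule 2), so they have no borrowers of their own.
            for b in txn.borrowers:
                if self.txns[b].state is State.EXECUTING:
                    self.txns[b].state = State.ABORTED
                    self._release(b)

    def _release(self, name):
        """Drop every lock and every borrow registered for `name`."""
        for item in [i for i, h in self.holder.items() if h == name]:
            del self.holder[item]
            self.borrower.pop(item, None)
        for item in [i for i, b in self.borrower.items() if b == name]:
            del self.borrower[item]
```

In this sketch a second transaction that requests an item held by a prepared transaction proceeds immediately instead of blocking for the rest of the commit phase, which is the availability gain the abstract describes; the price is that it aborts if the lender's final decision is abort.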
