Analysis of Replication in Distributed Database Systems

Geographically distributed database systems have received growing interest in recent years. In this paper, we develop an approximate analytical model to study the tradeoffs of replicating data in a distributed database environment. Several concurrency control protocols are considered, including pessimistic, optimistic, and semi-optimistic protocols. The approximate analysis captures the effect of the protocol on hardware resource contention and data contention. The accuracy of the approximation is validated through detailed simulations. We find that the benefit of replicating data and the optimal number of replicates are sensitive to the concurrency control protocol. Under the optimistic and semi-optimistic protocols, replication can significantly improve response time, at the cost of an additional MIPS requirement to maintain consistency among the replicates. The optimal degree of replication is further affected by the transaction mix (e.g., the fraction of read-only transactions), the communications delay and overhead, the number of distributed sites, and the available MIPS. Sensitivity analyses have been carried out to examine how the optimal degree of replication changes with respect to these factors.
