Testing the dependability and performance of group communication based database replication protocols

Database replication based on group communication systems has recently been proposed as an efficient and resilient solution for large-scale data management. However, its evaluation has been conducted either on simplistic simulation models, which fail to assess concrete implementations, or on complete system implementations, which are costly to test with realistic large-scale scenarios. This paper presents a tool that combines implementations of the replication and communication protocols under study with simulated network, database engine, and traffic generator models. Replication components can therefore be subjected to realistic large-scale loads in a variety of scenarios, including fault injection, while the tool provides global observation and control. The paper first shows how the model is configured and validated to closely reproduce the behavior of a real system, and then how it is applied, allowing us to draw conclusions about both the replication and communication protocols and their implementations.
