Towards Seamless Resynchronization for Active-Active Database Clustering

Database clustering is a well-established technology for improving the availability and scalability of a database service. Query replication and log replication are two popular methods for propagating database updates across a cluster of database servers. An important issue that has received relatively scant attention, and that is the focus of this paper, is how to add a new server to an active-active database cluster with minimal service disruption while maximizing the cluster's request-processing concurrency at run time. Log replication allows a new database server to be attached without stopping the cluster's service, but its asynchronous operation is incompatible with active-active clusters, in which every node is designed to share the load of servicing incoming read queries. In contrast, query replication is synchronous and thus readily supports active-active clustering, but most existing query replication implementations must halt the cluster's service when attaching a new server. In this paper, we present a database resynchronization scheme that achieves the best of both worlds: it leverages log replication to minimize the service disruption associated with adding new servers and applies query replication at run time to maximize the parallelism of read query processing. We demonstrate its effectiveness in a production-grade database engine, PostgreSQL.
