Database replication in large scale systems: optimizing the number of replicas

In distributed systems, replication is used to ensure availability and improve performance. However, the heavy workload of distributed systems such as Web 2.0 applications or Global Distribution Systems limits the benefit of replication if its degree (i.e., the number of replicas) is not controlled. Since every replica must eventually apply all updates, there is a point beyond which adding more replicas does not increase throughput, because every replica is saturated by applying updates. Moreover, if the replication degree exceeds this optimal threshold, the surplus replicas generate overhead from extra communication messages. In this paper, we propose a replication management solution that reduces the number of useless replicas. To this end, we define two mathematical models that approximate the number of replicas needed to achieve a given level of performance. We demonstrate the feasibility of our replication management model through simulation. The results show the effectiveness and accuracy of our models.
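The saturation argument can be made concrete with a small sketch. It is not one of the two models defined in the paper, which the abstract does not specify. It is a simplified illustration under assumptions introduced here: every replica has the same capacity, each update is applied on every replica, each read is served by exactly one replica, and communication overhead is ignored. The names `max_throughput` and `smallest_degree_for` are hypothetical.

```python
def max_throughput(n_replicas, capacity, update_fraction):
    """Upper bound on total throughput (operations per second).

    Assumption: each replica handles `capacity` operations per second,
    every update runs on all replicas, and reads are spread evenly.
    The load on one replica for a total throughput X is then
    X * (update_fraction + (1 - update_fraction) / n_replicas).
    """
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    if not 0.0 <= update_fraction <= 1.0:
        raise ValueError("update_fraction must be in [0, 1]")
    per_op_cost = update_fraction + (1.0 - update_fraction) / n_replicas
    return capacity / per_op_cost


def smallest_degree_for(target, capacity, update_fraction, max_replicas=1000):
    """Smallest replication degree whose bound reaches `target`, or None."""
    for n in range(1, max_replicas + 1):
        if max_throughput(n, capacity, update_fraction) >= target:
            return n
    return None  # target is beyond the saturation limit for this workload


if __name__ == "__main__":
    # With 20% updates the bound flattens out as replicas are added.
    for n in (1, 2, 4, 8, 16, 64):
        print(n, round(max_throughput(n, 1000.0, 0.2)))
```

Under these assumptions the bound approaches `capacity / update_fraction` as the degree grows, so replicas added past that plateau contribute almost nothing, which is the effect the abstract describes.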
