The Leganet System: Freshness-Aware Transaction Routing in a Database Cluster

We consider the use of a database cluster by an Application Service Provider (ASP). In the ASP context, applications and databases can be update-intensive and must remain autonomous. In this paper, we describe the Leganet system, which performs freshness-aware transaction routing in a database cluster. We use multi-master replication and relaxed replica freshness to improve load balancing. Our transaction routing takes into account the freshness requirements of queries at the relation level and uses a cost function that combines the cluster load with the cost of refreshing replicas to the required level. We implemented the Leganet prototype on an 11-node Linux cluster running Oracle8i. Using experimentation on this cluster and emulation of up to 128 nodes, our validation based on the TPC-C benchmark demonstrates the performance benefits of our approach.
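
The routing idea can be illustrated with a minimal Python sketch. It is not the Leganet implementation: the abstract does not give the actual cost function, so the names `Node`, `refresh_cost`, and `route`, the additive cost (node load plus refresh cost), and the measurement of staleness as a count of missing updates per relation are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A replica node (hypothetical model, not taken from the paper)."""
    name: str
    load: float  # assumed: estimated pending work, in abstract cost units
    # assumed: relation name -> number of updates not yet applied on this node
    staleness: dict = field(default_factory=dict)


def refresh_cost(node, required, cost_per_update=1.0):
    """Cost to bring each relation the query reads within its tolerated staleness.

    `required` maps a relation name to the maximum number of missing
    updates the query tolerates on that relation.
    """
    total = 0.0
    for relation, tolerated in required.items():
        missing = node.staleness.get(relation, 0)
        # Only the updates beyond the tolerated staleness must be applied.
        total += max(0, missing - tolerated) * cost_per_update
    return total


def route(nodes, required):
    """Pick the node with the lowest combined load and refresh cost."""
    return min(nodes, key=lambda n: n.load + refresh_cost(n, required))


nodes = [
    Node("n1", load=5.0, staleness={"stock": 0}),
    Node("n2", load=1.0, staleness={"stock": 10}),
    Node("n3", load=2.0, staleness={"stock": 3}),
]

# A query that tolerates at most 2 missing updates on "stock".
print(route(nodes, {"stock": 2}).name)
# A query that tolerates up to 10 missing updates on "stock".
print(route(nodes, {"stock": 10}).name)
```

In this sketch, a strict freshness requirement steers the query away from the lightly loaded but stale node, while a relaxed requirement lets the router use it without paying any refresh cost. This is the sense in which relaxed freshness gives the router more choices for load balancing.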
