A Low-Cost Checkpointing Technique for Distributed Databases

In distributed databases, checkpointing provides an efficient way to perform global reconstruction. The need for global reconstruction is infrequent, however, and most current checkpointing approaches for distributed databases are too expensive at run time. Some of them let the checkpointing process run in parallel with normal transactions, but at the cost of increased data and resource contention, which in turn lengthens the response time of normal transactions. An efficient checkpointing method that does not degrade system performance is therefore needed. This paper presents a low-cost solution to these problems, called Loosely Synchronized Local Fuzzy Checkpointing (LSLFC). LSLFC supports global reconstruction, and our performance study shows that it incurs little overhead at run time.
