Paper Information - A Survey of Distributed Database Checkpointing

A Survey of Distributed Database Checkpointing

Checkpointing a database is a vital technique for reducing recovery time after a failure. For distributed databases, checkpointing also provides an efficient way to perform global reconstruction. In this paper, we survey and classify previous approaches to checkpointing a distributed database. Because global reconstruction is rarely needed in most distributed databases, we recommend a less restrictive, less resource-consuming approach to checkpointing in an integrated distributed database system over a transaction-consistent checkpoint approach. For a federated or multidatabase system, any type of globally consistent checkpoint is difficult to achieve without violating local autonomy.
