An efficient algorithm for causal message logging

Causal message logging has useful properties, including nonblocking message logging and freedom from rollback propagation. However, it requires a large amount of information to be piggybacked on each message, which can severely degrade performance. This paper presents an efficient causal logging algorithm based on a new message log structure, LogOn, which represents the causal interprocess dependency relation with much smaller overhead than existing algorithms. The proposed algorithm is efficient in the sense that each message carries no information other than LogOn, whereas other algorithms require extra information beyond the message log to eliminate duplicate log entries. Moreover, in those algorithms the number of duplicates that can be detected grows with the amount of extra information added to each message. The proposed algorithm achieves the same degree of efficiency using only the message log carried in each message.
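
To make the piggybacking overhead concrete, the following is a minimal sketch of naive causal message logging, assuming a simple determinant format. It is not the LogOn structure or the algorithm proposed in the paper, whose details are not given here. The names Determinant, Message, and Process are hypothetical and chosen only for illustration. Each process keeps a volatile log of the receive determinants its state depends on and attaches the entire log to every outgoing message, so a receiver is repeatedly sent entries it already holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Determinant:
    """Hypothetical record of one receive event (assumed format)."""
    sender: int
    send_seq: int   # position of the message in the sender's send order
    receiver: int
    recv_seq: int   # position of the message in the receiver's delivery order


@dataclass(frozen=True)
class Message:
    sender: int
    send_seq: int
    payload: str
    piggyback: frozenset  # determinants carried along with the payload


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.send_seq = 0
        self.recv_seq = 0
        self.log = set()  # determinants this process's state causally depends on

    def send(self, payload):
        # Naive scheme: attach the whole log to every outgoing message.
        self.send_seq += 1
        return Message(self.pid, self.send_seq, payload, frozenset(self.log))

    def receive(self, msg):
        # Merge the piggybacked determinants, then record this receive event.
        self.recv_seq += 1
        duplicates = len(msg.piggyback & self.log)
        self.log |= msg.piggyback
        self.log.add(Determinant(msg.sender, msg.send_seq, self.pid, self.recv_seq))
        return duplicates


p0, p1, p2 = Process(0), Process(1), Process(2)

m1 = p0.send("a")          # p0 depends on nothing yet
p1.receive(m1)
m2 = p1.send("b")          # carries the determinant of m1
dup_first = p2.receive(m2)
m3 = p1.send("c")          # carries the determinant of m1 again
dup_second = p2.receive(m3)
```

In this run, the second message from p1 to p2 carries a determinant that p2 already holds, so dup_second is 1 while dup_first is 0. Detecting such repeats before sending is what the extra information in existing algorithms is used for, and what the abstract states LogOn achieves with the message log alone.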
