Programming simultaneous actions using common knowledge: Preliminary version

This work applies the theory of knowledge in distributed systems to the design of fault-tolerant protocols for problems involving coordinated simultaneous actions in synchronous systems. We give a simple method for transforming specifications of such problems into high-level protocols programmed using explicit tests of whether certain facts are common knowledge. The resulting protocols are optimal in all runs: for every possible input to the system and every pattern of processor failures, they are guaranteed to perform the simultaneous actions as soon as any other protocol can possibly perform them. A careful analysis of when facts become common knowledge shows how to implement these protocols efficiently in many variants of the omissions failure model. In the generalized omissions model, however, we show that any protocol that is optimal in this sense requires co-NP-hard computations. The analysis in this paper exposes subtle differences between the failure models, including the precise point at which this gap in complexity occurs.
