Programming simultaneous actions using common knowledge

This work applies the theory of knowledge in distributed systems to the design of efficient fault-tolerant protocols. We define a large class of problems requiring coordinated, simultaneous action in synchronous systems, and give a method of transforming specifications of such problems into protocols that areoptimal in all runs: these protocols are guaranteed to perform the simultaneous actions as soon as any other protocol could possibly perform them, given the input to the system and faulty processor behavior. This transformation is performed in two steps. In the first step we extract, directly from the problem specification, a high-level protocol programmed using explicit tests for common knowledge. In the second step we carefully analyze when facts become common knowledge, thereby providing a method of efficiently implementing these protocols in many variants of the omissions failure model. In the generalized omissions model, however, our analysis shows that testing for common knowledge is NP-hard. Given the close correspondence between common knowledge and simultaneous actions, we are able to show that no optimal protocol for any such problem can be computationally efficient in this model. The analysis in this paper exposes many subtle differences between the failure models, including the precise point at which this gap in complexity occurs.

[1]  Neil Immerman,et al.  Foundations of Knowledge for Distributed Systems , 1986, TARK.

[2]  Brian A. Coan,et al.  The Distributed Firing Squad Problem , 1989, SIAM J. Comput..

[3]  守屋 悦朗,et al.  J.E.Hopcroft, J.D. Ullman 著, "Introduction to Automata Theory, Languages, and Computation", Addison-Wesley, A5変形版, X+418, \6,670, 1979 , 1980 .

[4]  Joseph Y. Halpern,et al.  A Guide to the Modal Logics of Knowledge and Belief: Preliminary Draft , 1985, IJCAI.

[5]  Michael J. Fischer,et al.  The Consensus Problem in Unreliable Distributed Systems (A Brief Survey) , 1983, FCT.

[6]  Nancy A. Lynch,et al.  The Byzantine Firing Squad Problem. , 1985 .

[7]  J. Hintikka Knowledge and belief , 1962 .

[8]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[9]  Leslie Lamport,et al.  Reaching Agreement in the Presence of Faults , 1980, JACM.

[10]  Joseph Y. Halpern,et al.  Knowledge and common knowledge in a distributed environment , 1984, JACM.

[11]  Yoram Moses,et al.  Knowledge and Common Knowledge in a Byzantine Environment I: Crash Failures , 1986, TARK.

[12]  Nancy A. Lynch,et al.  A Lower Bound for the Time to Assure Interactive Consistency , 1982, Inf. Process. Lett..

[13]  K. Mani Chandy,et al.  How processes learn , 1985, PODC '85.

[14]  C. Mohan,et al.  Method for distributed transaction commit and recovery using Byzantine Agreement within clusters of processors , 1983, PODC '83.

[15]  Brian A. Coan,et al.  A communication-efficient canonical form for fault-tolerant distributed protocols , 1986, PODC '86.

[16]  Ronald Fagin,et al.  A formal model of knowledge, action, and communication in distributed systems: preliminary report , 1985, PODC '85.

[17]  Ramaswamy Ramanujam,et al.  Distributed Processes and the Logic of Knowledge , 1985, Logic of Programs.

[18]  Yoram Moses,et al.  Knowledge in a distributed environment , 1986 .

[19]  Jeffrey D. Ullman,et al.  Introduction to Automata Theory, Languages and Computation , 1979 .

[20]  Sam Toueg,et al.  Distributed agreement in the presence of processor and communication faults , 1986, IEEE Transactions on Software Engineering.

[21]  Danny Dolev,et al.  'Eventual' is earlier than 'immediate' , 1982, 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982).