Paper Information - Certifying Safety when Implementing Consensus

Certifying Safety when Implementing Consensus

Ensuring the correctness of distributed system implementations remains a challenging and largely unaddressed problem. In this paper, we present a protocol that can be used to certify the safety of consensus implementations. Our proposed protocol is efficient in terms of both the number of additional messages sent and their size, and is designed to operate correctly in the presence of $n-1$ nodes failing in an $n$-node distributed system (assuming fail-stop failures). We also comment on how our construction might be generalized to certify other protocols and invariants.

Aurojit Panda
