PeerReview: practical accountability for distributed systems

We describe PeerReview, a system that provides accountability in distributed systems. PeerReview ensures that Byzantine faults whose effects are observed by a correct node are eventually detected and irrefutably linked to a faulty node. At the same time, PeerReview ensures that a correct node can always defend itself against false accusations. These guarantees are particularly important for systems that span multiple administrative domains, which may not trust each other. PeerReview works by maintaining a secure record of the messages sent and received by each node. The record is used to automatically detect when a node's behavior deviates from that of a given reference implementation, thus exposing faulty nodes. PeerReview is widely applicable: it only requires that a correct node's actions are deterministic, that nodes can sign messages, and that each node is periodically checked by a correct node. We demonstrate that PeerReview is practical by applying it to three different types of distributed systems: a network filesystem, a peer-to-peer system, and an overlay multicast system.
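The two mechanisms named in the abstract, a secure record of messages and a check against a reference implementation, can be illustrated with a small sketch. This is not PeerReview's actual design or code: the hash-chained log, the names `HashChainedLog`, `verify`, `audit`, and `echo_reference`, and the entry layout are all assumptions made for illustration, and message signing is omitted entirely. The sketch only shows why a chained record makes tampering detectable and how determinism allows an auditor to replay a log.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash preceding the first entry


def entry_hash(prev_hash, seq, direction, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps([prev_hash, seq, direction, payload]).encode()
    return hashlib.sha256(body).hexdigest()


class HashChainedLog:
    """Append-only record of messages sent ("send") and received ("recv")."""

    def __init__(self):
        self.entries = []

    def append(self, direction, payload):
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        seq = len(self.entries)
        h = entry_hash(prev, seq, direction, payload)
        self.entries.append(
            {"seq": seq, "dir": direction, "payload": payload, "hash": h}
        )
        return h


def verify(entries):
    """Return True if every entry's hash matches its content and predecessor."""
    prev = GENESIS
    for i, e in enumerate(entries):
        expected = entry_hash(prev, e["seq"], e["dir"], e["payload"])
        if e["seq"] != i or expected != e["hash"]:
            return False
        prev = e["hash"]
    return True


def audit(entries, reference_step, initial_state):
    """Replay received messages through a deterministic reference function
    and compare the sends it produces with the sends recorded in the log.
    A log that ends with sends still pending is treated as failing."""
    if not verify(entries):
        return False
    state, pending = initial_state, []
    for e in entries:
        if e["dir"] == "recv":
            if pending:  # the node skipped a send the reference expected
                return False
            state, outputs = reference_step(state, e["payload"])
            pending = list(outputs)
        elif not pending or pending.pop(0) != e["payload"]:
            return False
    return not pending


def echo_reference(state, msg):
    """Hypothetical reference implementation: acknowledge with a counter."""
    return state + 1, [f"ack {state}: {msg}"]
```

As a usage example, a log containing `recv "a"`, `send "ack 0: a"`, `recv "b"`, `send "ack 1: b"` passes `audit(log.entries, echo_reference, 0)`, while a log whose second send is `"ack 0: b"` fails it, and editing any stored payload after the fact makes `verify` fail. A real system would additionally need signed commitments to the log so that a node cannot present different logs to different auditors.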
