The design of a robust peer-to-peer system

Peer-to-peer (P2P) overlay networks have recently become one of the hottest topics in OS research. These networks promise to harness idle storage and network resources from client machines that voluntarily join the system; to provide self-configuration and automatic load balancing; to resist censorship; and to scale extremely well thanks to their symmetric algorithms. However, their reliance on unreliable client machines leads to two defects that preclude their use in a number of applications: storage is inherently unreliable, and lookup algorithms have long latencies. In this paper we propose a design for a robust peer-to-peer storage service composed not of client nodes but of server nodes dedicated to running the peer-to-peer application. We argue that our system overcomes these defects while retaining the desirable properties of peer-to-peer systems, with the exception of utilizing the spare resources of client machines. Our system is capable of surviving arbitrary failures of its nodes (Byzantine faults), and we expect it to perform and scale well, even in a wide-area network.
