Fault Tolerance for Super-Peers of P2P Systems

This paper presents an efficient fault-tolerant approach for the super peers of peer-to-peer (P2P) file sharing systems. In a super-peer based P2P file sharing system, peers are organized into multiple groups. Each group has a special peer, called the super peer, that serves the regular peers within the group. In this hierarchical architecture, if a super peer departs or fails, file queries to the regular peers it serves cannot be delivered. We propose a multiple publication technique that logically connects each regular peer with two or more super peers in other groups. If a regular peer finds that its serving super peer is no longer working, one of its other connected super peers is selected as its new serving super peer, so that file queries continue to be processed. To examine the effectiveness of the proposed approach, comprehensive simulations are performed to quantify its performance and overhead.
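The failover step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names `SuperPeer` and `RegularPeer`, the `alive` flag standing in for failure detection, and the "first live backup in list order" selection policy are all assumptions made for illustration, since the abstract does not specify how the new serving super peer is chosen.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SuperPeer:
    name: str
    alive: bool = True  # hypothetical stand-in for real failure detection


@dataclass
class RegularPeer:
    name: str
    serving: SuperPeer
    # Super peers in other groups that this peer has also published to.
    backups: List[SuperPeer] = field(default_factory=list)

    def ensure_serving(self) -> Optional[SuperPeer]:
        """Return a live serving super peer, failing over if needed.

        Assumed policy: pick the first live backup in list order.
        Returns None when no connected super peer is alive.
        """
        if self.serving.alive:
            return self.serving
        for candidate in self.backups:
            if candidate.alive:
                # Promote the backup and drop it from the backup list.
                self.backups = [b for b in self.backups if b is not candidate]
                self.serving = candidate
                return candidate
        return None


if __name__ == "__main__":
    sp_a, sp_b, sp_c = SuperPeer("A"), SuperPeer("B"), SuperPeer("C")
    peer = RegularPeer("p1", serving=sp_a, backups=[sp_b, sp_c])
    sp_a.alive = False  # the serving super peer departs
    new_sp = peer.ensure_serving()
    print(new_sp.name if new_sp else "no live super peer")
```

In this sketch a regular peer with only one connected super peer has no fallback, which is the situation the multiple publication technique is meant to avoid.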
