Load balancing with multiple hash functions in peer-to-peer networks

Peer-to-peer (P2P) networks have grown in popularity in recent years, and file sharing is one of their typical applications. Effective load balancing matters in such applications because the distribution of requests across individual files can be heavily skewed. In the basic design of these networks, each file is stored at a single node (i.e., server), which becomes a hotspot if the file is popular. In this paper, we focus on a file-replication strategy that uses multiple hash functions. Such a strategy typically sets aside a large number of hash functions. When the demand for a file exceeds the overall capacity of its current servers, a previously unused hash function is used to obtain a new server ID at which the file is replicated. The central problems are how to choose an unused hash function when replicating a file and how to choose a used hash function when requesting the file. Our solution to the file-replication problem is to choose the unused hash function with the smallest index, and our solution to the file-request problem is to choose a used hash function uniformly at random. Our main contribution is to develop a set of distributed algorithms that implement these solutions and to evaluate their performance. In particular, we analyze a random binary search algorithm and a random gap-removal algorithm.
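
The two selection rules stated above can be illustrated with a minimal, centralized Python sketch. It is not the distributed algorithms developed in the paper (the random binary search and random gap-removal algorithms are not reproduced here); it only shows the policy they aim to implement. The names `server_id`, `ReplicatedFile`, `NUM_NODES`, and `NUM_HASH_FUNCTIONS` are hypothetical, and modeling the i-th hash function as SHA-256 over the index and file name is an assumption made for illustration.

```python
import hashlib
import random
from typing import List

NUM_NODES = 1024          # assumed size of the server ID space
NUM_HASH_FUNCTIONS = 64   # assumed size of the pool of hash functions set aside


def server_id(filename: str, index: int) -> int:
    """Hypothetical i-th hash function: SHA-256 over the index and the file name."""
    digest = hashlib.sha256(f"{index}:{filename}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


class ReplicatedFile:
    """Tracks which hash functions are in use for one file.

    Because replication always takes the unused function with the smallest
    index, the used functions are exactly indices 0 .. used - 1, so a single
    counter is enough to describe them.
    """

    def __init__(self, filename: str) -> None:
        self.filename = filename
        self.used = 1  # hash function 0 places the original copy

    def replicate(self) -> int:
        """Use the smallest-index unused hash function; return the new server ID."""
        if self.used >= NUM_HASH_FUNCTIONS:
            raise RuntimeError("no unused hash functions left")
        index = self.used
        self.used += 1
        return server_id(self.filename, index)

    def request(self, rng: random.Random) -> int:
        """Pick a used hash function uniformly at random; return the server to contact."""
        index = rng.randrange(self.used)
        return server_id(self.filename, index)

    def servers(self) -> List[int]:
        """Server IDs currently holding a copy (duplicates are possible on collisions)."""
        return [server_id(self.filename, i) for i in range(self.used)]
```

In this sketch a single object knows the counter `used`. In a real P2P network no node holds that global state, which is why finding the smallest unused index, or a random used one, requires a distributed procedure.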
