Using locality of reference to improve performance of peer-to-peer applications

Peer-to-peer (P2P) systems have recently emerged as a popular paradigm for building distributed applications. A key aspect of P2P system design is the mechanism used for content location, and a number of different approaches are currently in use. In particular, the location algorithm used in Gnutella, a popular and extensively analyzed P2P file-sharing application, floods messages through the network, which imposes significant processing overhead on the participating nodes and thus degrades performance.

In this paper, we provide an extensive performance evaluation of alternative algorithms for content location and retrieval in P2P systems, focusing on the Freenet and Gnutella systems. We compare the original Freenet and Gnutella algorithms, a previously proposed interest-based algorithm, and two new algorithms that also exploit locality of interest among peers to locate content efficiently. Unlike previous proposals, the new algorithms organize peers into communities that share interests. Two peers are said to have a common interest if they have some locally stored files in common.

To evaluate the performance of these algorithms, we use a previously developed Freenet simulator and build a new Gnutella simulator, which includes several realistic system characteristics. We show that the new community-based algorithms improve on the original Gnutella content location latency (and thus the system QoS) and system load by up to 31% and 30%, respectively. Our algorithms also reduce the average Freenet request and response path lengths by up to 39% and 31%, respectively. Furthermore, we show that, compared to the previously proposed interest-based algorithm, our new algorithms improve query latency by up to 27% without a significant increase in load.
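The definition of common interest given in the abstract can be illustrated with a short sketch. The code below is not the paper's algorithm: it assumes a global view of every peer's file list, and it groups peers into communities by taking connected components of the "shares at least one file" relation. The names `have_common_interest`, `build_communities`, and the `min_shared` threshold are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def have_common_interest(files_a, files_b, min_shared=1):
    """Return True if two peers hold at least `min_shared` files in common.

    `min_shared` is an illustrative parameter, not taken from the paper.
    """
    return len(set(files_a) & set(files_b)) >= min_shared


def build_communities(peer_files, min_shared=1):
    """Group peers into communities (assumption: connected components of
    the common-interest relation, computed with a simple union-find)."""
    parent = {peer: peer for peer in peer_files}

    def find(peer):
        # Follow parent links to the root, halving the path as we go.
        while parent[peer] != peer:
            parent[peer] = parent[parent[peer]]
            peer = parent[peer]
        return peer

    # Merge the groups of every pair of peers with a common interest.
    for a, b in combinations(peer_files, 2):
        if have_common_interest(peer_files[a], peer_files[b], min_shared):
            parent[find(a)] = find(b)

    groups = {}
    for peer in peer_files:
        groups.setdefault(find(peer), set()).add(peer)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    peers = {
        "A": {"x.mp3", "y.mp3"},
        "B": {"y.mp3", "z.mp3"},
        "C": {"q.mp3"},
        "D": {"z.mp3"},
    }
    print(build_communities(peers))
```

In this toy input, A and B share a file, as do B and D, so the three fall into one community while C stays alone. A real P2P system has no global view, so any practical scheme would have to discover shared interests through the query and response traffic itself.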
