SIL: A model for analyzing scalable peer-to-peer search networks

The popularity of peer-to-peer search networks continues to grow, even as limitations to the scalability of existing systems become apparent. We propose a simple model for search networks, called the Search/Index Links (SIL) model, which can be used to analyze the scalability, search latency, and fault tolerance of search networks. The model describes two kinds of links: search links, over which content searches are routed, and index links, over which content indexes are replicated. Combining routing and indexing in the same network is extremely useful for building scalable, efficient search networks. While the SIL model can be used to examine existing networks, it can also be used to discover new organizations by defining desirable (or undesirable) properties of SIL graphs and then examining the topologies that exhibit (or lack) those properties. We define one such undesirable property, which we call redundancy, and show how it can be prevented by avoiding specific topological features. Using analytical and simulation results, we argue that one new organization discovered via our analysis, parallel search clusters, is superior to existing supernode networks in many cases: for example, in a network with 10,000 nodes, our analysis shows that the average node in a supernode network is up to 16 times more loaded than the average node in a cluster network. At the same time, a cluster network has better fault tolerance than a supernode network and similar search latency.
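The two link types can be illustrated with a minimal sketch. It is not the paper's formal definition: the class name `SILGraph`, its methods, the direction of index links (from a content owner to the node that holds a copy of its index), and the restriction to one-hop index replication are all assumptions made for illustration. A search is forwarded over search links; every node that processes it answers for its own content and for any content indexed at it.

```python
from collections import deque


class SILGraph:
    """Toy graph with two edge types (hypothetical names, not from the paper)."""

    def __init__(self):
        self.search = {}  # node -> nodes it forwards searches to
        self.index = {}   # node -> nodes that hold a copy of its index (assumed direction)

    def add_node(self, n):
        self.search.setdefault(n, set())
        self.index.setdefault(n, set())

    def add_search_link(self, a, b):
        self.add_node(a)
        self.add_node(b)
        self.search[a].add(b)

    def add_index_link(self, a, b):
        self.add_node(a)
        self.add_node(b)
        self.index[a].add(b)

    def indexed_at(self, n):
        # Content that n can answer for: its own, plus every node
        # that replicates its index to n (one hop only, by assumption).
        return {n} | {a for a, holders in self.index.items() if n in holders}

    def search_from(self, origin):
        # Breadth-first forwarding over search links only.
        visited = {origin}
        queue = deque([origin])
        while queue:
            n = queue.popleft()
            for m in self.search[n]:
                if m not in visited:
                    visited.add(m)
                    queue.append(m)
        covered = set()
        for n in visited:
            covered |= self.indexed_at(n)
        return visited, covered


# A small supernode-style layout: leaves push indexes to a supernode,
# and supernodes exchange searches with each other.
g = SILGraph()
for leaf, hub in [("a", "s1"), ("b", "s1"), ("c", "s2"), ("d", "s2")]:
    g.add_index_link(leaf, hub)   # leaf's index is replicated at its supernode
    g.add_search_link(leaf, hub)  # leaf submits its searches to its supernode
g.add_search_link("s1", "s2")
g.add_search_link("s2", "s1")

visited, covered = g.search_from("a")
```

In this layout a search from leaf `a` is processed by only three nodes yet covers the content of all six, which is the kind of routing-plus-indexing trade-off the model is meant to expose.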
