Search strategies for scientific collaboration networks

Can we improve P2P search by looking into our social network? In this paper, we argue that P2P networks built upon specific communities (e.g., scientific social networks) can achieve this goal by implicitly personalizing the result set. Existing work on co-authorship relations in social networks has shown that scientific collaboration networks are scale-free. At the same time, P2P systems based on synthesized small-world networks have emerged, with a positive impact on search efficiency. We propose to use existing social collaboration graphs as the foundation for the P2P topology instead of creating purely technological topologies. To gain insight into the relationship between scientific collaboration and co-authorship, we compared the two for an existing collaboration network. Based on this analysis, we then generated a large P2P collaboration network derived from co-authorship data collections as the basis for our experiments. The most prevalent search type in the scientific context is keyword search for relevant publications. We investigate different search strategies suitable for that context and present our initial experimental results.
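The abstract does not name the search strategies it compares, so the following is only a minimal sketch of the general setup under stated assumptions: each author is treated as a peer holding the titles of their own publications, co-authors are overlay neighbors, and a keyword query is forwarded breadth-first for a limited number of hops (TTL-bounded flooding, a common baseline for unstructured P2P search and not necessarily one of the strategies evaluated here). The function names `build_coauthor_graph` and `flood_search` and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def build_coauthor_graph(papers):
    """Derive a peer topology from co-authorship data.

    papers: iterable of (title, [author, ...]) pairs (hypothetical format).
    Returns (neighbors, library): each author's co-authors and own titles.
    """
    neighbors = defaultdict(set)
    library = defaultdict(set)
    for title, authors in papers:
        for author in authors:
            library[author].add(title)
            neighbors[author].update(a for a in authors if a != author)
    return neighbors, library


def flood_search(neighbors, library, start, keyword, ttl):
    """Breadth-first keyword search limited to `ttl` hops from `start`.

    Returns a dict mapping each matching title to the hop distance
    at which it was first found.
    """
    keyword = keyword.lower()
    seen = {start}
    queue = deque([(start, 0)])
    hits = {}
    while queue:
        peer, hops = queue.popleft()
        # Match the keyword against the titles this peer holds.
        for title in library[peer]:
            if keyword in title.lower():
                hits.setdefault(title, hops)
        # Forward the query to unvisited co-authors while TTL remains.
        if hops < ttl:
            for nxt in neighbors[peer]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return hits


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    papers = [
        ("Gossip routing", ["ann", "bob"]),
        ("Keyword search in overlays", ["bob", "cy"]),
        ("Ranking keyword results", ["cy", "dee"]),
    ]
    neighbors, library = build_coauthor_graph(papers)
    print(flood_search(neighbors, library, "ann", "keyword", ttl=2))
```

Because the query starts at the searcher's own node and spreads along co-authorship links, results from close collaborators are found first, which is one way to read the "implicit personalization" described above.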
