Locating Data in (Small-World?) Peer-to-Peer Scientific Collaborations

Data-sharing scientific collaborations have particular characteristics that may differ from those of current peer-to-peer environments. In this paper we advocate exploiting emergent patterns in self-configuring networks specialized for scientific data-sharing collaborations. We speculate that a peer-to-peer scientific collaboration network will exhibit a small-world topology, as do many social networks for which this pattern has been documented. We propose a solution for locating data in decentralized, scientific, data-sharing environments that exploits the small-world topology. The research challenge we raise is: what protocols should be used to allow a self-configuring peer-to-peer network to form small worlds, much as the humans who use the network do in their social interactions?
