Distributed Uniform Sampling in Unstructured Peer-to-Peer Networks

Uniform sampling in networks is at the core of a wide variety of randomized algorithms. Random sampling can be performed by modeling the system as an undirected graph with associated transition probabilities and defining a corresponding Markov chain (MC). A random walk of prescribed minimum length, performed on this graph, yields a stationary distribution, and the corresponding random sample. This sample, however, is not uniform when network nodes have a non-uniform degree distribution. This poses a significant practical challenge since typical large scale real-world unstructured networks tend to have non-uniform degree distributions, e.g., power-law degree distribution in unstructured peer-to-peer networks. In this paper, we present a distributed algorithm that enables efficient uniform sampling in large unstructured non-uniform networks. Specifically, we prescribe necessary conditions for uniform sampling in such networks and present distributed algorithms that satisfy these requirements. We empirically evaluate the performance of our algorithm in comparison to known algorithms. The performance parameters include computational complexity, length of random walk, and uniformity of the sampling. Simulation results support our claims of performance improvements due to our algorithm.

[1]  David R. Karger,et al.  Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems , 2004, IPTPS.

[2]  宮沢 政清,et al.  P. Bremaud 著, Markov Chains, (Gibbs fields, Monte Carlo simulation and Queues), Springer-Verlag, 1999年 , 2000 .

[3]  Suresh Jagannathan,et al.  Randomized Protocols for Duplicate Elimination in Peer-to-Peer Storage Systems , 2005, Peer-to-Peer Computing.

[4]  Antony I. T. Rowstron,et al.  Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems , 2001, Middleware.

[5]  Deborah Estrin,et al.  Coping with irregular spatio-temporal sampling in sensor networks , 2004, CCRV.

[6]  Steve Chien,et al.  Approximating Aggregate Queries about Web Pages via Random Walks , 2000, VLDB.

[7]  Suresh Jagannathan,et al.  Unstructured peer-to-peer networks for sharing processor cycles , 2006, Parallel Comput..

[8]  Jared Saia,et al.  Choosing a random peer , 2004, PODC '04.

[9]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[10]  Suresh Jagannathan,et al.  Search with probabilistic guarantees in unstructured peer-to-peer networks , 2005, Fifth IEEE International Conference on Peer-to-Peer Computing (P2P'05).

[11]  Christos Gkantsidis,et al.  Random walks in peer-to-peer networks , 2004, IEEE INFOCOM 2004.

[12]  Krishna P. Gummadi,et al.  Measuring and analyzing the characteristics of Napster and Gnutella hosts , 2003, Multimedia Systems.

[13]  Ben Y. Zhao,et al.  An Infrastructure for Fault-tolerant Wide-area Location and Routing , 2001 .

[14]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[15]  J. Saia,et al.  Scalable Byzantine Agreement , 2004 .

[16]  Edith Cohen,et al.  Search and replication in unstructured peer-to-peer networks , 2002 .

[17]  J. Propp,et al.  Exact sampling with coupled Markov chains and applications to statistical mechanics , 1996 .

[18]  Dahlia Malkhi,et al.  Estimating network size from local information , 2003, Information Processing Letters.

[19]  Marc Najork,et al.  On near-uniform URL sampling , 2000, Comput. Networks.

[20]  L. Asz Random Walks on Graphs: a Survey , 2022 .

[21]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[22]  Michael K. Reiter,et al.  Probabilistic quorum systems , 1997, PODC '97.

[23]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[24]  David M. Pennock,et al.  Methods for Sampling Pages Uniformly from the World Wide Web , 2001 .

[25]  Suresh Jagannathan,et al.  Randomized leader election , 2007, Distributed Computing.

[26]  John Odentrantz,et al.  Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues , 2000, Technometrics.

[27]  Stephen P. Boyd,et al.  Fastest Mixing Markov Chain on a Graph , 2004, SIAM Rev..

[28]  Stefan Saroiu,et al.  A Measurement Study of Peer-to-Peer File Sharing Systems , 2001 .

[29]  Edith Cohen,et al.  Search and replication in unstructured peer-to-peer networks , 2002, ICS '02.

[30]  Richard J. Lipton,et al.  Random walks, universal traversal sequences, and the complexity of maze problems , 1979, 20th Annual Symposium on Foundations of Computer Science (sfcs 1979).

[31]  David R. Karger,et al.  Chord: a scalable peer-to-peer lookup protocol for internet applications , 2003, TNET.

[32]  Jeffrey Considine,et al.  Approximately uniform random sampling in sensor networks , 2004, DMSN '04.