Near-optimal random walk sampling in distributed networks

Performing random walks in networks is a fundamental primitive with numerous applications in communication networks, such as token management, load balancing, network topology discovery and construction, search, and peer-to-peer membership management. Although several such algorithms are ubiquitous and use numerous random walk samples, the walks themselves have always been performed naively. In this paper, we focus on the problem of performing random walk sampling efficiently in a distributed network. Given bandwidth constraints, the goal is to minimize the number of rounds and messages required to obtain several random walk samples in a continuous online fashion. We present the first round- and message-optimal distributed algorithms, which significantly improve on all previous approaches. The theoretical analysis and comprehensive experimental evaluation of our algorithms show that they perform very well across networks of differing topologies. In particular, our results show how several random walks can be performed continuously (when source nodes are provided only at runtime, i.e., online), such that each walk of length ℓ can be performed exactly in just Õ(√(ℓD)) rounds (where D is the diameter of the network) and O(ℓ) messages. This significantly improves upon both the naive technique, which requires O(ℓ) rounds and O(ℓ) messages, and the sophisticated algorithm of [13], which has the same round complexity as this paper but requires Ω(m√ℓ) messages (where m is the number of edges in the network). Our theoretical results are corroborated through extensive experiments on various topological data sets. Our algorithms are fully decentralized, lightweight, and easily implementable, and can serve as building blocks in the design of topologically aware networks.
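For reference, the naive baseline mentioned in the abstract can be sketched as a single token forwarded hop by hop. This is a minimal illustration, not the paper's algorithm: the function name `naive_random_walk`, the adjacency-list representation, and the 6-node ring are assumptions made here. It shows why a walk of length ℓ costs ℓ rounds and ℓ messages when done naively, and prints √(ℓD) for comparison, ignoring the constants and polylogarithmic factors hidden by Õ.

```python
import math
import random


def naive_random_walk(adj, source, length, rng):
    """Forward a token hop by hop; each hop costs one round and one message."""
    node = source
    rounds = 0
    messages = 0
    for _ in range(length):
        # The current holder sends the token to a uniformly random neighbor.
        node = rng.choice(adj[node])
        rounds += 1
        messages += 1
    return node, rounds, messages


# Hypothetical example network: a ring of 6 nodes (diameter D = 3).
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
rng = random.Random(0)
end, rounds, messages = naive_random_walk(adj, source=0, length=10, rng=rng)
print(f"naive walk ended at node {end}: {rounds} rounds, {messages} messages")

# Rough scale comparison only: the stated bound hides polylog factors.
ell, D = 10_000, 3
print(f"ell = {ell}, sqrt(ell * D) = {math.sqrt(ell * D):.1f}")
```

The gap between ℓ and √(ℓD) grows with the walk length, so the round savings matter most for long walks on networks of small diameter.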

[1]  Brian F. Cooper.  Quickly Routing Searches Without Having to Move Content , 2005, IPTPS.

[2]  Sreenivas Gollapudi,et al.  Estimating PageRank on graph streams , 2008, PODS.

[3]  Anne-Marie Kermarrec,et al.  Peer-to-Peer Membership Management for Gossip-Based Protocols , 2003, IEEE Trans. Computers.

[4]  Andrei Z. Broder,et al.  Generating random spanning trees , 1989, 30th Annual Symposium on Foundations of Computer Science.

[5]  Eli Upfal,et al.  Probability and Computing: Randomized Algorithms and Probabilistic Analysis , 2005.

[6]  David Peleg,et al.  Distributed Computing: A Locality-Sensitive Approach , 1987.

[7]  Indranil Gupta,et al.  AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems , 2007, IEEE Transactions on Parallel and Distributed Systems.

[8]  Tarek A. El-Ghazawi,et al.  A self-stabilizing distributed algorithm for spanning tree construction in wireless ad hoc networks , 2003, J. Parallel Distributed Comput.

[9]  M. Karonski.  Collisions among Random Walks on a Graph , 1993.

[10]  Alain Bui,et al.  Random Distributed Self-stabilizing Structures Maintenance , 2004, ISSADS.

[11]  Lada A. Adamic,et al.  Search in Power-Law Networks , 2001, Physical review. E, Statistical, nonlinear, and soft matter physics.

[12]  Christos Gkantsidis,et al.  Hybrid search schemes for unstructured peer-to-peer networks , 2005, Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies.

[13]  Shlomi Dolev,et al.  Spanders: distributed spanning expanders , 2010, SAC '10.

[14]  Sreenivas Gollapudi,et al.  Estimating PageRank on graph streams , 2008, PODS.

[15]  Ming Zhong,et al.  Non-uniform random membership management in peer-to-peer networks , 2005, Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies.

[16]  David R. Karger,et al.  Simple Efficient Load-Balancing Algorithms for Peer-to-Peer Systems , 2004, SPAA '04.

[17]  Danupon Nanongkai,et al.  Fast distributed random walks , 2009, PODC '09.

[18]  Jennifer L. Welch,et al.  Random walk for self-stabilizing group communication in ad hoc networks , 2002, IEEE Transactions on Mobile Computing.

[19]  Jon M. Kleinberg,et al.  Spatial gossip and resource location protocols , 2001, JACM.

[20]  Ming Zhong,et al.  Random walk based node sampling in self-organizing networks , 2006, OPSR.

[21]  Prasad Tetali,et al.  Efficient distributed random walks with applications , 2010, PODC '10.

[22]  Srinivasan Seshan,et al.  Mercury: supporting scalable multi-attribute range queries , 2004, SIGCOMM '04.

[23]  Alain Bui,et al.  Random Walks in Distributed Computing: A Survey , 2004, IICS.

[24]  Indranil Gupta,et al.  AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems , 2009, IEEE Trans. Parallel Distributed Syst..

[25]  Navin Goyal,et al.  Expanders via random spanning trees , 2008, SODA.

[26]  Kai-Yeung Siu,et al.  Distributed construction of random expander networks , 2003, IEEE INFOCOM 2003. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE Cat. No.03CH37428).

[27]  David Bruce Wilson,et al.  Generating random spanning trees more quickly than the cover time , 1996, STOC '96.

[28]  Danupon Nanongkai,et al.  A tight unconditional lower bound on distributed random walk computation , 2011, PODC '11.

[29]  Noga Alon,et al.  Many random walks are faster than one , 2007, SPAA '08.

[30]  Christos Gkantsidis,et al.  Random walks in peer-to-peer networks , 2004, IEEE INFOCOM 2004.

[31]  Srinivasan Seshan,et al.  Mercury: supporting scalable multi-attribute range queries , 2004, SIGCOMM 2004.

[32]  Judit Bar-Ilan,et al.  Random Leaders and Random Spanning Trees , 1989, WDAG.

[33]  Dmitri Loguinov,et al.  Graph-theoretic analysis of structured peer-to-peer systems: routing distances and fault resilience , 2003, IEEE/ACM Transactions on Networking.

[34]  Amos Israeli,et al.  Token management schemes and random walks yield self-stabilizing mutual exclusion , 1990, PODC '90.

[35]  Indranil Gupta,et al.  AVCOL: Availability-aware information aggregation in large distributed systems under uncollaborative behavior , 2009, Comput. Networks.

[36]  Maleq Khan,et al.  Theory of communication networks , 2010.

[37]  Jon M. Kleinberg,et al.  The small-world phenomenon: an algorithmic perspective , 2000, STOC '00.

[38]  Edith Cohen,et al.  Search and replication in unstructured peer-to-peer networks , 2002, ICS '02.

[39]  Richard J. Lipton,et al.  Random walks, universal traversal sequences, and the complexity of maze problems , 1979, 20th Annual Symposium on Foundations of Computer Science (sfcs 1979).