Random Walks in the Edge Sampling Model

The random walk is an important tool for analyzing structural features of graphs, such as community structure and PageRank. Generating a random walk can be hard when we do not have full access to the graph. The main contribution of this thesis is a solution to this problem in one such restricted-access model, the edge sampling model. We design Sampling-AS, a randomized algorithm that efficiently samples the endpoint of a random walk unless some unlikely event occurs during its execution. We also propose Sampling-LS, a randomized algorithm that always samples the endpoint of a random walk, at the cost of weaker performance. Moreover, we slightly modify both algorithms to improve their performance on special classes of graphs, such as regular graphs, random graphs, and fast-mixing graphs. Finally, we consider some applications of both algorithms.
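To make the setting concrete, the following is a minimal sketch of a naive baseline. It is not Sampling-AS or Sampling-LS, whose details are not given here. It assumes that the edge sampling model gives access to the graph only through an oracle returning one uniformly random edge per call, and that the graph is simple and undirected. The names `make_edge_oracle`, `naive_walk_endpoint`, and `max_samples` are hypothetical and introduced only for illustration.

```python
import random


def make_edge_oracle(edges, rng):
    """Hypothetical oracle: each call returns one uniformly random edge."""
    def sample_edge():
        return rng.choice(edges)
    return sample_edge


def naive_walk_endpoint(sample_edge, start, length, max_samples=1_000_000):
    """Simulate a random walk by rejection sampling (baseline sketch only).

    At each step, draw edges until one is incident to the current vertex,
    then move to its other endpoint. Every incident edge is equally likely
    to be the first one accepted, so the move is to a uniform neighbor.
    Returns the endpoint and the number of edge samples consumed.
    """
    current = start
    used = 0
    for _ in range(length):
        while True:
            if used >= max_samples:
                raise RuntimeError("sample budget exhausted")
            u, v = sample_edge()
            used += 1
            if u == current:
                current = v
                break
            if v == current:
                current = u
                break
    return current, used


if __name__ == "__main__":
    rng = random.Random(0)
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # 4-cycle as a toy graph
    oracle = make_edge_oracle(cycle, rng)
    endpoint, used = naive_walk_endpoint(oracle, start=0, length=10)
    print(endpoint, used)
```

In a graph with m edges, one step from a vertex of degree d costs m/d samples in expectation under this baseline, which is the kind of cost a more careful algorithm in this model would aim to reduce.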
