Efficient Algorithms for Sampling and Clustering of Large Nonuniform Networks

We propose efficient algorithms for two key tasks in the analysis of large nonuniform networks: uniform node sampling and cluster detection. Our sampling technique augments a simple but slowly mixing uniform MCMC sampler with a regular random walk in order to speed up its convergence; the combined MCMC chain is, however, sampled only when it is in its “uniform sampling” mode. Our clustering algorithm determines the relevant neighbourhood of a given node u in the network by first estimating the Fiedler vector of a Dirichlet matrix with u fixed at zero potential, and then finding the neighbourhood of u that yields a minimal weighted Cheeger ratio, where the edge weights are determined by differences in the estimated node potentials. Both algorithms are based on local computations, i.e. they do not use operations on the full adjacency matrix of the network. The algorithms are evaluated experimentally on three types of nonuniform networks: Dorogovtsev-Goltsev-Mendes “pseudofractal graphs”, scientific collaboration networks, and randomised “caveman graphs”.
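To make the notion of a "simple uniform MCMC sampler" concrete, the following is a minimal sketch of one standard choice, a Metropolis-Hastings random walk whose stationary distribution is uniform over the nodes. It is an illustration only: the abstract does not specify which uniform sampler is used, and the sketch does not include the random-walk augmentation described above. The function names, the adjacency-list representation, and the example graph are all assumptions introduced here. Each step needs only the degrees of the current node and one neighbour, so it uses local information only.

```python
import random
from fractions import Fraction

def mh_step(adj, u, rng):
    """One Metropolis-Hastings step: propose a uniform random neighbour v
    of u and accept it with probability min(1, deg(u) / deg(v))."""
    v = rng.choice(adj[u])
    if rng.random() < min(1.0, len(adj[u]) / len(adj[v])):
        return v
    return u  # proposal rejected: stay at u

def mh_transition_matrix(adj):
    """Exact transition probabilities of the walk above, for checking.
    For an edge {u, v}: P[u][v] = 1 / max(deg(u), deg(v)), which is
    symmetric, so the uniform distribution is stationary."""
    nodes = sorted(adj)
    P = {u: {v: Fraction(0) for v in nodes} for u in nodes}
    for u in nodes:
        for v in adj[u]:
            P[u][v] = Fraction(1, max(len(adj[u]), len(adj[v])))
        P[u][u] = 1 - sum(P[u][v] for v in adj[u])
    return P

def sample_nodes(adj, start, burn_in, n_samples, thin, seed=0):
    """Run the walk from `start`, discard `burn_in` steps, then record
    every `thin`-th node. burn_in and thin are tuning assumptions."""
    rng = random.Random(seed)
    u = start
    for _ in range(burn_in):
        u = mh_step(adj, u, rng)
    samples = []
    for _ in range(n_samples):
        for _ in range(thin):
            u = mh_step(adj, u, rng)
        samples.append(u)
    return samples

# Hypothetical example: a hub with four neighbours, plus one extra edge.
# The graph must be simple (no self-loops or repeated edges) and connected.
EXAMPLE_ADJ = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}

if __name__ == "__main__":
    P = mh_transition_matrix(EXAMPLE_ADJ)
    # Every column sums to 1, so the uniform distribution is preserved.
    print(all(sum(P[u][v] for u in EXAMPLE_ADJ) == 1 for v in EXAMPLE_ADJ))
    print(sample_nodes(EXAMPLE_ADJ, start=0, burn_in=200, n_samples=10, thin=5))
```

On graphs with a very uneven degree distribution, a walk of this kind rejects most proposals that lead into high-degree nodes, which is one way to see why such a sampler can mix slowly and why speeding up its convergence is of interest.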
