A Local Clustering Algorithm for Massive Graphs and Its Application to Nearly Linear Time Graph Partitioning

We study the design of local algorithms for massive graphs. A local graph algorithm is one that finds a solution containing or near a given vertex without looking at the whole graph. We present a local clustering algorithm that finds a good cluster (a subset of vertices whose internal connections are significantly richer than its external connections) near a given vertex. When the algorithm finds a nonempty local cluster, its running time is nearly linear in the size of the cluster it outputs; it also depends polylogarithmically on the size of the graph and polynomially on the conductance of the cluster it produces. Our clustering algorithm could be a useful primitive for handling massive graphs, such as social networks and web graphs. As an application of this clustering algorithm, we present a partitioning algorithm that finds an approximate sparsest cut with nearly optimal balance. Our algorithm takes time nearly linear in the number of edges of the graph....
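To make the notions of "conductance" and "local" concrete, here is a minimal Python sketch. It is not the algorithm of the paper, whose details the abstract does not give. It assumes an undirected graph stored as an adjacency dictionary and uses a generic technique: a few steps of a lazy random walk from the seed, with small entries truncated, followed by a sweep over degree-normalized probabilities. The names `conductance` and `local_cluster` and the parameters `steps` and `eps` are hypothetical choices made for illustration. For simplicity, `conductance` sums all degrees, so this toy version is not truly local; a practical implementation would be given the total volume up front.

```python
from collections import defaultdict


def conductance(adj, cluster):
    """Edges leaving `cluster` divided by the smaller of the two side volumes."""
    cluster = set(cluster)
    vol_in = sum(len(adj[v]) for v in cluster)
    # Assumption: computing the total volume here scans the whole graph.
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    cut = sum(1 for v in cluster for u in adj[v] if u not in cluster)
    denom = min(vol_in, vol_total - vol_in)
    return cut / denom if denom > 0 else float("inf")


def local_cluster(adj, seed, steps=10, eps=1e-4):
    """Illustrative sketch: truncated lazy random walk plus sweep cut."""
    p = {seed: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for v, mass in p.items():
            deg = len(adj[v])
            nxt[v] += mass / 2  # lazy step: keep half the mass in place
            for u in adj[v]:
                nxt[u] += mass / (2 * deg)  # spread the rest over neighbours
        # Truncation keeps the support small, so only vertices near the
        # seed are ever touched by the walk itself.
        p = {v: m for v, m in nxt.items() if m >= eps * len(adj[v])}

    # Sweep: order vertices by probability per unit degree and keep the
    # prefix with the lowest conductance.
    order = sorted(p, key=lambda v: p[v] / len(adj[v]), reverse=True)
    best, best_phi = [], float("inf")
    for k in range(1, len(order) + 1):
        phi = conductance(adj, order[:k])
        if phi < best_phi:
            best, best_phi = order[:k], phi
    return set(best), best_phi


# Toy graph: two triangles, {0, 1, 2} and {3, 4, 5}, joined by the edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
cluster, phi = local_cluster(adj, seed=0)
print(cluster, phi)
```

On the toy graph, the triangle containing the seed has one outgoing edge and volume 7, so its conductance is 1/7; a lower value means a better-separated cluster.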
