Incremental Streaming Graph Partitioning

Graph partitioning is an NP-hard problem whose efficient approximation has long been a subject of interest. The I/O bounds of contemporary computing environments favor incremental or streaming graph partitioning methods. Existing methods trade off latency, simplicity, accuracy, and memory footprint. In this paper, we apply an incremental approach to streaming partitioning: a lightweight proxy tracks changes to the graph and triggers partitioning as the clustering error increases. We evaluate the approach on the DARPA/MIT Graph Challenge streaming stochastic block partition dataset and find that it can dramatically reduce the number of partitioner invocations, which can yield an order-of-magnitude speedup.
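To make the trigger idea concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in the paper. It assumes the proxy is the fraction of streamed edges whose endpoints carry different labels, and that the partitioner is re-invoked once this fraction has drifted more than a tolerance above its value at the last invocation. The class name `IncrementalPartitioner`, the `tolerance` parameter, and the `partition_fn` callback are all hypothetical; any full partitioner could be plugged in as `partition_fn`.

```python
from collections import defaultdict


class IncrementalPartitioner:
    """Hypothetical sketch: re-run a full partitioner only when a cheap proxy drifts."""

    def __init__(self, partition_fn, tolerance=0.1):
        # partition_fn(adj) must return {node: label} for every node in adj.
        self.partition_fn = partition_fn
        self.tolerance = tolerance      # allowed drift of the proxy (assumption)
        self.adj = defaultdict(set)     # adjacency sets
        self.edges = []                 # every distinct edge seen so far
        self.labels = {}                # current block assignment
        self.cut_edges = 0              # edges whose endpoints have different labels
        self.baseline = 0.0             # proxy value at the last full partitioning
        self.invocations = 0            # number of full partitioner calls

    def _proxy(self):
        # Assumed proxy: fraction of edges crossing between blocks.
        return self.cut_edges / len(self.edges) if self.edges else 0.0

    def _repartition(self):
        self.labels = dict(self.partition_fn(self.adj))
        self.cut_edges = sum(
            1 for u, v in self.edges if self.labels[u] != self.labels[v]
        )
        self.baseline = self._proxy()
        self.invocations += 1

    def add_edge(self, u, v):
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.edges.append((u, v))

        # Cheap placement of unseen vertices: inherit the neighbor's label,
        # or open a temporary block when both endpoints are new.
        if u not in self.labels and v not in self.labels:
            self.labels[u] = self.labels[v] = ("new", u)
        elif u not in self.labels:
            self.labels[u] = self.labels[v]
        elif v not in self.labels:
            self.labels[v] = self.labels[u]

        if self.labels[u] != self.labels[v]:
            self.cut_edges += 1

        # Invoke the expensive partitioner only when the proxy has drifted.
        if self._proxy() - self.baseline > self.tolerance:
            self._repartition()


if __name__ == "__main__":
    # Stand-in partitioner for illustration only: split vertices by parity.
    def parity(adj):
        return {n: n % 2 for n in adj}

    p = IncrementalPartitioner(parity, tolerance=0.3)
    stream = [(0, 2), (2, 4), (4, 6), (1, 3), (3, 5), (5, 7),
              (0, 1), (0, 4), (2, 6), (1, 5), (0, 3), (2, 5), (4, 7)]
    for edge in stream:
        p.add_edge(*edge)
    print(len(stream), "edges,", p.invocations, "partitioner call(s)")
```

In this sketch each streamed edge costs constant work unless the trigger fires, so the saving comes entirely from how rarely the full partitioner runs. The choice of proxy and tolerance here is illustrative and would need tuning against a real clustering-error measure.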
