An Efficient Map-Reduce Algorithm for the Incremental Computation of All-Pairs Shortest Paths in Social Networks

Social networks keep growing, and datasets with millions of nodes and billions of edges are no longer uncommon. As a network of social relationships evolves through the addition of new nodes and edges, fast algorithms are needed to recompute key network measures such as actor centrality. The distributed computing paradigm offers a scalable approach to this recomputation challenge. This paper develops a Map-Reduce implementation of an incremental All-Pairs Shortest Path (APSP) algorithm. The incremental nature of the approach keeps the work needed to update centrality measures to a minimum, while the Map-Reduce implementation makes it scalable to large data. The key idea of the incremental APSP algorithm [1] is to reuse previously computed information about the shortest paths between every node and the neighbors of the newly added node. The parallelized version of the algorithm presented here relies on a three-step iterative execution of "map" and "reduce" jobs. We report our experience with the implementation on a real-world dataset containing 7115 nodes. The experimental runs were performed on Amazon's EMR service.
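The reuse of past shortest-path information can be illustrated with a minimal single-machine sketch. It is not the algorithm of [1] or the paper's Map-Reduce jobs; it assumes an unweighted, undirected graph whose distances are stored in a nested dictionary, and the names `add_node`, `dist`, and `neighbors` are hypothetical.

```python
INF = float("inf")


def add_node(dist, new_node, neighbors):
    """Return all-pairs distances after attaching new_node to `neighbors`.

    Assumes an unweighted, undirected graph; dist[s][t] holds the known
    shortest-path length between existing nodes s and t (0 when s == t).
    """
    nodes = list(dist)

    # Step 1: the new node reaches u through whichever neighbor is closest to u.
    to_new = {
        u: 1 + min((dist[n][u] for n in neighbors), default=INF)
        for u in nodes
    }

    # Step 2: an old pair (s, t) improves only if routing via the new node is shorter.
    updated = {
        s: {t: min(dist[s][t], to_new[s] + to_new[t]) for t in nodes}
        for s in nodes
    }

    # Step 3: add the row and column for the new node itself.
    for u in nodes:
        updated[u][new_node] = to_new[u]
    updated[new_node] = dict(to_new)
    updated[new_node][new_node] = 0
    return updated


# Usage: a path graph a - b - c - d, then a new node e linked to a and d.
order = ["a", "b", "c", "d"]
path = {s: {t: abs(i - j) for j, t in enumerate(order)} for i, s in enumerate(order)}
result = add_node(path, "e", ["a", "d"])
```

In this sketch no shortest path is recomputed from scratch: every new value is derived from distances that were already known. Each pair in step 2 is also computed independently of the others, which is the kind of work that can be split across parallel tasks.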

[1] Sushant S. Khopkar. Incremental algorithms for centrality metric calculations in social network analysis, 2010.

[2] Peter S Loubai. A network evaluation procedure, 1967.

[3] Valerie King, et al. Fully dynamic algorithms for maintaining all-pairs shortest paths and transitive closure in digraphs, 1999, 40th Annual Symposium on Foundations of Computer Science.

[4] David Eppstein, et al. Sparsification—a technique for speeding up dynamic graph algorithms, 1997, JACM.

[5] Giuseppe F. Italiano, et al. Incremental algorithms for minimal length paths, 1991, SODA '90.

[6] Thomas W. Reps, et al. An Incremental Algorithm for a Generalization of the Shortest-Path Problem, 1996, J. Algorithms.

[7] V. V. Rodionov. The parametric problem of shortest distances, 1968.

[8] L. Freeman. Centrality in social networks: conceptual clarification, 1978.

[9] Robert E. Tarjan, et al. Maintaining bridge-connected and biconnected components on-line, 1992, Algorithmica.

[10] Giuseppe F. Italiano, et al. A new approach to dynamic all pairs shortest paths, 2003, STOC '03.

[11] Giuseppe F. Italiano, et al. Fully dynamic all pairs shortest paths with real edge weights, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[12] Philip N. Klein, et al. Faster Shortest-Path Algorithms for Planar Graphs, 1997, J. Comput. Syst. Sci.

[13] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[14] Jimmy J. Lin, et al. Book Reviews: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer, 2010, CL.

[15] Chih-Chung Lin, et al. On the dynamic shortest path problem, 1991.