Fast incremental SimRank on link-evolving graphs

SimRank is an arresting measure of node-pair similarity based on hyperlinks. It iteratively follows the concept that two nodes are similar if they are referenced by similar nodes. Real graphs are often large, and their links constantly evolve with small changes over time. This paper considers fast incremental computation of SimRank on link-evolving graphs. The prior approach [12] to this issue first factorizes the graph via a singular value decomposition (SVD), and then incrementally maintains this factorization under link updates at the expense of exactness. Consequently, all node-pair similarities are estimated in O(r⁴n²) time on a graph of n nodes, where r is the target rank of the low-rank approximation, which is not negligibly small in practice. In this paper, we propose a novel fast incremental paradigm. (1) We characterize the SimRank update matrix ΔS, in response to every link update, via a rank-one Sylvester matrix equation. By virtue of this, we devise a fast incremental algorithm that computes the similarities of all n² node pairs in O(Kn²) time for K iterations. (2) We also propose an effective pruning technique that captures the “affected areas” of ΔS to skip unnecessary computations, without loss of exactness. This further accelerates the incremental SimRank computation to O(K(nd+|AFF|)) time, where d is the average in-degree of the old graph and |AFF| (≤ n²) is the size of the “affected areas” in ΔS; in practice, |AFF| ≪ n². Our empirical evaluations verify that our algorithm (a) outperforms the best known link-update algorithm [12], and (b) runs much faster than its batch counterpart when link updates are small.
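To make item (1) concrete, the following is a minimal sketch, not the paper's algorithm as published. It assumes the commonly used matrix form of SimRank, S = C·Q·S·Qᵀ + (1 − C)·I, where Q is the row-normalised in-link matrix and C is the decay factor. Under that assumption, inserting one link into node j changes only row j of Q, so the change in Q is a rank-one matrix e_j·vᵀ, and ΔS can be written as M + Mᵀ, where M solves the rank-one Sylvester equation M = C·Q̃·M·Q̃ᵀ + C·e_j·γᵀ. The function names, the vector γ as derived here, and the example graph are illustrative assumptions; the pruning of item (2) is not covered.

```python
import numpy as np

def backward_transition(n, edges):
    """Row-normalised in-link matrix: Q[i, j] = 1/indeg(i) if j -> i is an edge."""
    Q = np.zeros((n, n))
    for src, dst in edges:
        Q[dst, src] = 1.0
    indeg = Q.sum(axis=1, keepdims=True)
    return np.divide(Q, indeg, out=np.zeros_like(Q), where=indeg > 0)

def batch_simrank(Q, C=0.6, K=50):
    """Batch baseline: iterate S <- C Q S Q^T + (1 - C) I (assumed matrix form)."""
    n = Q.shape[0]
    S = (1 - C) * np.eye(n)
    for _ in range(K):
        S = C * Q @ S @ Q.T + (1 - C) * np.eye(n)
    return S

def incremental_simrank(Q_old, S_old, Q_new, j, C=0.6, K=50):
    """Update S_old after a link change that alters only row j of Q."""
    n = Q_old.shape[0]
    v = Q_new[j] - Q_old[j]              # Q_new - Q_old = e_j v^T (rank one)
    e_j = np.zeros(n)
    e_j[j] = 1.0
    # gamma is chosen so that Q_new S Q_new^T - Q_old S Q_old^T
    # equals e_j gamma^T + gamma e_j^T (derived for this sketch).
    gamma = Q_old @ (S_old @ v) + 0.5 * (v @ S_old @ v) * e_j
    # M = sum_k C^(k+1) (Q_new^k e_j)(Q_new^k gamma)^T: one O(n^2) outer
    # product per iteration, hence O(K n^2) overall.
    xi, eta = e_j, gamma
    M = np.zeros((n, n))
    for k in range(K):
        M += C ** (k + 1) * np.outer(xi, eta)
        xi, eta = Q_new @ xi, Q_new @ eta
    return S_old + M + M.T

# Hypothetical 4-node graph; the link 1 -> 2 is inserted afterwards.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)]
Q_old = backward_transition(4, edges)
S_old = batch_simrank(Q_old)
Q_new = backward_transition(4, edges + [(1, 2)])
S_inc = incremental_simrank(Q_old, S_old, Q_new, j=2)
S_batch = batch_simrank(Q_new)
print(np.abs(S_inc - S_batch).max())     # gap between incremental and batch results
```

Each iteration of the incremental loop costs two matrix-vector products and one outer product, whereas the batch iteration needs two full matrix-matrix products. The sketch stores dense matrices for clarity; exploiting the sparsity of Q and of ΔS would be needed to approach the O(K(nd+|AFF|)) bound stated above.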

[1] Pavel Berkhin, et al. A Survey on PageRank Computing, 2005, Internet Math.

[2] Yasuhiro Fujiwara, et al. Fast and Exact Top-k Search for Random Walk with Restart, 2012, Proc. VLDB Endow.

[3] Yizhou Sun, et al. Fast computation of SimRank for static and dynamic information networks, 2010, EDBT '10.

[4] Pavel Velikhov, et al. Accuracy estimate and optimization techniques for SimRank computation, 2008, The VLDB Journal.

[5] Xuemin Lin, et al. SimFusion+: extending simfusion towards efficient estimation on large and dynamic networks, 2012, SIGIR '12.

[6] Hongyan Liu, et al. Fast Single-Pair SimRank Computation, 2010, SDM.

[7] Xuemin Lin, et al. Towards efficient SimRank computation on large networks, 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[8] Yasuhiro Fujiwara, et al. Efficient search algorithm for SimRank, 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[9] Jaideep Srivastava, et al. Incremental page rank computation on evolving graphs, 2005, WWW '05.

[10] Jian Pei, et al. More is Simpler: Effectively and Efficiently Assessing Node-Pair Similarities Based on Hyperlinks, 2013, Proc. VLDB Endow.

[11] Philip S. Yu, et al. PathSim, 2011, Proc. VLDB Endow.

[12] Dániel Fogaras, et al. Practical Algorithms and Lower Bounds for Similarity Search in Massive Graphs, 2007, IEEE Transactions on Knowledge and Data Engineering.

[13] Christopher Olston, et al. What's new on the web?: the evolution of the web from a search engine perspective, 2004, WWW '04.

[14] Hong Chen, et al. Parallel SimRank computation on large graphs with iterative aggregation, 2010, KDD.

[15] Xuemin Lin, et al. IRWR: incremental random walk with restart, 2013, SIGIR.

[16] Jennifer Widom, et al. SimRank: a measure of structural-context similarity, 2002, KDD.

[17] Niklas Carlsson, et al. Evolution of an online social aggregation network: an empirical study, 2009, IMC '09.

[18] Ioannis Antonellis, et al. Simrank++: query rewriting through link analysis of the clickgraph (poster), 2007, Proc. VLDB Endow.

[19] Dániel Fogaras, et al. Scaling link-based similarity search, 2005, WWW '05.

[20] Laks V. S. Lakshmanan, et al. On Top-k Structural Similarity Search, 2012, 2012 IEEE 28th International Conference on Data Engineering.

[21] Sreenivas Gollapudi, et al. Estimating PageRank on graph streams, 2008, PODS.

[22] Ashish Goel, et al. Fast Incremental and Personalized PageRank, 2010, Proc. VLDB Endow.