The performance of page rank algorithm under degree preserving perturbations

PageRank is a ranking algorithm based on a random surfer model. It is used in the Google search engine and in many other domains. Because of its early success at Google, PageRank has become the de facto choice for ranking nodes in a network. Despite the algorithm's ubiquity, little is known about how network topology affects its performance. This paper therefore examines the performance of PageRank under different topological conditions. Using scale-free networks, random networks, and a custom search engine that we implemented, we show experimentally that the performance of PageRank deteriorates when a random network is perturbed. In contrast, the scale-free topology is shown to be resilient to degree-preserving perturbations, which helps PageRank deliver consistent results across multiple networks perturbed to varying degrees. Not only do the top-ranked results emerge as stable nodes, but the overall performance of the algorithm also remains remarkably resilient. This deepens our understanding of the risks of applying PageRank without first analyzing the underlying network structure. The results suggest that while PageRank can be applied to scale-free networks with relatively low risk, applying it to other topologies can be risky as well as misleading. The real-world success of PageRank in search engines such as Google is therefore at least partly due to the fact that the World Wide Web is a scale-free network. Since the World Wide Web is constantly evolving, we postulate that if its topology changes enough to lose some of its scale-free nature, PageRank will not be as effective.
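As an illustration of the kind of experiment described above, the following is a minimal sketch rather than the authors' implementation. It assumes an undirected simple graph, uses a double-edge swap as the degree-preserving perturbation (the abstract does not name the procedure used), a damping factor of 0.85, and top-k overlap as a stand-in stability measure. The toy graph and all function names are hypothetical.

```python
import random
from collections import defaultdict


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected edge list."""
    nbrs = defaultdict(list)
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if not nbrs[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            if nbrs[u]:
                share = damping * rank[u] / len(nbrs[u])
                for v in nbrs[u]:
                    new[v] += share
        rank = new
    return rank


def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Double-edge swap: (a,b),(c,d) -> (a,d),(c,b), keeping every degree fixed."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degrees(edges):
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def top_k(rank, k):
    return set(sorted(rank, key=rank.get, reverse=True)[:k])


# Hypothetical toy graph: a 10-node ring plus extra links into node 0.
nodes = list(range(10))
edges = [(i, (i + 1) % 10) for i in range(10)] + [(0, i) for i in range(2, 7)]

rewired = degree_preserving_rewire(edges, n_swaps=5, seed=1)
before = pagerank(edges, nodes)
after = pagerank(rewired, nodes)
overlap = len(top_k(before, 3) & top_k(after, 3))
print("top-3 overlap after rewiring:", overlap)
```

Repeating the rewiring with an increasing number of swaps, on graphs drawn from different degree distributions, would give a rough picture of how ranking stability depends on topology.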
