Ranking nodes in growing networks: When PageRank fails

PageRank is arguably the most popular ranking algorithm and is applied in real systems ranging from information to biological and infrastructure networks. Despite its popularity and broad use in different areas of science, the relation between the algorithm's efficacy and the properties of the network on which it acts is not yet fully understood. Here we study PageRank's performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail to identify the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggests that time-dependent algorithms based on the temporal linking patterns of these systems are needed to better rank the nodes.
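To make the age effect concrete, the following is a minimal sketch, not the model studied in the paper. It assumes a toy growth rule in which each new node links to a few uniformly random older nodes, with no differences in intrinsic node value, and it computes PageRank by plain power iteration. The function names `pagerank` and `grow_network`, the damping factor 0.85, and all parameter values are illustrative assumptions.

```python
import random


def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph given as {node: [targets]}."""
    nodes = list(out_links)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(score[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            for t in targets:
                new[t] += damping * score[v] / len(targets)
        delta = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tol:
            break
    return score


def grow_network(n_nodes, links_per_node=3, seed=0):
    """Toy growth rule (an assumption): each new node links to random older nodes."""
    rng = random.Random(seed)
    out_links = {0: []}  # the first node has nobody to link to
    for new in range(1, n_nodes):
        k = min(links_per_node, new)
        out_links[new] = rng.sample(range(new), k)
    return out_links


graph = grow_network(500)
scores = pagerank(graph)

# Compare the average score of the 50 oldest and the 50 youngest nodes.
old_mean = sum(scores[v] for v in range(50)) / 50
young_mean = sum(scores[v] for v in range(450, 500)) / 50
print(f"mean score, oldest 50 nodes:   {old_mean:.5f}")
print(f"mean score, youngest 50 nodes: {young_mean:.5f}")
```

In this toy setting all nodes are statistically equivalent apart from their age, yet the oldest nodes collect a larger average score simply because they have had more time to receive links. This illustrates, under the stated assumptions only, why a static score can be dominated by node age in a growing network.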
