Bounded link prediction for very large networks

Evaluating link prediction methods on very large complex networks is hard because scoring every node pair is computationally prohibitive. If only node pairs whose similarity score exceeds a lower bound are considered, however, the cost drops substantially. In this paper, we study the common neighbors (CN) index within this bounded link prediction framework, which is applicable to enormous heterogeneous networks. Specifically, we propose a fast algorithm, based on a parallel computing scheme, that obtains all node pairs with CN values larger than the lower bound. Furthermore, we propose a general measure, called self-predictability, that quantifies the performance of similarity indices in link prediction and also indicates the link predictability of a network with respect to a given similarity index.
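
As a rough illustration of the bounded idea, the sketch below collects every node pair whose number of common neighbors (CN) is strictly larger than a lower bound, without scoring all pairs. It is a minimal single-machine sketch and not the parallel algorithm proposed in the paper: the function name `bounded_common_neighbors`, the edge-list input, and the toy graph are assumptions made for illustration. Pairs are counted by enumerating, for each node, the pairs of its neighbors, so pairs with no common neighbor are never touched.

```python
from collections import Counter
from itertools import combinations


def bounded_common_neighbors(edges, lower_bound):
    """Return {(u, v): CN(u, v)} for pairs with CN strictly greater than lower_bound.

    Hypothetical helper for illustration; assumes an undirected simple graph
    given as an edge list with sortable node labels.
    """
    # Build an adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Every node z contributes one common neighbor to each pair of its neighbors.
    counts = Counter()
    for z, nbrs in adj.items():
        for u, v in combinations(sorted(nbrs), 2):
            counts[(u, v)] += 1

    # Keep only the pairs above the lower bound.
    return {pair: c for pair, c in counts.items() if c > lower_bound}


# Toy graph (made up for this example).
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 5)]
result = bounded_common_neighbors(edges, lower_bound=1)
print(result)
```

The loop over nodes is independent per node, so the per-node counts could in principle be computed in separate workers and summed afterwards; whether the paper's scheme is organized this way is not stated in the abstract. The sketch also does not exclude pairs that are already linked, which a link prediction evaluation would normally filter out.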
