A Survey of Link Prediction in Complex Networks

Networks have become increasingly important for modeling complex systems composed of interacting elements. Network data mining has applications across many disciplines, with examples including protein-protein interaction networks, social networks, transportation networks, and telecommunication networks. Empirical studies have shown that new relationships between elements can be predicted from the topology of the network and the properties of its elements. The problem of predicting new relationships in networks is called link prediction. Link prediction aims to infer the behavior of the network's link formation process by predicting missing or future relationships from the currently observed connections. It has become an attractive area of study because it allows us to predict how networks will evolve. In this survey, we review the general-purpose techniques at the heart of the link prediction problem, which can be complemented in practice by domain-specific heuristic methods.
