An information-theoretic model for link prediction in complex networks

Various structural features of networks have been used to develop link prediction methods. However, because different features highlight different aspects of network structure, it is difficult to combine all available features in a single method. In this paper, we investigate the role of network topology in predicting missing links from the perspective of information theory. From this perspective, the contribution of each structural feature to link prediction is measured by the amount of information it provides. We then propose an information-theoretic model that is applicable to multiple structural features. Furthermore, based on this model, we design a novel link prediction index called Neighbor Set Information (NSI). According to our experimental results, the NSI index performs well on real-world networks compared with other typical proximity indices.
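
To illustrate what measuring a feature "by the amount of information it provides" can mean, the following is a minimal sketch using standard Shannon self-information. The symbols $L_{xy}$ (the event that nodes $x$ and $y$ are linked) and $\omega$ (an observed structural feature of the pair) are notation assumed here for illustration; the abstract does not define them, and this is not the NSI index itself.

```latex
% Self-information of an event with probability p (standard definition):
I(p) = -\log p

% Uncertainty about a candidate link before and after observing a feature \omega:
I(L_{xy}) = -\log P(L_{xy}), \qquad
I(L_{xy} \mid \omega) = -\log P(L_{xy} \mid \omega)

% Information the feature contributes about the link (pointwise mutual information):
I(L_{xy}; \omega) = I(L_{xy}) - I(L_{xy} \mid \omega)
                  = \log \frac{P(L_{xy} \mid \omega)}{P(L_{xy})}
```

Under this reading, a feature that makes a link more probable than its prior contributes positive information, and a smaller conditional self-information $I(L_{xy} \mid \omega)$ corresponds to a more likely link.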
