Nonnegative matrix factorization algorithms for link prediction in temporal networks using graph communicability

Abstract: Networks derived from many disciplines, such as social relations, web contents, and cancer progression, are temporal and incomplete. Link prediction in temporal networks is of theoretical interest and practical significance because spurious links are critical for investigating evolving mechanisms. In this study, we address the temporal link prediction problem in networks, i.e., predicting links at time T + 1 based on a given temporal network from time 1 to T. To clarify the relationships among matrix decomposition-based algorithms, we prove the equivalence between the eigendecomposition and nonnegative matrix factorization (NMF) algorithms, which serves as the theoretical foundation for designing NMF-based algorithms for temporal link prediction. A novel NMF-based algorithm is proposed based on this equivalence. The algorithm factorizes each network to obtain features using graph communicability, and then collapses the feature matrices to predict temporal links. Compared with state-of-the-art methods, the proposed algorithm exhibits significantly improved accuracy by avoiding the collapse of the temporal networks themselves. Experimental results on a number of artificial and real temporal networks illustrate that the proposed method is not only more accurate but also more robust than state-of-the-art approaches.
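The pipeline described in the abstract (communicability per snapshot, NMF features, collapsed features, link scores) can be illustrated with a minimal sketch. It is not the authors' exact formulation. It assumes undirected snapshots, takes communicability to be the matrix exponential of the adjacency matrix, fits one shared basis `w` with per-snapshot features using standard multiplicative updates, and collapses the features with a geometric weight. The function names and the `rank` and `decay` parameters are hypothetical.

```python
import numpy as np

def communicability(adj):
    """exp(A) for a symmetric adjacency matrix, via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(adj)
    comm = (eigvecs * np.exp(eigvals)) @ eigvecs.T
    # exp(A) is entrywise nonnegative in exact arithmetic; clip rounding noise.
    return np.maximum(comm, 0.0)

def joint_nmf(mats, rank, n_iter=300, seed=0, eps=1e-9):
    """Fit X_t ~= w @ h_t with a shared basis w (assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n = mats[0].shape[0]
    w = rng.random((n, rank)) + eps
    hs = [rng.random((rank, n)) + eps for _ in mats]
    for _ in range(n_iter):
        # Multiplicative update of each snapshot's feature matrix.
        for x, h in zip(mats, hs):
            h *= (w.T @ x) / (w.T @ w @ h + eps)
        # Multiplicative update of the shared basis over all snapshots.
        numer = sum(x @ h.T for x, h in zip(mats, hs))
        denom = w @ sum(h @ h.T for h in hs) + eps
        w *= numer / denom
    return w, hs

def predict_scores(snapshots, rank=2, decay=0.5):
    """Score node pairs for time T + 1 from adjacency snapshots 1..T."""
    mats = [communicability(a) for a in snapshots]
    w, hs = joint_nmf(mats, rank)
    T = len(hs)
    # Collapse the feature matrices; recent snapshots get larger weights.
    h_collapsed = sum(decay ** (T - t) * h for t, h in enumerate(hs, start=1))
    scores = w @ h_collapsed
    scores = (scores + scores.T) / 2.0  # symmetrize for undirected links
    np.fill_diagonal(scores, 0.0)       # ignore self-links
    return scores
```

Higher entries of the returned matrix mark node pairs that the sketch considers more likely to be linked at time T + 1. Sharing one basis across snapshots is a design choice made here so that the collapsed features refer to the same latent dimensions.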
