Finding missing edges and communities in incomplete networks

Many algorithms have been proposed for predicting missing edges in networks, but they usually do not take into account which edges are missing. We focus on networks whose missing edges follow the patterns likely to occur in real networks, and compare algorithms for finding these missing edges. We also investigate the effect of this kind of missing data on community detection algorithms.
