Discovering missing links in networks using vertex similarity measures

Vertex similarity measures are a useful tool for discovering hidden relationships between vertices in a complex network. We introduce relation strength similarity (RSS), a vertex similarity measure that can better capture potential relationships in real-world network structures. RSS is unique in three respects: it is an asymmetric measure, which makes it suitable for more general-purpose social network analysis; it allows users to explicitly specify the relation strength between neighboring vertices at initialization; and it offers a discovery range parameter that users can adjust to extend the search to more distant vertices. To show the potential of vertex similarity measures and the advantage of RSS over other measures, we conduct experiments on two real networks, a biological network and a coauthorship network. The experimental results show that RSS is better than the other measures at discovering the hidden relationships in these networks.
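The abstract does not give the RSS formula, so the following is only a minimal sketch of one plausible formulation, not the authors' definition. It assumes that each edge weight is normalized by the total weight of the edges at its starting vertex, that the strength of a path is the product of these normalized weights, and that the similarity is the sum over all simple paths no longer than the discovery range. The function names `normalized_strength` and `path_similarity`, the parameter `max_hops`, and the toy graph are all hypothetical.

```python
from collections import defaultdict


def normalized_strength(weights):
    """Turn undirected edge weights {(u, v): w} into directed strengths.

    Assumption: R[u][v] = w(u, v) / (sum of weights of all edges at u),
    so R[u][v] and R[v][u] generally differ.
    """
    adj = defaultdict(dict)
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w
    return {
        u: {v: w / sum(nbrs.values()) for v, w in nbrs.items()}
        for u, nbrs in adj.items()
    }


def path_similarity(R, source, target, max_hops):
    """Sum, over simple paths of at most max_hops edges, of the product
    of directed strengths along the path (max_hops plays the role of a
    discovery range in this sketch)."""

    def walk(node, product, visited, hops):
        if node == target:
            return product
        if hops == max_hops:
            return 0.0
        total = 0.0
        for nxt, r in R[node].items():
            if nxt not in visited:
                total += walk(nxt, product * r, visited | {nxt}, hops + 1)
        return total

    return walk(source, 1.0, {source}, 0)


# Hypothetical toy graph with user-specified relation strengths.
edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1, ("c", "d"): 1}
R = normalized_strength(edges)
score_a_to_d = path_similarity(R, "a", "d", max_hops=2)
score_d_to_a = path_similarity(R, "d", "a", max_hops=2)
```

In this toy graph the two scores differ (1/6 from `a` to `d`, 1/3 from `d` to `a`), because `d` has only one neighbor while `a` splits its strength between two. That is the kind of asymmetry the abstract refers to. Raising `max_hops` admits longer paths and can only increase a score, at the cost of enumerating more paths.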
