Using Link Prediction to Estimate the Collaborative Influence of Researchers

The influence of an individual in a scientific collaboration network can be measured in several ways. Estimating influence commonly requires computing costly global measures, which may be impractical on networks with hundreds of thousands of vertices. In this paper, we introduce new local measures to estimate the collaborative influence of individual researchers in a collaboration network. Our approach is based on link prediction: its underlying rationale is to assess how the presence or absence of a researcher affects the link prediction outcome in the network as a whole. It is natural to assume that removing a researcher with strong influence in the network will reduce the accuracy of link prediction. Scientists are represented as vertices in the collaboration graph, and vertex removal followed by link prediction is performed iteratively for all vertices, each vertex being handled independently. A supervised SVM model is adopted as the link predictor. The proposed approach has been tested on real collaboration networks spanning multiple time periods, with the networks processed so that recent collaborations are given more relevance than older ones. The experiments suggest that our measure of impact on link prediction has a high negative correlation with standard vertex importance measures such as betweenness and closeness centrality.
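
The vertex-removal idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. A simple common-neighbours rule stands in for the SVM predictor. The function names (`build_graph`, `remove_vertex`, `predict_links`, `recall`, `removal_impact`), the toy edge lists, and the definition of impact as the drop in recall on future links not involving the removed vertex are all assumptions made for illustration.

```python
from itertools import combinations


def build_graph(edges):
    """Build adjacency sets for an undirected co-authorship graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def remove_vertex(adj, x):
    """Return a copy of the graph without vertex x and its incident edges."""
    return {u: nbrs - {x} for u, nbrs in adj.items() if u != x}


def predict_links(adj, min_common=1):
    """Stand-in predictor (NOT the paper's SVM): predict a link between any
    two non-adjacent vertices sharing at least `min_common` neighbours."""
    predicted = set()
    for u, v in combinations(adj, 2):
        if v not in adj[u] and len(adj[u] & adj[v]) >= min_common:
            predicted.add(frozenset((u, v)))
    return predicted


def recall(adj, targets):
    """Fraction of target future links that the predictor recovers."""
    if not targets:
        return 0.0
    return len(predict_links(adj) & targets) / len(targets)


def removal_impact(train_edges, future_edges):
    """For each vertex, handled independently, compare prediction recall
    with and without that vertex (assumed impact definition)."""
    adj = build_graph(train_edges)
    future = {frozenset(e) for e in future_edges}
    impact = {}
    for x in adj:
        # Only score future links that do not involve x, so that both
        # graphs are evaluated against the same targets.
        targets = {e for e in future if x not in e}
        impact[x] = recall(adj, targets) - recall(remove_vertex(adj, x), targets)
    return impact


# Hypothetical toy data: "h" connects a, b and c in the earlier period.
train = [("a", "h"), ("b", "h"), ("c", "h"), ("c", "d")]
future = [("a", "b"), ("b", "c")]
impact = removal_impact(train, future)
```

In this toy network only the removal of `h` changes the prediction outcome, because `h` is the sole common neighbour behind both future links. Each vertex is processed independently of the others, so the loop could be parallelised on large networks.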
