Uncovering mechanisms of co-authorship evolution by multirelations-based link prediction

The ways in which a hybrid mechanism jointly influences co-authorship evolution are clarified. The mechanisms are categorized into groups, and the contribution of each group is computed. An improved meta-path-based model, called multirelations-based link prediction in a heterogeneous bibliographic network, is proposed. Experiments in Library and Information Science (LIS) show that co-authorship prediction accuracy is significantly improved by a hybrid mechanism, represented as a weighted combination of predictors.

A single mechanism is insufficient for a comprehensive understanding of co-authorship formation and evolution, because people choose to co-author with diverse motivations. How a hybrid mechanism jointly influences co-authorship evolution is not yet clear, which leads to the following research questions: (1) How do the mechanisms interact with one another, and how can multiple mechanisms be combined into the best hybrid mechanism? (2) How can the mechanisms be categorized into groups, and how does each group contribute to co-authorship evolution? This paper addresses these questions with an improved meta-path-based model called multirelations-based link prediction, which represents every mechanism and every combination of mechanisms as a predictor in a heterogeneous network and evaluates the predictors quantitatively via link prediction. Experiments are conducted in Library and Information Science (LIS). The results show that the most appropriate mechanism is a hybrid one, represented as a weighted combination of predictors. In addition, a comparison of the contributions of the categorized mechanisms shows that author-based mechanisms are more important than keyword-based and journal-based mechanisms. The results also indicate that information is lost when a heterogeneous bibliographic network is projected onto a homogeneous co-authorship network. Future work could add more predictive information to the model and apply the method to other types of heterogeneous networks.
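
The idea of treating each mechanism as a meta-path predictor and combining the predictors with weights can be illustrated with a minimal sketch. It is not the paper's implementation: the toy records, the three meta-paths chosen (author-paper-author-paper-author, author-paper-keyword-paper-author, author-paper-journal-paper-author), the plain path-count measure, and the weight values are all assumptions made for illustration. The abstract does not say how the weights are fitted, so the numbers below are placeholders that only mirror the qualitative ordering reported (author-based above keyword-based and journal-based).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy bibliographic records (not the paper's LIS data).
PAPERS = [
    {"authors": ["ann", "bob"], "keywords": ["link prediction"], "journals": ["J1"]},
    {"authors": ["bob", "cat"], "keywords": ["link prediction", "meta-path"], "journals": ["J1"]},
    {"authors": ["dan"], "keywords": ["meta-path"], "journals": ["J2"]},
    {"authors": ["eve"], "keywords": ["altmetrics"], "journals": ["J2"]},
]

# Placeholder weights, one per mechanism group (assumed, not fitted).
WEIGHTS = {"authors": 0.5, "keywords": 0.3, "journals": 0.2}


def profile(author, field):
    """Count author-paper-X path instances from `author` to each object X."""
    counts = Counter()
    for paper in PAPERS:
        if author in paper["authors"]:
            for obj in paper[field]:
                if obj != author:  # skip the trivial path back to the author
                    counts[obj] += 1
    return counts


def path_count(a, b, field):
    """Count instances of the symmetric meta-path A-P-X-P-A between a and b."""
    pa, pb = profile(a, field), profile(b, field)
    return sum(n * pb[x] for x, n in pa.items())


def hybrid_score(a, b, weights=WEIGHTS):
    """Weighted combination of the single-mechanism predictors."""
    return sum(w * path_count(a, b, field) for field, w in weights.items())


def rank_candidates():
    """Rank author pairs that have not yet co-authored, best score first."""
    authors = sorted({a for p in PAPERS for a in p["authors"]})
    linked = {frozenset(pair) for p in PAPERS
              for pair in combinations(p["authors"], 2)}
    candidates = [pair for pair in combinations(authors, 2)
                  if frozenset(pair) not in linked]
    return sorted(candidates, key=lambda pair: hybrid_score(*pair), reverse=True)


if __name__ == "__main__":
    for a, b in rank_candidates():
        print(a, b, round(hybrid_score(a, b), 3))
```

In this toy network, the pair `ann` and `cat` ranks first because all three meta-paths connect them (a shared co-author, a shared keyword, and a shared journal), whereas `dan` and `eve` are connected only through a journal. A projection onto a homogeneous co-authorship network would keep only the first kind of evidence, which is one way to read the information loss noted in the abstract.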
