Predicting and recommending collaborations: An author-, institution-, and country-level analysis

This study examines collaboration dynamics with the goal of predicting and recommending collaborations from the current network topology. Author-, institution-, and country-level collaboration networks are constructed from a ten-year data set of library and information science publications, and several statistical approaches are applied to them. For this data set in particular, higher-level collaboration networks (i.e., country-level networks) tend to yield more accurate predictions than lower-level ones (i.e., institution- and author-level networks). Based on the collaborations recommended for the data set, the study finds that neighbor-information-based approaches cluster more tightly on a 2-D multidimensional scaling map than topology-based ones. Limitations of the applied approaches on sparse collaboration networks are also discussed.
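
To make the idea of a neighbor-information-based approach concrete, the following is a minimal sketch of how unlinked pairs in a collaboration network might be scored and ranked as recommendations. The abstract does not name the specific measures used, so the three scores here (common neighbors, Jaccard, Adamic-Adar) are assumptions chosen as common examples of this family, and the toy edge list and all function names are hypothetical rather than taken from the study.

```python
from itertools import combinations
from math import log

# Hypothetical toy co-authorship edges (NOT the study's data set).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of collaborators shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared collaborators divided by all collaborators of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Shared collaborators, each weighted by 1 / log(degree)."""
    return sum(1.0 / log(len(adj[z])) for z in adj[u] & adj[v] if len(adj[z]) > 1)


def recommend(adj, score, top_k=3):
    """Rank currently unlinked pairs by score, highest first (ties by name)."""
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    ranked = sorted(candidates, key=lambda pair: (-score(adj, *pair), pair))
    return [(pair, score(adj, *pair)) for pair in ranked[:top_k]]


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for name, fn in [("common neighbors", common_neighbors),
                     ("Jaccard", jaccard),
                     ("Adamic-Adar", adamic_adar)]:
        print(name, recommend(adjacency, fn))
```

The same functions could in principle be applied at the author, institution, or country level by changing what a node represents. On a sparse network most candidate pairs share no neighbors and receive a score of zero, which is one way the limitation mentioned above can show up.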
