Toward a better scientific collaboration success prediction model through the feature space expansion

This study addresses the problem of predicting the success of a scientific collaboration from the scholars' previous collaborations using machine learning techniques. Because exploiting the collaboration network is essential in collaborator discovery systems, this article attempts to understand how to use the information embedded in such networks. We use the link structure among scholars, and between scholars and concepts, to extract a set of features that are correlated with collaboration success and that increase prediction performance. We also examine the effect of using aggregate methods other than the average and maximum when computing collaboration features from the features of the members. A dataset extracted from Northwestern University’s SciVal Expert is used to evaluate the proposed approach. The results demonstrate that the proposed collaboration features increase prediction performance when combined with widely used features such as the h-index and average citation counts. Consequently, the introduced features are appropriate for incorporation into collaborator discovery systems.
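As a rough illustration of computing collaboration-level features from member-level features, the sketch below aggregates one per-member value (here an h-index) with several functions. Only the average and maximum are named in the abstract; the other aggregates (minimum, sum, median, standard deviation), the function name `team_features`, and the sample values are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, median, pstdev

# Candidate aggregate functions. "mean" and "max" are the two named in the
# abstract; the remaining entries are hypothetical additions for illustration.
AGGREGATES = {
    "mean": mean,
    "max": max,
    "min": min,
    "sum": sum,
    "median": median,
    "std": pstdev,  # population standard deviation across team members
}


def team_features(member_values, name):
    """Aggregate one per-member feature into a dict of team-level features."""
    if not member_values:
        raise ValueError("a team needs at least one member")
    return {f"{name}_{agg}": fn(member_values) for agg, fn in AGGREGATES.items()}


# Made-up h-index values for a three-member team.
h_indices = [12, 30, 7]
features = team_features(h_indices, "h_index")

for key, value in features.items():
    print(key, value)
```

Each aggregate yields one column of the feature space, so adding aggregates expands the feature set without requiring any new member-level data.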
