Predicting the results of evaluation procedures of academics

Background: The 2010 reform of the Italian university system introduced the National Scientific Habilitation (ASN) as a requirement for applying for permanent professor positions. Since the CVs of the 59,149 candidates and the results of their assessments have been made publicly available, the ASN offers an opportunity to analyse a nation-wide evaluation process.

Objective: The main goals of this paper are: (i) to predict the ASN results using the information contained in the candidates' CVs; (ii) to identify a small set of quantitative indicators that can be used to make accurate predictions.

Approach: Semantic technologies are used to extract, systematize and enrich the information contained in the applicants' CVs, and machine learning methods are used to predict the ASN results and to identify a subset of relevant predictors.

Results: For predicting success in the role of associate professor, our best models using all predictors and the top 15 predictors make accurate predictions (F-measure values higher than 0.6) in 88% and 88.6% of the cases, respectively. Similar results were achieved for the role of full professor.

Evaluation: The proposed approach outperforms the other models developed to predict the results of researchers' evaluation procedures.

Conclusions: These results enable the development of an automated system for supporting both candidates and committees in future ASN sessions and in other scholars' evaluation procedures.
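The overall pipeline described in the abstract (quantitative indicators extracted from CVs as features, a supervised classifier trained to predict the habilitation outcome, feature ranking to isolate a small set of predictors, and the F-measure as the evaluation metric) can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' actual code: it assumes the CV-derived indicators are already available as a tabular dataset (synthetic data is generated here as a stand-in) and uses scikit-learn's RandomForestClassifier as one possible choice among the models typically compared in this kind of study.

```python
# Minimal sketch (not the authors' pipeline): predict habilitation outcomes
# from CV-derived quantitative indicators and rank the most useful predictors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Hypothetical stand-in for the real data: one row per candidate, columns are
# quantitative indicators extracted from the CVs, label is 1 if the candidate
# obtained the habilitation and 0 otherwise.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=15,
                           weights=[0.6, 0.4], random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Model using all available predictors.
clf_all = RandomForestClassifier(n_estimators=300, random_state=42)
clf_all.fit(X_train, y_train)
print("F-measure, all predictors:",
      round(f1_score(y_test, clf_all.predict(X_test)), 3))

# Rank predictors by importance and keep only the top 15, mirroring the
# paper's search for a small set of indicators that still predicts well.
top15 = np.argsort(clf_all.feature_importances_)[::-1][:15]
clf_top = RandomForestClassifier(n_estimators=300, random_state=42)
clf_top.fit(X_train[:, top15], y_train)
print("F-measure, top 15 predictors:",
      round(f1_score(y_test, clf_top.predict(X_test[:, top15])), 3))
```

The sketch only shows the shape of the train / rank-features / retrain-on-top-15 / score-with-F-measure loop; in the actual study the predictors come from the semantically enriched CV data and several classifiers and feature-selection strategies are compared.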
