Evaluating research and researchers by the journal impact factor: is it better than coin flipping?

Abstract The journal impact factor (JIF) is the average number of citations received by the papers published in a journal, calculated according to a specific formula; it is extensively used for the evaluation of research and researchers. The method assumes that all papers in a journal have the same scientific merit, which is measured by the JIF of the publishing journal. This implies that the number of citations measures scientific merit, yet the JIF does not evaluate each individual paper by its own number of citations. Therefore, in the comparative evaluation of two papers, the use of the JIF carries a risk of failure, which occurs whenever the paper in the lower-JIF journal has in fact received more citations than the paper in the higher-JIF journal. To quantify this risk, this study calculates failure probabilities, taking advantage of the lognormal distribution of citations. When the JIFs of the two journals differ ten-fold, the failure probability is low. In most comparisons, however, the JIFs of the two journals are not so different, and the failure probability can then approach 0.5, which is equivalent to evaluating by coin flipping.
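
The failure probability described above can be illustrated numerically. The following is a minimal sketch, assuming that citation counts in both journals follow lognormal distributions with a common dispersion parameter (sigma = 1.2, a plausible value for citation data) and with means set equal to the journals' JIFs; these parameter choices and the function name are illustrative assumptions, not the exact values or formula used in the study.

```python
import numpy as np
from scipy.stats import norm

def failure_probability(jif_low, jif_high, sigma=1.2):
    """Probability that a random paper from the lower-JIF journal
    receives more citations than a random paper from the higher-JIF
    journal, assuming lognormal citation distributions with a common
    sigma and means equal to the journals' JIFs (illustrative only)."""
    # For a lognormal, mean = exp(mu + sigma^2 / 2), so mu = ln(mean) - sigma^2 / 2.
    mu_low = np.log(jif_low) - sigma**2 / 2
    mu_high = np.log(jif_high) - sigma**2 / 2
    # ln(X_low) - ln(X_high) is normal with mean (mu_low - mu_high)
    # and variance 2 * sigma^2, so the exceedance probability is:
    return norm.cdf((mu_low - mu_high) / np.sqrt(2 * sigma**2))

print(failure_probability(2.0, 20.0))  # ten-fold JIF difference: ~0.09
print(failure_probability(4.0, 6.0))   # similar JIFs: ~0.41, close to a coin flip
```

Under these assumptions, a ten-fold difference in JIF yields a failure probability of roughly 0.09, whereas journals with JIFs of 4 and 6 yield roughly 0.41, consistent with the conclusion that comparisons between journals of similar JIF approach coin flipping.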
