Peer-Selected “Best Papers”—Are They Really That “Good”?

Background. Peer evaluation is the cornerstone of science evaluation. In this paper, we analyze whether one form of peer evaluation, the pre-publication selection of best papers at Computer Science (CS) conferences, is better than random selection when judged by the future citations the papers receive.

Methods. For 12 conferences (each over several years), we collected citation counts from Scopus for both the best papers and the non-best papers. For a different set of 17 conferences, we collected the corresponding data from Google Scholar. For each data set, we computed the proportion of best-paper/non-best-paper pairs in which the best paper has more citations. We also compared this proportion for papers published before and after 2010 to evaluate whether there is a propaganda effect. Finally, we counted the proportion of best papers that are among the top 10% and top 20% most cited papers of their conference instance.

Results. The probability that a best paper receives more citations than a non-best paper is 0.72 (95% CI: 0.66–0.77) for the Scopus data and 0.78 (95% CI: 0.74–0.81) for the Scholar data. There are no significant changes in these probabilities across years. Furthermore, 51% of the best papers are among the top 10% most cited papers of their conference/year, and 64% are among the top 20% most cited.

Discussion. There is strong evidence that the selection of best papers at Computer Science conferences is better than random selection, and that a substantial fraction of the best papers are among the most cited papers of their conferences.
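The two headline statistics can be computed directly from per-conference citation lists. The sketch below illustrates one way to do so in Python; it is not the authors' code. The `records` structure (a dict keyed by conference/year holding (citations, is_best) pairs), the strict-inequality treatment of ties, the percentile bootstrap resampled over conference/year instances, and the threshold rule for the top-10%/20% counts are all assumptions made for illustration.

```python
# Minimal sketch (assumptions, not the authors' code) of the two statistics in the
# abstract: the probability that a best paper out-cites a non-best paper from the
# same conference/year (with a percentile-bootstrap CI), and the share of best
# papers landing in the top-k% most cited of their conference instance.
import random


def prob_best_beats_nonbest(records):
    """Proportion of (best, non-best) pairs from the same conference/year in
    which the best paper has strictly more citations (ties count as losses;
    how the paper handles ties is an assumption here)."""
    wins, total = 0, 0
    for papers in records.values():          # papers: list of (citations, is_best)
        best = [c for c, is_best in papers if is_best]
        rest = [c for c, is_best in papers if not is_best]
        for bc in best:
            for rc in rest:
                total += 1
                wins += bc > rc
    return wins / total if total else float("nan")


def bootstrap_ci(records, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `stat`, resampling conference/year
    instances with replacement (the resampling unit is an assumption)."""
    rng = random.Random(seed)
    keys = list(records)
    samples = []
    for _ in range(n_boot):
        resampled = {i: records[rng.choice(keys)] for i in range(len(keys))}
        samples.append(stat(resampled))
    samples.sort()
    return samples[int(alpha / 2 * n_boot)], samples[int((1 - alpha / 2) * n_boot) - 1]


def share_of_best_in_top(records, fraction=0.10):
    """Share of best papers ranked within the top `fraction` most cited papers
    of their own conference/year (ties at the cutoff count as being in)."""
    in_top, n_best = 0, 0
    for papers in records.values():
        ranked = sorted((c for c, _ in papers), reverse=True)
        threshold = ranked[max(1, int(len(ranked) * fraction)) - 1]
        for c, is_best in papers:
            if is_best:
                n_best += 1
                in_top += c >= threshold
    return in_top / n_best if n_best else float("nan")
```

As a toy check, `records = {("ConfA", 2008): [(120, True), (15, False), (40, False)]}` gives `prob_best_beats_nonbest(records) == 1.0` and `share_of_best_in_top(records, 0.20) == 1.0`; the paper's reported 0.72 (Scopus) and 0.78 (Scholar), with their bootstrap intervals, correspond to this kind of computation over all conference/year instances.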
