Does double-blind peer review reduce bias? Evidence from a top computer science conference

Peer review is widely regarded as essential for advancing scientific research. However, reviewers may be biased by authors' prestige or other characteristics. Double-blind peer review, in which the authors' identities are masked from the reviewers, has been proposed as a way to reduce reviewer bias. Although intuitive, evidence for the effectiveness of double-blind peer review in reducing bias is limited and mixed. Here, we examine the effects of double-blind peer review on prestige bias by analyzing the peer review files of 5,027 papers submitted to the International Conference on Learning Representations (ICLR), a top computer science conference that changed its reviewing policy from single-blind to double-blind peer review in 2018. We find that after the switch to double-blind review, the scores given to the most prestigious authors decreased significantly. However, because many of these papers remained above the acceptance threshold, the change did not significantly affect acceptance decisions. Nevertheless, we show that double-blind peer review may have improved the quality of the selections by limiting other (non-author-prestige) biases. Specifically, papers rejected under the single-blind format are cited more than those rejected under the double-blind format, suggesting that double-blind review better identifies lower-quality papers. Interestingly, an apparently unrelated change, the switch from a 10-point to a 4-point rating scale, likely reduced prestige bias significantly, to an extent that affected acceptance decisions. These results provide some support for the effectiveness of double-blind review in reducing prestige bias, while opening new research directions on the impact of peer review formats.

* To whom correspondence should be addressed: Misha Teplitskiy, tepl@umich.edu

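To make the kind of before-and-after comparison described in the abstract concrete, the sketch below shows one way such an analysis could be set up. It is a minimal illustration only, assuming a hypothetical reviews.csv file with columns year, top_prestige, and score; none of these names, nor the specific test, come from the paper, and a real analysis would also need to account for the change in rating scale between the two eras.

```python
# Hypothetical sketch of the single-blind vs. double-blind comparison described above.
# The file name and column names (year, top_prestige, score) are assumptions,
# not the paper's actual data schema.
import pandas as pd
from scipy import stats

reviews = pd.read_csv("reviews.csv")  # one row per reviewer score for an ICLR submission

# Single-blind era (before 2018) vs. double-blind era (2018 onward).
single_blind = reviews[reviews["year"] < 2018]
double_blind = reviews[reviews["year"] >= 2018]

# Restrict to submissions flagged as having highly prestigious authors
# (top_prestige is assumed to be a boolean column).
sb_scores = single_blind.loc[single_blind["top_prestige"], "score"]
db_scores = double_blind.loc[double_blind["top_prestige"], "score"]

# NOTE: ICLR also moved from a 10-point to a 4-point rating scale, so real scores
# would have to be normalized before any direct comparison across eras.
t_stat, p_value = stats.ttest_ind(sb_scores, db_scores, equal_var=False)
print(f"single-blind mean score: {sb_scores.mean():.2f}")
print(f"double-blind mean score: {db_scores.mean():.2f}")
print(f"Welch's t = {t_stat:.2f}, p = {p_value:.3g}")
```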