Engineering Impacts of Anonymous Author Code Review: A Field Experiment

Code review is a powerful technique to ensure high-quality software and spread knowledge of best coding practices between engineers. Unfortunately, code reviewers may have biases about the authors of the code they are reviewing, which can lead to inequitable experiences and outcomes. In principle, anonymous author code review can reduce the impact of such biases by withholding an author's identity from a reviewer. In this paper, to understand the engineering effects of using anonymous author code review in a practical setting, we applied the technique to 5217 code reviews performed by 300 software engineers at Google. Our results suggest that during anonymous author code review, reviewers can frequently guess authors' identities; that focus on reviewer-author power dynamics is reduced; and that the practice poses a barrier to offline, high-bandwidth conversations. Based on our findings, we recommend that those who choose to implement anonymous author code review should reveal the time zone of the author by default, have a break-the-glass option for revealing author identity, and reveal author identity directly after the review.
