An exploratory study on confusion in code reviews

Felipe Ebert, Eindhoven University of Technology, The Netherlands (f.ebert@tue.nl)
Fernando Castor, Federal University of Pernambuco, Brazil (castor@cin.ufpe.br)
Nicole Novielli, University of Bari, Italy (nicole.novielli@uniba.it)
Alexander Serebrenik, Eindhoven University of Technology, The Netherlands (a.serebrenik@tue.nl)

Context: Code review is a widely used technique for systematically examining code changes with the aim of increasing software quality. Code reviews provide several benefits for a project, including finding bugs, transferring knowledge, and ensuring adherence to project guidelines and coding style. However, code reviews have a major cost: they can delay the merge of a code change and thus impact the overall development process. This cost can be even higher when developers do not understand something, i.e., when they face confusion during the code review.

Objective: This paper studies the phenomenon of confusion in code reviews. Understanding confusion is an important starting point for reducing the cost of code reviews, enhancing the effectiveness of the practice, and hence improving the development process.

Method: We conducted two complementary studies. The first aimed at identifying the reasons for confusion in code reviews, its impacts, and the coping strategies developers use to deal with it. We then surveyed developers to identify the most frequently experienced reasons for confusion, and conducted a systematic mapping study of solutions proposed for those reasons in the scientific literature.

Results: From the first study, we built a framework with 30 reasons for confusion, 14 impacts, and 13 coping strategies. The systematic mapping study identified 38 articles addressing the most frequent reasons for confusion. From those articles, we extracted 19 different solutions for confusion proposed in the literature, and established nine impacts related to the most frequent reasons for confusion.

Conclusions: Based on the solutions identified in the mapping study, or the lack thereof, we propose an actionable guideline for developers on how to cope with confusion during code reviews; we also make several suggestions for how tool builders can support code reviews. Additionally, we propose a research agenda for researchers studying code reviews.
