Assessing MCR Discussion Usefulness Using Semantic Similarity

Modern Code Review (MCR) is an informal practice in which reviewers virtually discuss proposed changes by adding comments through a code review tool or mailing list. It has received much research attention due to its perceived cost-effectiveness and its popularity in both industrial and OSS projects. Recent studies indicate a positive relationship between the number of review comments and code quality. However, little research has investigated how such discussion impacts software quality. The concern is that the informality of MCR encourages a focus on trivial, tangential, or unrelated issues. Indeed, we have observed that such comments are quite frequent and may even constitute the majority. We conjecture that effective MCR actually depends on a substantial number of comments that directly impact a proposed change (i.e., that are "useful"). To investigate this, a necessary first step is distinguishing review comments that are useful to a proposed change from those that are not. For a large OSS project such as our Qt case study, manually assessing its more than 72,000 comments is a daunting task. We propose semantic similarity as a practical, cost-efficient, and empirically verifiable approach for assisting with the manual usefulness assessment of MCR comments. Our case study results indicate that our approach can classify comments with an average F-measure of 0.73 and reduce comment usefulness assessment effort by about 77%.
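
The abstract does not specify the underlying similarity computation, so the following is a minimal sketch under one common assumption: a review comment is flagged as potentially useful when its TF-IDF cosine similarity to the text of the proposed change (e.g., commit message plus changed lines) exceeds a threshold. The function name is_useful, the 0.1 threshold, and the example strings are hypothetical illustrations, not the paper's actual configuration.

```python
# Minimal sketch of semantic-similarity-based usefulness classification.
# Assumption: TF-IDF vectors with cosine similarity and a fixed threshold;
# the paper's actual model and parameters may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def is_useful(comment: str, change_text: str, threshold: float = 0.1) -> bool:
    """Flag a review comment as useful if it is semantically similar
    to the proposed change (commit message plus changed lines)."""
    # Fit on just the two texts for illustration; a real setup would
    # fit the vectorizer on the full comment/change corpus.
    vectorizer = TfidfVectorizer(stop_words="english")
    vectors = vectorizer.fit_transform([comment, change_text])
    similarity = cosine_similarity(vectors[0:1], vectors[1:2])[0, 0]
    return similarity >= threshold


# Hypothetical example: a comment that names identifiers from the change
# scores higher than a tangential remark.
change = "Fix null pointer dereference in QNetworkReply abort handling"
print(is_useful("abort may dereference a null reply pointer", change))  # likely True
print(is_useful("please rebase onto the latest dev branch", change))    # likely False
```

In such a setup, the threshold trades precision against recall; their harmonic mean, F = 2PR / (P + R), is the F-measure reported above (0.73 on average in the case study).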
