Expanding the Number of Reviewers in Open-Source Projects by Recommending Appropriate Developers

Code review is an important part of the development of any software project. Recently, many open-source projects have begun practicing lightweight, tool-based code review (a.k.a. modern code review) to make the process simpler and more efficient. However, these practices still require reviewers, and there may be too few of them to ensure timely decisions. In this paper, we propose a recommender-based approach that open-source projects can use to increase the number of reviewers by drawing on appropriate developers. First, we motivate our approach with an exploratory study of nine projects hosted on GitHub and Gerrit. Second, we build the recommender system itself. Given a code change, it initially searches for relevant reviewers based on similarities between the reviewing history and the files affected by the change. It then augments this set with developers whose development history is similar to that of these reviewers but who have little or no relevant reviewing experience. To make these recommendations, we rely on collaborative filtering and, more precisely, on matrix factorization. Our evaluation shows that all nine projects could benefit from our system, both by getting recommendations of previous reviewers and by expanding the reviewer pool with appropriate developers.
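
The two-stage idea described above can be illustrated with a minimal sketch. It is not the matrix factorization model used in the paper: as a simpler stand-in, it scores candidates with Jaccard set overlap between file sets. The function names, the sample developers, and the file paths are all hypothetical, and "little or no relevant reviewing experience" is simplified here to "no reviewing experience at all".

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_reviewers(change_files, review_history, top_k=2):
    """Stage 1: rank past reviewers by the overlap between the files they
    have reviewed and the files affected by the change."""
    scores = {
        reviewer: jaccard(change_files, files)
        for reviewer, files in review_history.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, score in ranked if score > 0][:top_k]


def expand_reviewers(reviewers, dev_history, review_history, top_k=2):
    """Stage 2: add developers whose development history resembles that of
    the stage-1 reviewers but who have no reviewing experience (assumption)."""
    scores = defaultdict(float)
    for dev, files in dev_history.items():
        if dev in reviewers or review_history.get(dev):
            continue  # already recommended, or already has review experience
        for reviewer in reviewers:
            sim = jaccard(files, dev_history.get(reviewer, ()))
            scores[dev] = max(scores[dev], sim)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, score in ranked if score > 0][:top_k]


# Hypothetical sample data: files each person has reviewed or changed.
review_history = {
    "alice": {"core/db.py", "core/cache.py"},
    "bob": {"docs/index.md"},
}
dev_history = {
    "alice": {"core/db.py", "core/cache.py", "core/io.py"},
    "bob": {"docs/index.md"},
    "carol": {"core/db.py", "core/io.py"},
    "dave": {"ui/app.js"},
}
change_files = {"core/db.py"}

known = recommend_reviewers(change_files, review_history)
extra = expand_reviewers(known, dev_history, review_history)
print("previous reviewers:", known)
print("additional candidates:", extra)
```

With this sample data, the first stage selects the one past reviewer whose reviewed files overlap the change, and the second stage adds the developer with the most similar development history who has not reviewed before. A matrix factorization model would replace the explicit set overlap with similarities computed over learned latent factors.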
