An Empirical Study of Link Sharing in Review Comments

In pull-based development, developers exchange review comments and sometimes share links, namely Uniform Resource Locators (URLs), in them. Links refer to related information on different websites, which may benefit pull request evaluation. Nevertheless, little effort has been devoted to analyzing how links are shared and whether sharing links has any impact on code review in GitHub. In this paper, we conduct a study of link sharing in review comments. We collect 114,810 pull requests and 251,487 review comments from 10 popular projects in GitHub. We find that, on average, 5.25% of pull requests have links in their review comments. We divide links into two types: internal links, which point to content in the same project, and external links, which point to content outside the project. We observe that 51.49% of links are internal and 48.51% are external. The majority of internal links point to pull requests or blobs inside the project. We further study the impact of links. Results show that pull requests with links in review comments have more comments, more commenters, and longer evaluation time than pull requests without links. These findings show that developers do share links and refer to related information in review comments. The results can inform future studies that aim to enable more effective information sharing in the open source community and to improve information accessibility and navigability for software developers.
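The internal/external distinction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how links were extracted or classified, so the URL regular expression, the rule that a link is internal when it points to `github.com/<owner>/<repo>`, and the function names `extract_links`, `classify_link`, and `internal_target` are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption: a simple pattern for http(s) URLs in free text; the study's
# actual extraction rule is not given in the abstract.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"']+")


def extract_links(comment: str) -> list[str]:
    """Return the URLs found in a review comment, minus trailing punctuation."""
    return [url.rstrip(".,;:!?") for url in URL_PATTERN.findall(comment)]


def _path_parts(url: str) -> list[str]:
    """Split the URL path into its non-empty segments."""
    return [part for part in urlparse(url).path.split("/") if part]


def classify_link(url: str, owner: str, repo: str) -> str:
    """Label a link 'internal' if it points into github.com/<owner>/<repo>.

    Assumption: 'same project' is taken to mean the same GitHub repository.
    """
    host = urlparse(url).netloc.lower()
    parts = _path_parts(url)
    if host in ("github.com", "www.github.com") and len(parts) >= 2:
        if parts[0].lower() == owner.lower() and parts[1].lower() == repo.lower():
            return "internal"
    return "external"


def internal_target(url: str) -> str:
    """Return the kind of resource an internal link points to (e.g. 'pull', 'blob')."""
    parts = _path_parts(url)
    return parts[2] if len(parts) > 2 else "root"


if __name__ == "__main__":
    comment = (
        "See https://github.com/rails/rails/pull/123 and "
        "https://stackoverflow.com/q/1."
    )
    for link in extract_links(comment):
        print(link, classify_link(link, "rails", "rails"))
```

Under these assumptions, a link to a pull request or blob in the reviewed repository is counted as internal, and everything else, including links to other GitHub repositories, is counted as external.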
