Analysis of Newcomers Activity in Communicative Posts on GitHub

GitHub is a large platform that allows developers to host code repositories and collaborate on projects. With the growth of open-source software (OSS), many researchers have focused on various aspects of open-source communities. Because a wide range of projects is available, newcomers can get involved in projects that differ in the skills and experience they require. However, new developers often face barriers during the onboarding process. The aim of this paper is to investigate attitudes towards newcomers through sentiment analysis of the comments they receive on issues and pull requests in the repositories of the top 10 open-source projects by contributor count and the top 10 fastest-growing open-source projects, as listed in GitHub's The State of the Octoverse 2018 report. Using sentiment analysis, we focus on differences between reactions to the contributions of ‘old’ and ‘new’ developers, and find that while the majority of comments are rated as neutral, the amount of negativity is slightly higher for newcomers.
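The group comparison described above can be illustrated with a minimal sketch. It is not the paper's pipeline: the abstract does not name the sentiment tool, so a tiny hypothetical word list stands in for it, and the sample comments and the group labels "newcomer" and "established" are invented for illustration. The sketch only shows how the share of negative, neutral, and positive comments could be compared between the two groups.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicons (assumption): a stand-in for a real sentiment tool.
NEGATIVE = {"bad", "wrong", "broken", "useless"}
POSITIVE = {"thanks", "great", "nice", "good"}
LABELS = ("negative", "neutral", "positive")


def label(comment):
    """Classify one comment as negative, neutral, or positive by word counts."""
    words = {w.strip(".,!?").lower() for w in comment.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def shares_by_group(rows):
    """Return, per author group, the fraction of comments with each label.

    rows is an iterable of (group, comment_text) pairs.
    """
    counts = defaultdict(Counter)
    for group, text in rows:
        counts[group][label(text)] += 1
    return {
        group: {name: c[name] / sum(c.values()) for name in LABELS}
        for group, c in counts.items()
    }


# Invented sample data (assumption), only to exercise the functions.
SAMPLE = [
    ("newcomer", "This is wrong."),
    ("newcomer", "Please rebase on master."),
    ("established", "Thanks, looks great!"),
    ("established", "Merged."),
]

if __name__ == "__main__":
    for group, shares in shares_by_group(SAMPLE).items():
        print(group, shares)
```

In a real study the `label` function would be replaced by a sentiment classifier, and the resulting shares for the two groups would be compared statistically rather than by inspection.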
