Unusual Events in GitHub Repositories

In large and active software projects, it becomes impractical for a developer to stay aware of all project activity. While it might not be necessary to know about each commit or issue, it is arguably important to know about the ones that are unusual. To investigate this hypothesis, we identified unusual events in 200 GitHub projects, using a comprehensive list of ways in which an artifact can be unusual, and asked 140 developers responsible for or affected by these events to comment on the usefulness of the corresponding information. Based on 2,096 answers, we identify the subset of unusual events that developers consider particularly useful, including large code modifications and unusual amounts of reviewing activity, and we report qualitative evidence on the reasons behind these answers. Our findings provide a means of reducing the amount of information that developers need to parse in order to stay up to date with development activity in their projects.
