Analysing change profiles of open source software projects using burst detection

Software evolution refers to the continuous change and growth of software after its initial development. A version control system records information about these changes. Several studies have examined the historical change records of open source software (OSS) projects and found them useful for understanding the software evolution process. However, most of them investigate the distributions of change type, change size, and change effort in isolation. To the best of our knowledge, no prior work takes a combined view of these dimensions of a change. This study examines the change activity in 106 OSS projects from three points of view: change purpose (type), change size, and change effort. Common patterns across change type, change size, and change effort are highlighted using the burst detection technique, which identifies the peaks in a time series and compares them with the peaks of other time series. The results indicate that, for the high- and moderate-activity clusters, the change-type activity of OSS projects is significantly related to change effort and change size. For the low-activity cluster, however, this commonality of patterns does not hold for all types of changes.
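
The idea of finding peaks in one time series and comparing them with the peaks of another can be illustrated with a minimal sketch. It is not the burst detection algorithm used in the study: the thresholding rule (mean plus k standard deviations), the Jaccard overlap measure, the function names, and the monthly figures are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def detect_bursts(series, k=1.0):
    """Return the set of indices whose value exceeds mean + k * std.

    Simplified stand-in for a burst detector (an assumption,
    not the technique used in the study).
    """
    threshold = mean(series) + k * pstdev(series)
    return {i for i, value in enumerate(series) if value > threshold}


def burst_overlap(bursts_a, bursts_b):
    """Jaccard overlap of two burst index sets (0.0 if both are empty)."""
    union = bursts_a | bursts_b
    if not union:
        return 0.0
    return len(bursts_a & bursts_b) / len(union)


# Hypothetical monthly data for one project (not taken from the study).
corrective_commits = [2, 3, 2, 15, 3, 2, 14, 2, 3, 2]       # change type
lines_changed = [40, 50, 45, 400, 60, 50, 55, 380, 45, 50]  # change size

type_bursts = detect_bursts(corrective_commits)
size_bursts = detect_bursts(lines_changed)
overlap = burst_overlap(type_bursts, size_bursts)

print(sorted(type_bursts), sorted(size_bursts), round(overlap, 2))
```

A high overlap would suggest that bursts in one dimension tend to coincide with bursts in the other. Judging whether such a relation is statistically significant, as the study does, would require a proper test on top of this.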
