Can we predict types of code changes? An empirical analysis

There exist many approaches that help in pointing developers to the change-prone parts of a software system. Although beneficial, they mostly fall short in providing details of these changes. Fine-grained source code changes (SCC) capture such detailed code changes and their semantics on the statement level. These SCC can be condition changes, interface modifications, inserts or deletions of methods and attributes, or other kinds of statement changes. In this paper, we explore prediction models for whether a source file will be affected by a certain type of SCC. These predictions are computed on the static source code dependency graph and use social network centrality measures and object-oriented metrics. For that, we use change data of the Eclipse platform and the Azureus 3 project. The results show that Neural Network models can predict categories of SCC types. Furthermore, our models can output a list of the potentially change-prone files ranked according to their change-proneness, overall and per change type category.

[1]  Stan Matwin,et al.  Mining the maintenance history of a legacy software system , 2003, International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings..

[2]  Jens H. Weber,et al.  Predicting maintainability with object-oriented metrics -an empirical comparison , 2003, 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings..

[3]  Yuming Zhou,et al.  Predicting object-oriented software maintainability using multivariate adaptive regression splines , 2007, J. Syst. Softw..

[4]  Andreas Zeller,et al.  Mining Version Histories to Guide Software Changes , 2004 .

[5]  Harald C. Gall,et al.  Change Distilling:Tree Differencing for Fine-Grained Source Code Change Extraction , 2007, IEEE Transactions on Software Engineering.

[6]  Nicolas Ducheneaut,et al.  Socialization in an Open Source Software Community: A Socio-Technical Analysis , 2005, Computer Supported Cooperative Work (CSCW).

[7]  Premkumar T. Devanbu,et al.  Latent social structure in open source projects , 2008, SIGSOFT '08/FSE-16.

[8]  Ramanath Subramanyam,et al.  Empirical Analysis of CK Metrics for Object-Oriented Design Complexity: Implications for Software Defects , 2003, IEEE Trans. Software Eng..

[9]  Lionel C. Briand,et al.  Dynamic coupling measurement for object-oriented software , 2004, IEEE Transactions on Software Engineering.

[10]  Martin P. Robillard,et al.  Non-essential changes in version histories , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[11]  H. Dieter Rombach,et al.  A Controlled Expeniment on the Impact of Software Structure on Maintainability , 1987, IEEE Transactions on Software Engineering.

[12]  A. Zeller,et al.  Predicting Defects for Eclipse , 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[13]  Shih-Kun Huang,et al.  Mining version histories to verify the learning process of Legitimate Peripheral Participants , 2005, MSR '05.

[14]  Tim Menzies,et al.  Data Mining Static Code Attributes to Learn Defect Predictors , 2007, IEEE Transactions on Software Engineering.

[15]  Audris Mockus,et al.  Predicting risk of software changes , 2000, Bell Labs Technical Journal.

[16]  Brendan Murphy,et al.  Can developer-module networks predict failures? , 2008, SIGSOFT '08/FSE-16.

[17]  Romain Robbes,et al.  Logical Coupling Based on Fine-Grained Change Information , 2008, 2008 15th Working Conference on Reverse Engineering.

[18]  Premkumar T. Devanbu,et al.  Open Borders? Immigration in Open Source Projects , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[19]  Taghi M. Khoshgoftaar,et al.  Improving code churn predictions during the system test and maintenance phases , 1994, Proceedings 1994 International Conference on Software Maintenance.

[20]  Andreas Zeller,et al.  Predicting component failures at design time , 2006, ISESE '06.

[21]  Ahmed E. Hassan,et al.  Studying the impact of dependency network measures on software quality , 2010, 2010 IEEE International Conference on Software Maintenance.

[22]  Chris F. Kemerer,et al.  A Metrics Suite for Object Oriented Design , 2015, IEEE Trans. Software Eng..

[23]  P. Bonacich Power and Centrality: A Family of Measures , 1987, American Journal of Sociology.

[24]  S. Dowdy,et al.  Statistics for Research , 1983 .

[25]  Stanley Wasserman,et al.  Social Network Analysis: Methods and Applications , 1994, Structural analysis in the social sciences.

[26]  Harald C. Gall,et al.  Comparing fine-grained source code changes and code churn for bug prediction , 2011, MSR '11.

[27]  Natalia Juristo Juzgado,et al.  Using differences among replications of software engineering experiments to gain knowledge , 2009, 2009 3rd International Symposium on Empirical Software Engineering and Measurement.

[28]  Nachiappan Nagappan,et al.  Predicting defects using network analysis on dependency graphs , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[29]  Harald C. Gall,et al.  Change Analysis with Evolizer and ChangeDistiller , 2009, IEEE Software.

[30]  Yi Zhang,et al.  Classifying Software Changes: Clean or Buggy? , 2008, IEEE Transactions on Software Engineering.

[31]  Jesús M. González-Barahona,et al.  Applying Social Network Analysis to the Information in CVS Repositories , 2004, MSR.

[32]  L. Freeman Centrality in social networks conceptual clarification , 1978 .

[33]  Sallie M. Henry,et al.  Object-oriented metrics that predict maintainability , 1993, J. Syst. Softw..

[34]  Gail C. Murphy,et al.  Predicting source code changes by mining change history , 2004, IEEE Transactions on Software Engineering.

[35]  Victor R. Basili,et al.  A Validation of Object-Oriented Design Metrics as Quality Indicators , 1996, IEEE Trans. Software Eng..

[36]  Bart Baesens,et al.  Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings , 2008, IEEE Transactions on Software Engineering.

[37]  Ken-ichi Matsumoto,et al.  Accelerating cross-project knowledge collaboration using collaborative filtering and social networks , 2005, MSR.

[38]  Alexander Chatzigeorgiou,et al.  Predicting the probability of change in object-oriented systems , 2005, IEEE Transactions on Software Engineering.

[39]  Witold Pedrycz,et al.  A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[40]  Stéphane Ducasse,et al.  Yesterday's Weather: guiding early reverse engineering efforts by summarizing the evolution of changes , 2004, 20th IEEE International Conference on Software Maintenance, 2004. Proceedings..