Pieces of contextual information suitable for predicting co-changes? An empirical study

Models that predict co-changes among software artifacts have been proposed to assist developers in altering a software system, and they often rely on coupling. However, developers have not yet widely adopted these approaches, presumably because of the high number of false recommendations. In this work, we conjecture that contextual information related to software changes improves the accuracy of co-change prediction. This information is collected from issues (e.g., issue type and reporter), developers’ communication (e.g., number of issue comments, issue discussants, and words in the discussion), and commit metadata (e.g., number of lines added, removed, and modified). We built customized prediction models for each co-change and evaluated the approach on 129 releases from a curated set of 10 Apache Software Foundation projects. Comparing our approach with widely used association rules as a baseline, we found that contextual information models and association rules provide a similar number of co-change recommendations, but our models achieved a significantly higher F-measure. In particular, we found that contextual information significantly reduces the number of false recommendations compared to the baseline model. We conclude that contextual information is an important source for supporting change prediction and may be used to warn developers when they are about to miss relevant artifacts while performing a software change.
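To make the association-rule baseline and the F-measure comparison concrete, the following is a minimal sketch rather than the study's actual implementation. It assumes that each commit is reduced to the set of files it touched, mines pairwise rules by support and confidence, and scores the resulting recommendations against the files that actually co-changed. The function names (`mine_rules`, `recommend`, `f_measure`), the thresholds, and the toy history are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_rules(transactions, min_support=2, min_confidence=0.5):
    """Mine pairwise rules a -> b from commits (hypothetical thresholds).

    Returns {(a, b): (support, confidence)} where support is the number of
    commits touching both files and confidence is support / commits touching a.
    """
    file_count = Counter()
    pair_count = Counter()
    for files in transactions:
        unique = set(files)
        file_count.update(unique)
        # Ordered pairs, so a -> b and b -> a are counted as separate rules.
        pair_count.update(permutations(sorted(unique), 2))
    rules = {}
    for (a, b), support in pair_count.items():
        confidence = support / file_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


def recommend(rules, changed_file):
    """Files recommended as likely co-changes for one changed file."""
    return {b for (a, b) in rules if a == changed_file}


def f_measure(recommended, actual):
    """Harmonic mean of precision and recall over sets of file names."""
    true_positives = len(recommended & actual)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy commit history (invented data, not taken from the studied projects).
history = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["A.java", "C.java"],
    ["B.java"],
]
rules = mine_rules(history)
suggested = recommend(rules, "A.java")
score = f_measure(suggested, {"B.java"})
```

In this toy history, changing `A.java` triggers recommendations for both `B.java` and `C.java`. If only `B.java` actually co-changed, the extra recommendation is a false positive that lowers precision, and with it the F-measure. This is the kind of false recommendation that the contextual-information models are reported to reduce.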
