Exploring the Effects of History Length and Age on Mining Software Change Impact

The goal of Software Change Impact Analysis is to identify artifacts (typically source-code files) potentially affected by a change. Recently, there has been increased interest in mining software change impact based on evolutionary coupling. A particularly promising approach uses association rule mining to uncover potentially affected artifacts from patterns in the system's change history. Two main considerations when using this approach are history length, the number of transactions from the change history used to identify the impact of a change, and history age, the number of transactions that have occurred since patterns were last mined from the history. Although history length and age can significantly affect the quality of mining results, few guidelines exist on how best to select values for these two parameters. In this paper, we empirically investigate the effects of history length and age on the quality of change impact analysis using mined evolutionary couplings. Specifically, we report on a series of systematic experiments involving the change histories of two large industrial systems and 17 large open source systems. In these experiments, we vary the length and age of the history used to mine software change impact, and assess how this affects precision and applicability. We use the results of the study to derive practical guidelines for choosing history length and age when applying association rule mining to software change impact analysis.
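To make the two parameters concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a change history represented as a chronological list of transactions (sets of file names) and mines only simple pairwise co-change rules. The function names, the `min_conf` threshold, and the toy history are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def training_window(history, t, length, age):
    """Return the transactions used to mine rules for the change at index t.

    age    : transactions that occurred after mining and before the change
    length : number of transactions the rules are mined from
    """
    end = max(t - age, 0)
    start = max(end - length, 0)
    return history[start:end]


def mine_pair_rules(window):
    """Mine rules a -> b with confidence = count(a and b) / count(a)."""
    file_count = Counter()
    pair_count = Counter()
    for tx in window:
        file_count.update(tx)
        pair_count.update(permutations(sorted(tx), 2))
    return {(a, b): n / file_count[a] for (a, b), n in pair_count.items()}


def impact(rules, changed, min_conf=0.5):
    """Rank files likely affected by a change to the files in `changed`."""
    scores = {}
    for (a, b), conf in rules.items():
        if a in changed and b not in changed and conf >= min_conf:
            scores[b] = max(scores.get(b, 0.0), conf)
    # Highest confidence first; ties broken by file name.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical toy history, oldest transaction first.
history = [
    {"a.c", "b.c"},
    {"a.c", "b.c", "c.c"},
    {"a.c", "d.c"},
    {"c.c", "d.c"},
    {"a.c", "b.c"},
]

# Analyze the change at index 4, given that a.c was touched.
fresh = mine_pair_rules(training_window(history, t=4, length=3, age=0))
stale = mine_pair_rules(training_window(history, t=4, length=2, age=2))
print(impact(fresh, {"a.c"}))
print(impact(stale, {"a.c"}))
```

In this toy setting, changing `length` or `age` selects a different slice of the history, so the same query can produce different recommendations. That sensitivity is what the experiments in the paper measure at scale.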
