Improving Change Recommendation using Aggregated Association Rules

Past research has proposed association rule mining as a means to uncover evolutionary coupling from a system's change history. These couplings have various applications, such as improving system decomposition and recommending related changes during development. The strength of a coupling can be characterized using a variety of interestingness measures. Existing recommendation engines typically use only the rule with the highest interestingness value in situations where more than one rule applies. In contrast, we argue that multiple applicable rules indicate increased evidence, and hypothesize that the aggregation of such rules can be exploited to provide more accurate recommendations. To investigate this hypothesis, we conduct an empirical study on the change histories of two large industrial systems and four large open source systems. As aggregators we adopt three cumulative gain functions from information retrieval. The experiments evaluate the three aggregators using 39 different rule interestingness measures. The results show that aggregation has a significant impact on the values of most measures and, furthermore, leads to a significant improvement in the resulting recommendations.
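The contrast between using only the strongest applicable rule and aggregating all applicable rules can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rule set, the file names, the interestingness values, and the helper names are hypothetical. The abstract does not define the three aggregators, so the sketch uses plain cumulative gain (a sum) and a discounted cumulative gain with a base-2 logarithmic rank discount, which are common formulations in information retrieval and not necessarily the ones used in the study.

```python
from collections import defaultdict
from math import log2

# Hypothetical mined rules: (antecedent files, recommended file, interestingness).
RULES = [
    (frozenset({"a.c"}), "x.c", 0.6),
    (frozenset({"b.c"}), "x.c", 0.5),
    (frozenset({"a.c", "b.c"}), "y.c", 0.7),
    (frozenset({"z.c"}), "y.c", 0.9),  # not applicable to the query used below
]


def applicable(rules, query):
    """Keep rules whose antecedent is contained in the query (the changed files)
    and whose consequent is not already part of the query."""
    return [r for r in rules if r[0] <= query and r[1] not in query]


def cumulative_gain(values):
    """Assumed aggregator: the sum of all applicable rule values."""
    return sum(values)


def discounted_cumulative_gain(values):
    """Assumed aggregator: values sorted descending, with the value at
    rank >= 2 divided by log2(rank); ranks 1 and 2 are left undiscounted."""
    ranked = sorted(values, reverse=True)
    return sum(v / max(1.0, log2(rank)) for rank, v in enumerate(ranked, start=1))


def recommend(rules, query, aggregate):
    """Group applicable rules by consequent, aggregate their values, and
    return (file, score) pairs ordered from highest to lowest score."""
    by_consequent = defaultdict(list)
    for _, consequent, value in applicable(rules, query):
        by_consequent[consequent].append(value)
    scored = [(f, aggregate(vals)) for f, vals in by_consequent.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = frozenset({"a.c", "b.c"})
    print("max only:", recommend(RULES, query, max))
    print("CG:      ", recommend(RULES, query, cumulative_gain))
    print("DCG:     ", recommend(RULES, query, discounted_cumulative_gain))
```

In this toy data, taking only the maximum ranks `y.c` first because its single rule has the highest value, whereas both aggregators rank `x.c` first because two independent rules support it. Summation presumes that the measure's values are non-negative and comparable across rules, which may not hold for every interestingness measure.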
