Data-mining in Support of Detecting Class Co-evolution

In an evolving system maintained over a long period, many non-trivial relationships exist among system classes, such as class co-evolution, that are not easily perceivable in the source code. Unfortunately, as large, long-lived systems continue to evolve, information about these hidden relationships is lost. In this paper, we propose a data-mining method for recovering this lost knowledge. The method relies on the UMLDiff algorithm, which, given a sequence of UML class models of a system, surfaces the design-level changes over its life span, thus eliminating the need for high-quality modification reports and nonintuitive code-based metrics. We apply the Apriori association-rule mining algorithm to the transactional database of class modifications to elicit previously unknown or undocumented co-evolution relations among two or more classes. The recovered knowledge facilitates the overall understanding of system evolution and the planning of future maintenance activities. We report on one real-world case study evaluating our approach.
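
The mining step can be illustrated with a minimal sketch of Apriori over a small transaction database. This is not the paper's implementation: it assumes that each transaction is the set of classes modified between two successive versions, and the class names, the thresholds, and the helper names `apriori` and `co_evolution_rules` are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose relative support
    (count / number of transactions) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    threshold = min_support * len(transactions)
    # Level 1 candidates: every class that appears in some transaction.
    candidates = {frozenset([c]) for t in transactions for c in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= threshold}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def co_evolution_rules(frequent, min_confidence):
    """Derive rules lhs -> rhs with confidence = count(lhs | rhs) / count(lhs)."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[lhs] is always present.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical data: classes modified in each of five version transitions.
transactions = [
    {"Parser", "Lexer", "Token"},
    {"Parser", "Lexer"},
    {"Parser", "Lexer", "Ast"},
    {"Ast", "Printer"},
    {"Lexer", "Token"},
]

frequent = apriori(transactions, min_support=0.5)
for lhs, rhs, confidence in co_evolution_rules(frequent, min_confidence=0.8):
    print(sorted(lhs), "->", sorted(rhs), f"confidence={confidence:.2f}")
```

On this toy data, the only rule that passes both thresholds states that whenever Parser is modified, Lexer is modified as well. A rule of this kind is what the method would surface as a candidate co-evolution relation.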
