Evaluating the Impact of Software Evolution on Software Clustering

The evolution of a software project is a rich data source for analyzing and improving the software development process. Recently, several research groups have tried to cluster source code artifacts based on information about how the code of a software system evolves. The results of these evolutionary approaches seem promising, but a direct comparison with traditional software clustering approaches based on structural code dependencies is still missing. To fill this gap, we conducted several clustering experiments with an established software clustering tool, comparing and combining the evolutionary and the structural approach. These experiments show that the evolutionary approach can produce meaningful clustering results. However, the traditional approach still provides better results because the structural data have a more reliable data density. Finally, combining both approaches improves the overall clustering quality.
