Using version information in architectural clustering - a case study

This paper describes a case study that uses clustering to group the classes of an existing object-oriented system of significant size into subsystems. The clustering process is based on the structural relations between the classes: associations, generalizations and dependencies. We experiment with different combinations of relationships and different ways to use this information in the clustering process. The results clearly show that dependency relations are vital for achieving good clusterings. The clustering is performed with a third-party tool called Bunch. Compared to other clustering methods, the results come relatively close to the result of a manual reconstruction. Performance-wise, the clustering takes a significant amount of time, but not so much as to make it impractical. In our case study, we also base the clustering on information from multiple versions and compare the result to that obtained when basing the clustering on a single version. We experiment with several combinations of versions. Basing the clustering on relations that were present in both the reconstructed version and the first version leads to a significantly better clustering result than using only information from the reconstructed version.
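
As an illustration of the multi-version idea, the following is a minimal sketch, not the authors' implementation, of keeping only the relations present in both the first and the reconstructed version before clustering. The `Relation` type, the function names and the example class names are hypothetical, and the "source target" edge-per-line output is only an assumption about the kind of dependency-graph input a tool such as Bunch would consume.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Relation(NamedTuple):
    """A structural relation between two classes (hypothetical model)."""
    source: str
    target: str
    kind: str  # "association", "generalization" or "dependency"


def common_relations(
    first: Iterable[Relation],
    reconstructed: Iterable[Relation],
    kinds: Optional[Set[str]] = None,
) -> Set[Relation]:
    """Keep relations present in both versions, optionally filtered by kind."""
    shared = set(first) & set(reconstructed)
    if kinds is not None:
        shared = {r for r in shared if r.kind in kinds}
    return shared


def to_edge_lines(relations: Iterable[Relation]) -> List[str]:
    """Render relations as sorted, de-duplicated 'source target' lines.

    The line format is an assumption made for this sketch.
    """
    edges = sorted({(r.source, r.target) for r in relations})
    return [f"{s} {t}" for s, t in edges]


# Hypothetical example data for two versions of a system.
first_version = {
    Relation("Order", "Customer", "association"),
    Relation("Invoice", "Order", "dependency"),
    Relation("Report", "Order", "dependency"),
}
reconstructed_version = {
    Relation("Order", "Customer", "association"),
    Relation("Invoice", "Order", "dependency"),
    Relation("Invoice", "Printer", "dependency"),
}

all_shared = to_edge_lines(common_relations(first_version, reconstructed_version))
deps_only = to_edge_lines(
    common_relations(first_version, reconstructed_version, kinds={"dependency"})
)
print("\n".join(all_shared))
```

Relations that appear in only one version (here `Report` to `Order` and `Invoice` to `Printer`) are dropped, so the clustering input contains only the edges that are stable across both versions. The `kinds` filter mirrors the experiments with different combinations of relationship types.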
