On the automatic modularization of software systems using the Bunch tool

Because modern software systems are large and complex, appropriate abstractions of their structure are needed to make them more understandable and thus easier to maintain. Software clustering techniques support the creation of these abstractions by producing architectural-level views of a system's structure directly from its source code. This paper examines the Bunch clustering system, which, unlike other software clustering tools, uses search techniques to perform clustering. Bunch produces a subsystem decomposition by partitioning a graph of the entities (e.g., classes) and relations (e.g., function calls) in the source code. It uses a fitness function to evaluate the quality of graph partitions and search algorithms to find a satisfactory solution. The paper presents a case study that demonstrates how Bunch can be used to create views of the structure of significant software systems, and it outlines research to evaluate the software clustering results produced by Bunch.
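
To make the idea of search-based clustering concrete, the following is a minimal sketch and not Bunch's implementation. The abstract does not specify the fitness function or the search algorithm, so both are assumptions here: the hypothetical `fitness` function rewards edges that stay inside a cluster and penalizes edges that cross cluster boundaries, and the hypothetical `hill_climb` function is a simple local search that moves one entity at a time to whichever cluster improves the score. The entity names and the example graph are invented for illustration.

```python
import random
from collections import defaultdict


def fitness(edges, assignment):
    """Illustrative partition score (an assumption, not taken from the paper).

    Each cluster with at least one internal edge contributes
    2*intra / (2*intra + inter), where `intra` counts edges inside the
    cluster and `inter` counts edges entering or leaving it.
    """
    intra = defaultdict(int)
    inter = defaultdict(int)
    for src, dst in edges:
        a, b = assignment[src], assignment[dst]
        if a == b:
            intra[a] += 1
        else:
            inter[a] += 1
            inter[b] += 1
    score = 0.0
    for cluster in set(assignment.values()):
        if intra[cluster] > 0:
            score += 2 * intra[cluster] / (2 * intra[cluster] + inter[cluster])
    return score


def hill_climb(nodes, edges, seed=0):
    """Local search: move single entities between clusters while the score improves."""
    rng = random.Random(seed)
    # Start with every entity in its own cluster.
    assignment = {node: index for index, node in enumerate(nodes)}
    best = fitness(edges, assignment)
    improved = True
    while improved:
        improved = False
        order = list(nodes)
        rng.shuffle(order)
        for node in order:
            home = assignment[node]  # best cluster found so far for this node
            for target in set(assignment.values()):
                if target == home:
                    continue
                assignment[node] = target  # tentative move
                score = fitness(edges, assignment)
                if score > best + 1e-12:
                    best, home, improved = score, target, True
            assignment[node] = home  # keep the best move, or undo
    return assignment, best


# Hypothetical dependency graph: two tightly coupled groups joined by one call.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"),
         ("c", "d")]

assignment, best = hill_climb(nodes, edges)
print(assignment, best)
```

Because hill climbing can stop at a local optimum, the partition it returns depends on the starting point and the visiting order; running it from several random seeds and keeping the best-scoring result is one common way to reduce that sensitivity.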
