Using automatic clustering to produce high-level system organizations of source code

We describe a collection of algorithms that we developed and implemented to facilitate the automatic recovery of the modular structure of a software system from its source code. We treat automatic modularization as an optimization problem. Our algorithms make use of traditional hill-climbing and genetic algorithms.

[1]  David Notkin,et al.  Software reflexion models: bridging the gap between source and high-level models , 1995, SIGSOFT FSE.

[2]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[3]  Stephen C. North Applications of Graph Visualization , 1999 .

[4]  Victor R. Basili,et al.  System Structure Analysis: Clustering with Data Bindings , 1985, IEEE Transactions on Software Engineering.

[5]  Emden R. Gansner,et al.  A C++ data model supporting reachability analysis and dead code detection , 1997, ESEC '97/FSE-5.

[6]  Hans H. Kron,et al.  Programming-in-the-Large Versus Programming-in-the-Small , 1975, IEEE Transactions on Software Engineering.

[7]  Robert W. Schwanke,et al.  An intelligent tool for re-engineering software modularity , 1991, [1991 Proceedings] 13th International Conference on Software Engineering.

[8]  D. E. Goldberg,et al.  Genetic Algorithms in Search , 1989 .

[9]  Linda M. Wills,et al.  Reverse Engineering , 1996, Springer US.

[10]  T. A. Wiggerts,et al.  Using clustering algorithms in legacy systems remodularization , 1997, Proceedings of the Fourth Working Conference on Reverse Engineering.