Using structural and semantic measures to improve software modularization

Changes during software evolution and poor design decisions often lead to packages that are hard to understand and maintain, because they usually group together classes with unrelated responsibilities. One way to improve such packages is to decompose them into smaller, more cohesive packages. The difficulty lies in the fact that most definitions and interpretations of cohesion are rather vague and the multitude of measures proposed by researchers usually capture only one aspect of cohesion. We propose a new technique for automatic re-modularization of packages, which uses structural and semantic measures to decompose a package into smaller, more cohesive ones. The paper presents the new approach as well as an empirical study, which evaluates the decompositions proposed by the new technique. The results of the evaluation indicate that the decomposed packages have better cohesion without a deterioration of coupling and the re-modularizations proposed by the tool are also meaningful from a functional point of view.

[1]  T. A. Wiggerts,et al.  Using clustering algorithms in legacy systems remodularization , 1997, Proceedings of the Fourth Working Conference on Reverse Engineering.

[2]  Giuseppe Scanniello,et al.  Using the Kleinberg Algorithm and Vector Space Model for Software System Clustering , 2010, 2010 IEEE 18th International Conference on Program Comprehension.

[3]  Paolo Tonella,et al.  Concept Analysis for Module Restructuring , 2001, IEEE Trans. Software Eng..

[4]  Rainer Koschke,et al.  Revisiting the Delta IC approach to component recovery , 2006, Sci. Comput. Program..

[5]  Malcolm Munro,et al.  Moral Dominance Relations for Program Comprehension , 2003, IEEE Trans. Software Eng..

[6]  Richard C. Holt,et al.  Comparison of clustering algorithms in the context of software evolution , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[7]  Onaiza Maqbool,et al.  Hierarchical Clustering for Software Architecture Recovery , 2007, IEEE Transactions on Software Engineering.

[8]  Xin Yao,et al.  Software Module Clustering as a Multi-Objective Search Problem , 2011, IEEE Transactions on Software Engineering.

[9]  Arie van Deursen,et al.  Identifying objects using cluster and concept analysis , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[10]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[11]  Tibor Gyimóthy,et al.  Using information retrieval based coupling measures for impact analysis , 2009, Empirical Software Engineering.

[12]  Meir M. Lehman,et al.  On understanding laws, evolution, and conservation in the large-program life cycle , 1984, J. Syst. Softw..

[13]  Giuseppe Visaggio,et al.  Software salvaging and the call dominance tree , 1995, J. Syst. Softw..

[14]  Mark Harman,et al.  A New Representation And Crossover Operator For Search-based Optimization Of Software Modularization , 2002, GECCO.

[15]  Stéphane Ducasse,et al.  Semantic clustering: Identifying topics in source code , 2007, Inf. Softw. Technol..

[16]  Stéphane Ducasse,et al.  Package Surface Blueprints: Visually Supporting the Understanding of Package Relationships , 2007, 2007 IEEE International Conference on Software Maintenance.

[17]  Peter M. Chisnall,et al.  Questionnaire Design, Interviewing and Attitude Measurement , 1993 .

[18]  Rainer Koschke,et al.  Revisiting the Delta IC approach to component recovery , 2000, Proceedings Seventh Working Conference on Reverse Engineering.

[19]  Mark Harman,et al.  An empirical study of the robustness of two module clustering fitness functions , 2005, GECCO '05.

[20]  Johannes Stammel,et al.  Search-based determination of refactorings for improving the class structure of object-oriented systems , 2006, GECCO.

[21]  Andrea De Lucia,et al.  Improving IR-based Traceability Recovery Using Smoothing Filters , 2011, 2011 IEEE 19th International Conference on Program Comprehension.

[22]  Dalton Serey Guerrero,et al.  Comparison of Graph Clustering Algorithms for Recovering Software Architecture Module Views , 2009, 2009 13th European Conference on Software Maintenance and Reengineering.

[23]  Gabriele Bavota,et al.  A two-step technique for extract class refactoring , 2010, ASE.

[24]  John A. Hartigan,et al.  Clustering Algorithms , 1975 .

[25]  Rudolf Ferenc,et al.  Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems , 2008, IEEE Transactions on Software Engineering.

[26]  Matthias Biehl,et al.  Search-based improvement of subsystem decompositions , 2005, GECCO '05.

[27]  Giuseppe Scanniello,et al.  Investigating the use of lexical information for software system clustering , 2011, 2011 15th European Conference on Software Maintenance and Reengineering.

[28]  Gabriele Bavota,et al.  Identifying Extract Class refactoring opportunities using structural and semantic cohesion measures , 2011, J. Syst. Softw..

[29]  Aniello Cimitile,et al.  Decomposing legacy systems into objects: an eclectic approach , 2001, Inf. Softw. Technol..

[30]  Glenford J. Myers,et al.  Structured Design , 1974, IBM Syst. J..

[31]  Paolo Tonella,et al.  Improving Web site understanding with keyword-based clustering , 2008 .

[32]  Andrian Marcus,et al.  Supporting program comprehension using semantic and structural information , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[33]  R. Yin Case Study Research: Design and Methods , 1984 .

[34]  Spiros Mancoridis,et al.  On the automatic modularization of software systems using the Bunch tool , 2006, IEEE Transactions on Software Engineering.

[35]  Giuliano Antoniol,et al.  A method to re-organize legacy systems via concept analysis , 2001, Proceedings 9th International Workshop on Program Comprehension. IWPC 2001.

[36]  Mark Shtern,et al.  Methods for selecting and improving software clustering algorithms , 2009, 2009 IEEE 17th International Conference on Program Comprehension.

[37]  Mark Kent O'Keeffe,et al.  Search-based software maintenance , 2006, Conference on Software Maintenance and Reengineering (CSMR'06).

[38]  Spiros Mancoridis,et al.  Comparing the decompositions produced by software clustering algorithms using similarity measurements , 2001, Proceedings IEEE International Conference on Software Maintenance. ICSM 2001.

[39]  Gabriele Bavota,et al.  Software Re-Modularization Based on Structural and Semantic Metrics , 2010, 2010 17th Working Conference on Reverse Engineering.

[40]  Nicolas Anquetil,et al.  Experiments with clustering as a software remodularization method , 1999, Sixth Working Conference on Reverse Engineering (Cat. No.PR00303).

[41]  Emden R. Gansner,et al.  Using automatic clustering to produce high-level system organizations of source code , 1998, Proceedings. 6th International Workshop on Program Comprehension. IWPC'98 (Cat. No.98TB100242).

[42]  Giuseppe Scanniello,et al.  A Probabilistic Based Approach towards Software System Clustering , 2010, 2010 14th European Conference on Software Maintenance and Reengineering.

[43]  Andrea De Lucia,et al.  Using structural and semantic metrics to improve class cohesion , 2008, 2008 IEEE International Conference on Software Maintenance.

[44]  Gabriele Bavota,et al.  Playing with refactoring: Identifying extract class opportunities through game theory , 2010, 2010 IEEE International Conference on Software Maintenance.

[45]  Houari A. Sahraoui,et al.  Automatic Package Coupling and Cycle Minimization , 2009, 2009 16th Working Conference on Reverse Engineering.