Automating extract class refactoring: an improved method and its evaluation

During software evolution, the internal structure of a system undergoes continuous modification. These changes push the source code away from its original design and often reduce its quality, including class cohesion. In this paper we propose a method for automating the Extract Class refactoring. The approach analyzes structural and semantic relationships between the methods of a class to identify chains of strongly related methods. These method chains are then used to define new classes with higher cohesion than the original class, while preserving the overall coupling between the new classes and the classes that interact with the original class. The approach was first assessed in an artificial scenario to calibrate its parameters; the same data was used to compare it with previous work. It was then empirically evaluated on real Blobs from existing open source systems to assess how good and useful software engineers consider the proposed refactoring solutions, and how closely the proposed refactorings approximate those performed by the original developers. We found that the new approach outperforms a previously proposed approach and that developers find the proposed solutions useful in guiding refactorings.
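The idea of grouping strongly related methods into chains can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measures, their weighting, or the threshold, so the Jaccard overlaps, the `weight` and `threshold` parameters, the function names, and the sample data below are all assumptions made for illustration. Structural similarity is approximated by shared attributes, semantic similarity by shared identifier terms, and chains are taken to be the connected components that remain after weak links are dropped.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two sets, in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def method_chains(structural, semantic, weight=0.5, threshold=0.3):
    """Group methods into chains of strongly related methods.

    structural: method name -> set of attributes the method uses (assumed proxy)
    semantic:   method name -> set of terms in the method's text (assumed proxy)
    weight, threshold: hypothetical tuning parameters, not taken from the paper
    """
    methods = sorted(structural)
    links = {m: set() for m in methods}
    for m1, m2 in combinations(methods, 2):
        similarity = (weight * jaccard(structural[m1], structural[m2])
                      + (1 - weight) * jaccard(semantic[m1], semantic[m2]))
        if similarity >= threshold:  # keep only strong relationships
            links[m1].add(m2)
            links[m2].add(m1)

    # Each connected component of the filtered graph is one chain.
    chains, visited = [], set()
    for start in methods:
        if start in visited:
            continue
        chain, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in chain:
                continue
            chain.add(current)
            stack.extend(links[current] - chain)
        visited |= chain
        chains.append(sorted(chain))
    return chains


# Hypothetical class that mixes file handling with drawing.
structural = {
    "open_file": {"handle", "path"},
    "read_file": {"handle"},
    "draw": {"canvas"},
    "resize": {"canvas", "width"},
}
semantic = {
    "open_file": {"file", "open"},
    "read_file": {"file", "read"},
    "draw": {"canvas", "draw"},
    "resize": {"canvas", "resize"},
}
print(method_chains(structural, semantic))
```

In this toy input the two file methods and the two drawing methods each form a chain, and each chain would be a candidate for a separate extracted class. A real tool would also have to decide where the attributes go and check the effect on coupling, which this sketch leaves out.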
