Automated Refactoring: Can They Pass The Turing Test?

Refactoring is a maintenance activity that aims to improve design quality while preserving the behavior of a system. Several (semi)automated approaches have been proposed to support developers in this maintenance activity, based on the correction of anti-patterns, which are "poor solutions" to recurring design problems. However, little quantitative evidence exists about the impact of automatically refactored code on program comprehension, and in which context automated refactoring can be as effective as manual refactoring. We performed an empirical study to investigate whether the use of automated refactoring approaches affects the understandability of systems during comprehension tasks. (1) We surveyed 80 developers, asking them to identify from a set of 20 refactoring changes if they were generated by developers or by machine, and to rate the refactorings according to their design quality; (2) we asked 30 developers to complete code comprehension tasks on 10 systems that were refactored by either a freelancer or an automated refactoring tool. We measured developers' performance using the NASA task load index for their effort; the time that they spent performing the tasks; and their percentages of correct answers. Results show that for 3 out the 5 types of studied anti-patterns, developers cannot recognize the origin of the refactoring (i.e., whether it was performed by a human or an automatic tool). We also observe that developers do not prefer human refactorings over automated refactorings, except when refactoring Blob classes; and that there is no statistically significant difference between the impact on code understandability of human refactorings and automated refactorings. We conclude that automated refactorings can be as effective as manual refactorings. However, for complex anti-patterns types like the Blob, the perceived quality of human refactorings is slightly higher.

[1]  David Lo,et al.  Deep Code Comment Generation , 2018, 2018 IEEE/ACM 26th International Conference on Program Comprehension (ICPC).

[2]  Claire Le Goues,et al.  A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[3]  Takeo Imai,et al.  A quantitative evaluation of maintainability enhancement by refactoring , 2002, International Conference on Software Maintenance, 2002. Proceedings..

[4]  Katsuro Inoue,et al.  Improving multi-objective code-smells correction using development history , 2015, J. Syst. Softw..

[5]  Tom Mens,et al.  Does God Class Decomposition Affect Comprehensibility? , 2006, IASTED Conf. on Software Engineering.

[6]  William F. Opdyke,et al.  Refactoring object-oriented frameworks , 1992 .

[7]  Witold Pedrycz,et al.  A Case Study on the Impact of Refactoring on Quality and Productivity in an Agile Team , 2008, CEE-SET.

[8]  Yann-Gaël Guéhéneuc,et al.  DECOR: A Method for the Specification and Detection of Code and Design Smells , 2010, IEEE Transactions on Software Engineering.

[9]  William G. Griswold,et al.  Automated assistance for program restructuring , 1993, TSEM.

[10]  Gabriele Bavota,et al.  Do They Really Smell Bad? A Study on Developers' Perception of Bad Code Smells , 2014, 2014 IEEE International Conference on Software Maintenance and Evolution.

[11]  GORDON FRASER,et al.  A Large-Scale Evaluation of Automated Unit Test Generation Using EvoSuite , 2014, ACM Trans. Softw. Eng. Methodol..

[12]  Foutse Khomh,et al.  Efficient refactoring scheduling based on partial order reduction , 2018, J. Syst. Softw..

[13]  Stas Negara,et al.  Use, disuse, and misuse of automated refactorings , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[14]  Tom Mens,et al.  A survey of software refactoring , 2004, IEEE Transactions on Software Engineering.

[15]  Tibor Gyimóthy,et al.  Do automatic refactorings improve maintainability? An industrial case study , 2015, 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[16]  Foutse Khomh,et al.  An Empirical Study of the Impact of Two Antipatterns, Blob and Spaghetti Code, on Program Comprehension , 2011, 2011 15th European Conference on Software Maintenance and Reengineering.

[17]  E. Murphy-Hill,et al.  Refactoring Tools: Fitness for Purpose , 2006, IEEE Software.

[18]  David Lorge Parnas,et al.  Software aging , 1994, Proceedings of 16th International Conference on Software Engineering.

[19]  David Sharek,et al.  A Useable, Online NASA-TLX Tool , 2011 .

[20]  Iman Hemati Moghadam,et al.  Automated Refactoring Using Design Differencing , 2012, 2012 16th European Conference on Software Maintenance and Reengineering.

[21]  Stas Negara,et al.  A Comparative Study of Manual and Automated Refactorings , 2013, ECOOP.

[22]  Ioannis Stamelos,et al.  A controlled experiment investigation of an object-oriented design heuristic for maintainability , 2004, J. Syst. Softw..

[23]  Foutse Khomh,et al.  On the use of developers' context for automatic refactoring of software anti-patterns , 2017, J. Syst. Softw..

[24]  Ahmed E. Hassan,et al.  Predicting faults using the complexity of code changes , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[25]  Maliha S. Nash,et al.  Handbook of Parametric and Nonparametric Statistical Procedures , 2001, Technometrics.

[26]  N. Cliff Ordinal methods for behavioral data analysis , 1996 .

[27]  Gail C. Murphy,et al.  Asking and Answering Questions during a Programming Change Task , 2008, IEEE Transactions on Software Engineering.

[28]  Johannes Stammel,et al.  Search-based determination of refactorings for improving the class structure of object-oriented systems , 2006, GECCO.

[29]  Shinji Kusumoto,et al.  Toward Refactoring Evaluation with Code Naturalness , 2018, 2018 IEEE/ACM 26th International Conference on Program Comprehension (ICPC).

[30]  A. M. Turing,et al.  Computing Machinery and Intelligence , 1950, The Philosophy of Artificial Intelligence.

[31]  Mauricio A. Saca Refactoring improving the design of existing code , 2017, 2017 IEEE 37th Central America and Panama Convention (CONCAPAN XXXVII).

[32]  Foutse Khomh,et al.  Finding the Best Compromise Between Design Quality and Testing Effort During Refactoring , 2016, 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[33]  Radu Marinescu,et al.  Detection strategies: metrics-based rules for detecting design flaws , 2004, 20th IEEE International Conference on Software Maintenance, 2004. Proceedings..

[34]  Miryung Kim,et al.  An Empirical Study of RefactoringChallenges and Benefits at Microsoft , 2014, IEEE Transactions on Software Engineering.

[35]  Zhenchang Xing,et al.  Refactoring Practice: How it is and How it Should be Supported - An Eclipse Case Study , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[36]  Andrew P. Black,et al.  How we refactor, and how we know it , 2009, 2009 IEEE 31st International Conference on Software Engineering.