Conflict-aware optimal scheduling of prioritised code clone refactoring

Duplicated or similar pieces of source code, known as code clones, are potentially harmful 'code smells' that may need to be removed through refactoring to enhance maintainability. Among the many potential refactoring opportunities, the choice and order of a set of refactoring activities may have a distinguishable effect on design and code quality, as measured by software metrics. Moreover, there may be dependencies and conflicts among refactorings of different priorities. Manually formulating an optimal refactoring schedule that addresses all of these conflicts, priorities and dependencies is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary to 'maximise benefit and minimise refactoring effort'. However, estimating the effort required to perform code clone refactoring is a challenging task. This study makes two contributions. First, the authors propose an effort model for estimating code clone refactoring effort. Second, the authors propose a constraint programming (CP) approach for conflict-aware optimal scheduling of code clone refactoring. A qualitative evaluation of the effort model from the developers' perspective suggests that the model is complete and useful for code clone refactoring effort estimation. The authors also quantitatively compared their refactoring scheduler with other well-known scheduling techniques such as the genetic algorithm, greedy approaches and linear programming. The authors' empirical study suggests that the proposed CP-based approach outperforms the other approaches they considered.
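To make the scheduling problem concrete, the following is a minimal sketch of the kind of decision the abstract describes: choosing a set of refactorings that respects conflicts and dependencies while trading benefit against effort. It is not the authors' CP formulation or effort model. The refactoring names, the numbers, the priority weights and the objective (priority-weighted benefit minus effort) are all hypothetical, the search is plain exhaustive enumeration rather than constraint programming, and ordering within the selected set is left out for brevity.

```python
from itertools import combinations

# Hypothetical toy data: name -> (benefit, effort, priority weight).
REFACTORINGS = {
    "r1": (8, 3, 1.0),
    "r2": (6, 2, 0.5),
    "r3": (5, 4, 1.0),
    "r4": (4, 1, 0.8),
}
# Hypothetical constraints between refactorings.
CONFLICTS = {frozenset({"r1", "r2"})}  # r1 and r2 cannot both be applied
DEPENDS_ON = {"r3": {"r1"}}            # r3 is only possible if r1 is applied


def is_feasible(selection):
    """Return True if the selection has no conflict and no unmet dependency."""
    chosen = set(selection)
    if any(pair <= chosen for pair in CONFLICTS):
        return False
    return all(DEPENDS_ON.get(r, set()) <= chosen for r in chosen)


def score(selection):
    """Assumed objective: priority-weighted benefit minus effort, summed."""
    return sum(p * b - e for b, e, p in (REFACTORINGS[r] for r in selection))


def best_selection():
    """Enumerate every subset and keep the best feasible one (exponential)."""
    names = sorted(REFACTORINGS)
    candidates = (
        combo
        for k in range(len(names) + 1)
        for combo in combinations(names, k)
        if is_feasible(combo)
    )
    return max(candidates, key=score)


if __name__ == "__main__":
    chosen = best_selection()
    print(chosen, round(score(chosen), 2))
```

Exhaustive enumeration is only workable for a handful of candidates, since the number of subsets doubles with each added refactoring. That growth is one reason a dedicated technique such as CP, a genetic algorithm or linear programming is needed at realistic scale.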
