Identifying Code Clones Having High Possibilities of Containing Bugs

Code cloning is a controversial topic in software engineering research and practice because it has both positive and negative impacts on software evolution and maintenance. Researchers suggest managing code clones through refactoring and tracking. Because a software system's code base can contain a huge number of code clones, it is essential to identify the most important ones to manage. In our research, we investigate which clone fragments are most likely to contain bugs, so that such clones can be prioritized for refactoring and tracking to help minimize future bug-fixing tasks. Existing studies on clone bug-proneness cannot pinpoint code clones that are likely to experience bug-fixes in the future. According to our analysis of thousands of revisions of four diverse subject systems written in Java, the change frequency of code clones does not indicate their bug-proneness (i.e., their tendency to experience bug-fixes in the future). Bug-proneness is mainly related to the change recency of code clones: more recently changed code clones are more likely to contain bugs. Moreover, among the code clones that were never changed, those created more recently are more likely to experience bug-fixes. Thus, our research reveals that the bug-proneness of code clones mainly depends on how recently they were changed or, for those never changed, how recently they were created. This contradicts the common intuition that high change frequency is related to bug-proneness. We believe that code clones should be prioritized for management according to their change recency or, for unchanged ones, their recency of creation.
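
The prioritization idea described above can be illustrated with a minimal sketch. It is not the tooling used in the study; the `CloneFragment` record, its field names, and the `rank_by_recency` helper are all hypothetical, and revision numbers stand in for whatever notion of time a real clone tracker provides. Changed and unchanged clones are ranked in two separate lists, since nothing here assumes how the two groups compare with each other.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CloneFragment:
    """Hypothetical record for one clone fragment (all fields are assumptions)."""
    clone_id: str
    created_rev: int                 # revision in which the fragment first appeared
    last_changed_rev: Optional[int]  # latest revision that modified it, or None
    change_count: int = 0            # change frequency; deliberately not used for ranking


def rank_by_recency(
    clones: List[CloneFragment],
) -> Tuple[List[CloneFragment], List[CloneFragment]]:
    """Return (changed, unchanged) clones, each ordered most recent first.

    Changed clones are ordered by the revision of their last change.
    Never-changed clones are ordered by the revision of their creation.
    """
    changed = [c for c in clones if c.last_changed_rev is not None]
    unchanged = [c for c in clones if c.last_changed_rev is None]
    changed.sort(key=lambda c: c.last_changed_rev, reverse=True)
    unchanged.sort(key=lambda c: c.created_rev, reverse=True)
    return changed, unchanged


# Made-up example data, for illustration only.
example = [
    CloneFragment("A", created_rev=10, last_changed_rev=120, change_count=9),
    CloneFragment("B", created_rev=40, last_changed_rev=480, change_count=1),
    CloneFragment("C", created_rev=300, last_changed_rev=None),
    CloneFragment("D", created_rev=450, last_changed_rev=None),
]

if __name__ == "__main__":
    changed, unchanged = rank_by_recency(example)
    print([c.clone_id for c in changed])    # ['B', 'A']
    print([c.clone_id for c in unchanged])  # ['D', 'C']
```

In this made-up data, fragment A has been changed nine times and B only once, yet B is ranked first because its change is more recent, which mirrors the recency-over-frequency criterion stated above.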
