Revealing Missing Bug-Fixes in Code Clones in Large-Scale Code Bases

When a bug is fixed in duplicated code, it is often necessary to modify all duplicates (so-called clones) accordingly. In practice, however, fixes are often incomplete, so the bug remains in one or more of the clones. This paper presents an approach that detects such incomplete bug-fixes in cloned code. It analyzes a system's version history to identify the commits that fix problems, then performs incremental clone detection to reveal the clones that became inconsistent as a result of such a fix. We present results from a case study that analyzed incomplete bug-fixes in six industrial and open-source systems to demonstrate the feasibility and effectiveness of our approach. We identified likely incomplete bug-fixes in all analyzed systems.
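
The two steps described above (select fix commits from the version history, then look for clones that the fix left inconsistent) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the approach's actual implementation: the keyword pattern FIX_PATTERN, the Commit record, the fixed-size line window used as a stand-in for a real clone detector, and the restriction to clones that span different files. It also re-indexes both snapshots in full, whereas an incremental detector would update its index only for the changed files.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword heuristic for spotting fix commits (an assumption;
# the abstract does not say how fix commits are recognized).
FIX_PATTERN = re.compile(r"\b(fix(e[sd])?|bug|defect)\b", re.IGNORECASE)


@dataclass
class Commit:
    """Hypothetical commit record: message plus file snapshots."""
    message: str
    before: dict  # path -> file content prior to the commit
    after: dict   # path -> file content after the commit


def is_bug_fix(commit):
    """Return True if the commit message looks like a bug fix."""
    return bool(FIX_PATTERN.search(commit.message))


def fragment_index(files, window):
    """Map each window of normalized, non-blank lines to the files containing it."""
    index = {}
    for path, text in files.items():
        lines = [" ".join(l.split()) for l in text.splitlines() if l.strip()]
        for i in range(len(lines) - window + 1):
            index.setdefault(tuple(lines[i:i + window]), set()).add(path)
    return index


def incomplete_fix_candidates(commit, window=3):
    """Return files that still hold a fragment the fix changed in a sibling clone."""
    if not is_bug_fix(commit):
        return set()
    before = fragment_index(commit.before, window)
    after = fragment_index(commit.after, window)
    suspects = set()
    for fragment, paths_before in before.items():
        if len(paths_before) < 2:
            continue  # fragment was not cloned across files
        paths_after = after.get(fragment, set())
        changed = paths_before - paths_after  # files where the fragment was modified
        if changed and paths_after:
            # The fragment was changed in some files but survives in others.
            suspects |= paths_after & paths_before
    return suspects


old = "x = get()\nif x > 0:\n    use(x)\n"
new = "x = get()\nif x is not None and x > 0:\n    use(x)\n"
commit = Commit("Fix crash on None", {"a.py": old, "b.py": old},
                {"a.py": new, "b.py": old})
print(incomplete_fix_candidates(commit))
```

In this toy history the fix touches a.py only, so b.py is reported as a candidate for an incomplete bug-fix. A candidate is only a hint: the unchanged clone may be intentionally different, so each result still needs manual inspection.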
