An empirical study on how project context impacts on code cloning

Code cloning can seriously affect software quality. Code clones are fragments of code that are syntactically or semantically equivalent. Some authors argue that code clones harm maintainability and understandability, since clones propagate defects and force developers to keep track of several copies. Other authors, however, believe clones are not necessarily bad, since self‐admitted clones favor system stability and allow developers to move projects forward. Although some root causes and effects of cloning have been widely studied, little work has analyzed how project context factors affect code cloning. This work presents an empirical study of six open source projects, considering factors extracted from their Git repositories and measured across a total of 70 releases of the six systems. The factors analyzed were the number of commits and committers per release, the average size of the commits, and the size of the system in each release. The main conclusion of the study is that, while the number of commits, the number of committers, and the system size do not significantly affect cloning, larger commits lead to a higher cloning ratio. These insights contribute to predicting and preventing code cloning, and thus to improving software quality.
