An Empirical Study on the Characteristics of Python Fine-Grained Source Code Change Types

Software changes throughout its life cycle, so identifying source code changes is a key issue in software evolution analysis. However, little existing change-analysis research focuses on software written in dynamic languages. In this paper, we study the fine-grained source code changes of Python software. We implement an automatic tool named PyCT that extracts 77 kinds of fine-grained source code change types from commit history. We conduct an empirical study on ten popular Python projects from five domains, comprising 132,294 commits, to investigate the characteristics of source code changes in dynamic-language software. Analyzing the source code changes from four aspects, we distill 11 findings, which we summarize into two insights on software evolution: change prediction and fixing faulty code. In addition, we provide direct evidence on how developers use and change dynamic features. Our results offer guidance and insights for understanding the source code evolution of dynamic-language software.
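
To make the notion of a fine-grained change more concrete, the following is a minimal sketch, not PyCT itself. It assumes that comparing how often each AST node type occurs in two versions of a file is an acceptable stand-in for real tree differencing, and the helper names (`node_type_counts`, `coarse_change_types`) are hypothetical. It uses only Python's standard `ast` module.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count AST node types (e.g. 'If', 'Return') in a Python source string."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def coarse_change_types(old_source: str, new_source: str) -> dict:
    """Return {node type: count delta} between two versions of a file.

    Assumption: this is a counting approximation for illustration only.
    It cannot see moved or edited nodes, which a tree-differencing tool
    would classify as distinct change types.
    """
    old_counts = node_type_counts(old_source)
    new_counts = node_type_counts(new_source)
    changes = {}
    for name in sorted(set(old_counts) | set(new_counts)):
        delta = new_counts[name] - old_counts[name]
        if delta:
            changes[name] = delta
    return changes


# Two hypothetical versions of the same function, as they might appear
# before and after a commit.
OLD = "def f(x):\n    return x\n"
NEW = "def f(x):\n    if x is None:\n        return 0\n    return x\n"

changes = coarse_change_types(OLD, NEW)
for name, delta in changes.items():
    print(f"{name}: {delta:+d}")
```

A positive delta for `If` suggests an inserted conditional, and a node type that is absent from the result had the same count in both versions. Mapping such differences onto a fixed taxonomy of change types would additionally require matching nodes between the two trees.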
