Toward an understanding of bug fix patterns

Twenty-seven automatically extractable bug fix patterns are defined using the syntax components and context of the source code involved in bug fix changes. Bug fix patterns are extracted from the configuration management repositories of seven open source projects, all written in Java (Eclipse, Columba, JEdit, Scarab, ArgoUML, Lucene, and MegaMek). Defined bug fix patterns cover 45.7% to 63.3% of the total bug fix hunk pairs in these projects. The frequency of occurrence of each bug fix pattern is computed across all projects. The most common individual patterns are MC-DAP (method call with different actual parameter values) at 14.9–25.5%, IF-CC (change in if conditional) at 5.6–18.6%, and AS-CE (change of assignment expression) at 6.0–14.2%. A correlation analysis on the extracted pattern instances on the seven projects shows that six have very similar bug fix pattern frequencies. Analysis of if conditional bug fix sub-patterns shows a trend towards increasing conditional complexity in if conditional fixes. Analysis of five developers in the Eclipse projects shows overall consistency with project-level bug fix pattern frequencies, as well as distinct variations among developers in their rates of producing various bug patterns. Overall, data in the paper suggest that developers have difficulty with specific code situations at surprisingly consistent rates. There appear to be broad mechanisms causing the injection of bugs that are largely independent of the type of software being produced.

[1]  Reengineering Forum Proceedings of the 21st IEEE International Conference on Software Maintenance : ICSM 2005 , 2005 .

[2]  D. Potier,et al.  Experiments with computer software complexity and reliability , 1982, ICSE '82.

[3]  Chadd C. Williams,et al.  Automatic mining of source code repositories to improve bug finding techniques , 2005, IEEE Transactions on Software Engineering.

[4]  Richard C. Holt,et al.  The top ten list: dynamic fault prediction , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[5]  Henrique Madeira,et al.  Emulation of Software Faults: A Field Data Study and a Practical Approach , 2006, IEEE Transactions on Software Engineering.

[6]  Greg Nelson,et al.  Extended static checking for Java , 2002, PLDI '02.

[7]  Victor R. Basili,et al.  Software errors and complexity: an empirical investigation , 1993 .

[8]  Andreas Zeller,et al.  When do changes induce fixes? , 2005, ACM SIGSOFT Softw. Eng. Notes.

[9]  E. James Whitehead,et al.  Identification of software instabilities , 2003, 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings..

[10]  Inderpal S. Bhandari,et al.  Orthogonal Defect Classification - A Concept for In-Process Measurements , 1992, IEEE Trans. Software Eng..

[11]  E. James Whitehead,et al.  Using evolution patterns to find duplicated bugs , 2006 .

[12]  Audris Mockus,et al.  Identifying reasons for software changes using historic databases , 2000, Proceedings 2000 International Conference on Software Maintenance.

[13]  Benjamin Livshits,et al.  DynaMine: finding common error patterns by mining software revision histories , 2005, ESEC/FSE-13.

[14]  Harald C. Gall,et al.  Populating a Release History Database from version control and bug tracking systems , 2003, International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings..

[15]  David A. Gustafson,et al.  Shotgun correlations in software measures , 1993, Softw. Eng. J..

[16]  Harvey P. Siy,et al.  Predicting Fault Incidence Using Software Change History , 2000, IEEE Trans. Software Eng..

[17]  Gail C. Murphy,et al.  Hipikat: recommending pertinent software development artifacts , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[18]  Walter F. Tichy,et al.  Proceedings 25th International Conference on Software Engineering , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[19]  Dewayne E. Perry,et al.  Software Faults in Evolving a Large, Real-Time System: a Case Study , 1993, ESEC.

[20]  Albert Endres An analysis of errors and their causes in system programs , 1975 .

[21]  Elaine J. Weyuker,et al.  Collecting and categorizing software error data in an industrial environment , 2018, J. Syst. Softw..

[22]  Raymond T. Yeh,et al.  Proceedings of the international conference on Reliable software , 1975 .

[23]  Victor R. Basili,et al.  Software errors and complexity: an empirical investigation0 , 1984, CACM.

[24]  Albert Endres,et al.  An analysis of errors and their causes in system programs , 1975, IEEE Transactions on Software Engineering.

[25]  Dewayne E. Perry,et al.  A case study in root cause defect analysis , 2000, Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium.

[26]  Thomas Zimmermann,et al.  Automatic Identification of Bug-Introducing Changes , 2006, 21st IEEE/ACM International Conference on Automated Software Engineering (ASE'06).

[27]  Yuanyuan Zhou,et al.  Have things changed now?: an empirical study of bug characteristics in modern open source software , 2006, ASID '06.