Filling the Gaps of Development Logs and Bug Issue Data

It has been suggested that the data in bug repositories is not always complete or in sync with the logs that record developers' actions on the source code. In this paper, we trace two sources of information about software bugs: the change logs of developers' actions and the issues reported as bugs. The aim is to identify and quantify the discrepancies between the two sources in how they record and store bug-related developer activity. Focussing on the databases produced by two mining software repository tools, CVSAnalY and Bicho, we use part of the SZZ algorithm to identify bugs and to compare how the "defects-fixing changes" are recorded in the two databases. A working example illustrates the procedure. The results indicate that a significant amount of information is not in sync when tracing bugs in the two databases. We therefore propose an automatic approach to re-align the two databases, so that the collected information is mirrored and in sync.
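
The comparison described above can be illustrated with a minimal sketch. It assumes that the first step of SZZ is approximated by scanning commit messages for bug identifiers, and that both data sources have already been loaded into memory. The regular expression, the function names, and the dictionary and set inputs are all hypothetical; they do not reflect the actual CVSAnalY or Bicho schemas.

```python
import re

# Hypothetical pattern (an assumption, not the paper's): matches references
# such as "bug 101", "fixes #205" or "issue 300" in a commit message.
BUG_REF = re.compile(r"\b(?:bug|issue|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)


def bug_ids_in_message(message):
    """Return the set of bug ids referenced in one commit message."""
    return {int(num) for num in BUG_REF.findall(message)}


def compare_sources(commits, fixed_issues):
    """Compare bug ids seen in commit logs with ids stored in the issue tracker.

    commits      -- dict mapping a revision id to its commit message
                    (stand-in for the change-log database)
    fixed_issues -- set of issue ids marked as fixed
                    (stand-in for the issue-tracker database)
    """
    linked = {}
    for rev, message in commits.items():
        for bug_id in bug_ids_in_message(message):
            linked.setdefault(bug_id, []).append(rev)

    in_both = set(linked) & fixed_issues          # recorded in both sources
    only_in_commits = set(linked) - fixed_issues  # fix logged, issue missing
    only_in_issues = fixed_issues - set(linked)   # issue fixed, no linked commit
    return in_both, only_in_commits, only_in_issues
```

The two "only in" sets correspond to the kind of discrepancy the paper sets out to quantify; a real analysis would query the two databases rather than in-memory collections.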
