Detect Related Bugs from Source Code Using Bug Information

Open source projects often maintain open bug repositories during development and maintenance, and the reporters often point out straightly or implicitly the reasons why bugs occur when they submit them. The comments about a bug are very valuable for developers to locate and fix the bug. Meanwhile, it is very common in large software for programmers to override or overload some methods according to the same logic. If one method causes a bug, it is obvious that other overridden or overloaded methods maybe cause related or similar bugs. In this paper, we propose and implement a tool Rebug-Detector, which detects related bugs using bug information and code features. Firstly, it extracts bug features from bug information in bug repositories; secondly, it locates bug methods from source code, and then extracts code features of bug methods; thirdly, it calculates similarities between each overridden or overloaded method and bug methods; lastly, it determines which method maybe causes potential related or similar bugs. We evaluate Rebug-Detector on an open source project: Apache Lucene-Java. Our tool totally detects 61 related bugs, including 21 real bugs and 10 suspected bugs, and it costs us about 15.5 minutes. The results show that bug features and code features extracted by our tool are useful to find real bugs in existing projects.

[1]  Hinrich Schütze,et al.  Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.

[2]  Zhenmin Li,et al.  PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code , 2005, ESEC/FSE-13.

[3]  Yuanyuan Zhou,et al.  /*icomment: bugs or bad comments?*/ , 2007, SOSP.

[4]  Stefania Gnesi,et al.  Applications of linguistic techniques for use case analysis , 2003, Requirements Engineering.

[5]  Gail C. Murphy,et al.  Design pattern rationale graphs: linking design to source , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[6]  Tao Xie,et al.  Inferring Resource Specifications from Natural Language API Documentation , 2009, 2009 IEEE/ACM International Conference on Automated Software Engineering.

[7]  Enno Ohlebusch,et al.  Replacing suffix trees with enhanced suffix arrays , 2004, J. Discrete Algorithms.

[8]  Yuanyuan Zhou,et al.  CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code , 2004, OSDI.

[9]  Thorsten Brants,et al.  TnT – A Statistical Part-of-Speech Tagger , 2000, ANLP.

[10]  Emily Hill,et al.  Using natural language program analysis to locate and understand action-oriented concerns , 2007, AOSD.

[11]  Chao Liu,et al.  Journal of Software Maintenance and Evolution: Research and Practice Mining Temporal Rules for Software Maintenance , 2022 .

[12]  Leonid Kof,et al.  Scenarios: Identifying Missing Objects and Actions by Means of Computational Linguistics , 2007, 15th IEEE International Requirements Engineering Conference (RE 2007).

[13]  Hiroki Arimura,et al.  Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications , 2001, CPM.

[14]  Chao Liu,et al.  Data Mining for Software Engineering , 2009, Computer.

[15]  Ruzanna Chitchyan,et al.  EA-Miner: a tool for automating aspect-oriented requirements identification , 2005, ASE.

[16]  Tao Xie,et al.  An approach to detecting duplicate bug reports using natural language and execution information , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[17]  Chadd C. Williams,et al.  Automatic mining of source code repositories to improve bug finding techniques , 2005, IEEE Transactions on Software Engineering.

[18]  Bruce Eckel Thinking in Java , 1998 .

[19]  David Hovemeyer,et al.  Finding bugs is easy , 2004, SIGP.

[20]  Emily Hill,et al.  Analysing source code: looking for useful verbdirect object pairs in all the right places , 2008, IET Softw..

[21]  Sunghun Kim,et al.  Memories of bug fixes , 2006, SIGSOFT '06/FSE-14.