On the Rise and Fall of Simple Stupid Bugs: a Life-Cycle Analysis of SStuBs

Bug detection and prevention is one of the most important goals of software quality assurance. Nowadays, many of the major problems faced by developers can be detected, or even fixed fully or partially, by automatic tools. However, recent work has shown that code-bases contain a substantial number of simple yet very annoying errors, which are easy to fix but hard to detect, as they do not hinder the functionality of the given product in a major way. Programmers introduce such errors accidentally, mostly due to inattention.

Using the ManySStuBs4J dataset, which contains many simple, stupid bugs (SStuBs) found in GitHub repositories written in the Java programming language, we investigated the history of such bugs. We were interested in questions such as: How long do such bugs stay unnoticed in code-bases? Are they typically fixed by the same developer who introduced them? Are they introduced with the addition of new code, or rather caused by careless modification of existing code? We found that most of these stupid bugs lurk in the code for a long time before they get removed. We noticed that the developer who made the mistake tends to find a solution faster; however, less than half of SStuBs are fixed by the same person. We also examined PMD's performance when it came to flagging lines containing SStuBs, and found that, similarly to SpotBugs, it is insufficient for finding these types of errors. Examining the life-cycle of such bugs allows us to better understand their nature and to adjust our development processes and quality assurance methods to better support avoiding them.
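To make the notion of a "simple, stupid bug" concrete, the following is an illustrative sketch (not an example taken from the dataset) of one common SStuB pattern, a wrong binary operator: the buggy and fixed versions differ by a single token, the code compiles and mostly behaves, which is exactly why such mistakes can stay unnoticed for a long time. The class and method names are hypothetical.

```java
// Hypothetical illustration of a single-statement bug (SStuB) of the
// "changed binary operator" kind: `<=` written where `<` was intended.
public class SStubExample {

    // Buggy: `i <= values.length` reads one element past the end and
    // throws ArrayIndexOutOfBoundsException on the final iteration.
    static int sumBuggy(int[] values) {
        int sum = 0;
        for (int i = 0; i <= values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Fixed: the entire fix is the one-token change `<=` -> `<`,
    // typical of the fixes recorded in ManySStuBs4J.
    static int sumFixed(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }
}
```

Note that the buggy variant only fails at run time on a path that actually executes the loop, so a test suite that never exercises it will pass, and the defect survives until someone trips over it.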
