Bug Prediction Model using Code Smells

The term ‘code smells’ was coined in the book Refactoring: Improving the Design of Existing Code by M. Fowler in 1999. Code smells are poor design choices that have the potential to cause an error or failure in a computer program. The objective of this study is to evaluate code smells as a candidate metric for building a bug prediction model. We built a bug prediction model using both source code metrics and code smell based metrics proposed in the literature, with Naive Bayes, Random Forest, and Logistic Regression as candidate algorithms. We trained the model on multiple versions of 13 different Java based open source projects. The trained model was used to predict bugs within a particular version of a project, within a particular project, and across different projects. We demonstrate that code smell based metrics can significantly improve the accuracy of a bug prediction model when integrated with source code metrics. The Random Forest based model showed higher accuracy than the other algorithms within a version, within a project, and across projects.
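
The three evaluation settings named above (within a version, within a project, and across projects) can be pictured as three ways of splitting the same data set. The following is only a minimal sketch under stated assumptions, not the study's actual protocol: the `Row` layout, the project names, the feature names, and the rule that "within a project" means training on earlier releases are all hypothetical choices made for illustration.

```python
from collections import namedtuple

# Hypothetical record: one class in one release of one project.
# `features` would hold source code metrics plus code smell based metrics.
Row = namedtuple("Row", ["project", "version", "features", "buggy"])

# Made-up illustration data; versions are tuples so they compare correctly.
ROWS = [
    Row("projA", (1, 0), {"loc": 120, "smells": 2}, True),
    Row("projA", (1, 0), {"loc": 40, "smells": 0}, False),
    Row("projA", (1, 0), {"loc": 75, "smells": 1}, False),
    Row("projA", (1, 1), {"loc": 130, "smells": 3}, True),
    Row("projA", (1, 1), {"loc": 42, "smells": 0}, False),
    Row("projB", (2, 0), {"loc": 300, "smells": 5}, True),
    Row("projB", (2, 0), {"loc": 20, "smells": 0}, False),
]


def within_version_split(rows, project, version, test_every=3):
    """Hold out every `test_every`-th class of a single release (assumed scheme)."""
    subset = [r for r in rows if r.project == project and r.version == version]
    train = [r for i, r in enumerate(subset) if i % test_every != 0]
    test = [r for i, r in enumerate(subset) if i % test_every == 0]
    return train, test


def within_project_split(rows, project, test_version):
    """Train on earlier releases of one project, test on `test_version` (assumed scheme)."""
    train = [r for r in rows if r.project == project and r.version < test_version]
    test = [r for r in rows if r.project == project and r.version == test_version]
    return train, test


def cross_project_split(rows, target_project):
    """Train on all other projects, test on the target project."""
    train = [r for r in rows if r.project != target_project]
    test = [r for r in rows if r.project == target_project]
    return train, test
```

Each function returns a `(train, test)` pair that could be handed to any of the three candidate classifiers; a real within-version evaluation would more likely use randomized or stratified cross-validation than the fixed stride used here.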
