Practical development of an Eclipse-based software fault prediction tool using Naive Bayes algorithm

Despite the effort software engineers have put into developing fault prediction models, software fault prediction still poses great challenges. Research in this area, using machine learning and statistical techniques, has been ongoing for 15 years, yet no breakthrough has emerged. Unfortunately, none of these prediction models has achieved widespread applicability in the software industry, because of a lack of software tools that automate the prediction process. Historical project data that includes software faults, combined with a robust software fault prediction tool, can enable quality managers to focus on fault-prone modules and thereby improve the testing process. We developed an Eclipse-based software fault prediction tool for Java programs to simplify the fault prediction process. We also integrated a machine learning algorithm called Naive Bayes into the plug-in because of its proven high performance for this problem. This article presents a practical view of the software fault prediction problem, and it shows how we combined software metrics with software fault data to apply the Naive Bayes technique inside an open source platform.
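
To make the idea of combining software metrics with fault data concrete, the following is a minimal sketch of a Gaussian Naive Bayes classifier over per-module metrics. It is not the plug-in's implementation: the choice of a Gaussian likelihood, the two metrics (lines of code and cyclomatic complexity), the function names, and the data values are all assumptions made up for illustration.

```python
import math
from collections import defaultdict


def train(rows, labels):
    """Estimate a prior and per-metric Gaussian parameters for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero
        # when a metric is constant within a class.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def log_posterior(model, row):
    """Return the unnormalized log posterior of each class for one module."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        # Naive Bayes assumption: metrics are independent given the class,
        # so the per-metric log likelihoods are simply summed.
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        scores[label] = score
    return scores


def predict(model, row):
    """Pick the class with the highest log posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)


# Hypothetical historical data: (lines of code, cyclomatic complexity)
# per module, with a flag saying whether a fault was reported.
METRICS = [(120, 4), (90, 3), (150, 5), (800, 25), (650, 30), (900, 22)]
FAULTY = [False, False, False, True, True, True]

model = train(METRICS, FAULTY)
```

Under these assumptions, a call such as `predict(model, (700, 26))` classifies a new module from its metrics alone, which is the kind of output a quality manager could use to prioritize testing.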
