Enhance Rule Based Detection for Software Fault Prone Modules

Abstract Software quality assurance is necessary to increase the level of confidence in the developed software and reduce the overall cost for developing software projects. The problem addressed in this research is the prediction of fault prone modules using data mining techniques. Predicting fault prone modules allows the software managers to allocate more testing and resources to such modules. This can also imply a good investment in better design in future systems to avoid building error prone modules. Software quality models that are based upon data mining from previous projects can identify fault-prone modules in the current similar development project, once similarity between projects is established. In this paper, we applied different data mining rule-based classification techniques on several publicly available datasets of the NASA software repository (e.g. PC1, PC2, etc). The goal was to classify the software modules into either fault prone or not fault prone modules. The paper proposed a modification on the RIDOR algorithm on which the results show that the enhanced RIDOR algorithm is better than other classification techniques in terms of the number of extracted rules and accuracy. The implemented algorithm learns defect prediction using mining static code attributes. Those attributes are then used to present a new defect predictor with high accuracy and low error rate.

[1]  Chao Liu,et al.  Efficient mining of iterative patterns for software specification discovery , 2007, KDD '07.

[2]  Tim Menzies,et al.  Data Mining Static Code Attributes to Learn Defect Predictors , 2007 .

[3]  Geoffrey I. Webb,et al.  Not So Naive Bayes: Aggregating One-Dependence Estimators , 2005, Machine Learning.

[4]  Maurice H. Halstead,et al.  Elements of software science , 1977 .

[5]  Monique Snoeck,et al.  Classification With Ant Colony Optimization , 2007, IEEE Transactions on Evolutionary Computation.

[6]  Harald C. Gall,et al.  Guest editors introduction: special issue on mining software repositories , 2009, Empirical Software Engineering.

[7]  Brian R. Gaines,et al.  Induction of ripple-down rules applied to modeling large databases , 1995, Journal of Intelligent Information Systems.

[8]  David G. Stork,et al.  Pattern Classification , 1973 .

[9]  K. H. Bennett,et al.  Journal of software maintenance : research and practice , 1989 .

[10]  Jian Pei,et al.  Data Mining: Concepts and Techniques, 3rd edition , 2006 .

[11]  อนิรุธ สืบสิงห์,et al.  Data Mining Practical Machine Learning Tools and Techniques , 2014 .

[12]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[13]  Jonathan I. Maletic,et al.  Journal of Software Maintenance and Evolution: Research and Practice Survey a Survey and Taxonomy of Approaches for Mining Software Repositories in the Context of Software Evolution , 2022 .

[14]  Bart Baesens,et al.  A Stigmergy Based Approach to Data Mining , 2005, Australian Conference on Artificial Intelligence.

[15]  Anas N. Al-Rabadi,et al.  A comparison of modified reconstructability analysis and Ashenhurst‐Curtis decomposition of Boolean functions , 2004 .

[16]  Norman E. Fenton,et al.  A Critique of Software Defect Prediction Models , 1999, IEEE Trans. Software Eng..

[17]  A. Zeller,et al.  Predicting Defects for Eclipse , 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[18]  Bart Baesens,et al.  Mining software repositories for comprehensible software fault prediction models , 2008, J. Syst. Softw..

[19]  Petra Perner,et al.  Data Mining - Concepts and Techniques , 2002, Künstliche Intell..

[20]  R. Malor,et al.  Ripple down rules: possibilities and limitations , 2010 .

[21]  Norman E. Fenton,et al.  Quantitative Analysis of Faults and Failures in a Complex Software System , 2000, IEEE Trans. Software Eng..

[22]  Yixin Yin,et al.  Text Classification Based on Rule Mining by Granule Network Constructing , 2008, 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery.

[23]  S. Salzberg,et al.  INSTANCE-BASED LEARNING : Nearest Neighbour with Generalisation , 1995 .

[24]  Audris Mockus,et al.  Guest Editor's Introduction: Special Issue on Mining Software Repositories , 2005, IEEE Trans. Software Eng..

[25]  Barry W. Boehm,et al.  What we have learned about fighting defects , 2002, Proceedings Eighth IEEE Symposium on Software Metrics.