Software quality estimation with case-based reasoning

Abstract The software quality team of a software project often strives to predict the operational quality of software modules before deployment. A timely software quality prediction can guide preventive actions that reduce the number of faults occurring during system operations. This is especially important for high-assurance systems, where software reliability is critical. The two most commonly used kinds of software quality estimation models are software fault prediction and software quality classification. Generally, such models use software metrics as predictors of a software module's quality, which is represented either by the expected number of faults or by class membership in quality-based groups. This study presents a comprehensive methodology for building software quality estimation models with case-based reasoning (CBR), a computational intelligence technique suited to experience-based analysis. A CBR system is a practical option for software quality modeling because it uses an organization's previous experience with its software development process to estimate the quality of a software project currently under development. Using software metrics and quality data collected from a high-assurance software system, software fault prediction and software quality classification models are built. The former predict the number of faults in software modules, while the latter predict whether each module belongs to the fault-prone or the not fault-prone group. This study presents in-depth details of the CBR models to facilitate a comprehensive understanding of CBR technology as applied to software quality estimation.
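
The general idea of estimating a module's quality from similar past modules can be illustrated with a minimal sketch. It is not the study's actual model: the abstract does not state the similarity function, the number of neighbors, or the classification rule, so the standardized Euclidean distance, the value of `k`, the unweighted average, the `threshold`, and all data values below are assumptions chosen only for illustration.

```python
import math

def fit_scaler(metric_rows):
    """Per-metric mean and standard deviation over the case library."""
    n_metrics = len(metric_rows[0])
    n_cases = len(metric_rows)
    means = [sum(row[i] for row in metric_rows) / n_cases for i in range(n_metrics)]
    stds = [
        # Fall back to 1.0 so a constant metric does not cause division by zero.
        math.sqrt(sum((row[i] - means[i]) ** 2 for row in metric_rows) / n_cases) or 1.0
        for i in range(n_metrics)
    ]
    return means, stds

def scale(row, means, stds):
    """Standardize one metric vector so no single metric dominates the distance."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def predict_faults(library, target, k=3):
    """Fault prediction: average the fault counts of the k most similar past modules.

    `library` is a list of (metrics, faults) pairs from earlier projects.
    Euclidean distance and an unweighted mean are assumptions of this sketch.
    """
    means, stds = fit_scaler([metrics for metrics, _ in library])
    scaled_target = scale(target, means, stds)
    ranked = sorted(
        library,
        key=lambda case: math.dist(scale(case[0], means, stds), scaled_target),
    )
    nearest = ranked[:k]
    return sum(faults for _, faults in nearest) / len(nearest)

def classify(library, target, k=3, threshold=2.0):
    """Quality classification: label a module by comparing its estimate to a cutoff.

    The threshold of 2.0 faults is hypothetical, not a value from the study.
    """
    estimate = predict_faults(library, target, k)
    return "fault-prone" if estimate >= threshold else "not fault-prone"

# Hypothetical case library: ([lines of code, cyclomatic complexity], faults found).
library = [
    ([100, 5], 0),
    ([120, 6], 1),
    ([110, 4], 0),
    ([900, 40], 7),
    ([950, 45], 9),
]

small_module = [130, 7]
large_module = [920, 42]
print(predict_faults(library, small_module, k=3), classify(library, small_module, k=3))
print(predict_faults(library, large_module, k=2), classify(library, large_module, k=2))
```

The same case library serves both model types in this sketch: the fault prediction is the numeric estimate itself, and the classification applies a cutoff to that estimate. Other choices, such as distance-weighted averaging or a majority vote among the neighbors' classes, would fit the same structure.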
