The necessity of assuring quality in software measurement data

Software measurement data is often used to build software quality classification models. The related literature has focused on developing new classification techniques and schemes with the aim of improving classification accuracy. However, the quality of the software measurement data used to build such classification models plays a critical role in their accuracy and usefulness. We present empirical case studies which demonstrate that, despite using a very large number of diverse classification techniques for building software quality classification models, classification accuracy does not improve dramatically. For example, a simple lines-of-code based classification performs comparably to more advanced classification techniques such as neural networks, decision trees, and case-based reasoning. Case studies of the NASA JM1 and KC2 software measurement datasets (obtained through the NASA Metrics Data Program) are presented. Possible factors that degrade the quality of a software measurement dataset include the presence of data noise, errors introduced by improper software data collection, the exclusion of software metrics that are more representative indicators of software quality, and the improper recording of software fault data. This empirical study shows that, instead of searching for a classification technique that performs well for a given software measurement dataset, software quality and development teams should focus on improving the quality of the software measurement dataset itself.
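To make the comparison concrete, the following is a minimal sketch of the kind of result the case studies report: a one-threshold lines-of-code classifier set against a decision tree on the same data. This is not the authors' exact procedure or the NASA MDP schema; the synthetic data, the column meanings, and the threshold search are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# compare a one-threshold lines-of-code classifier against a decision tree
# on synthetic stand-in data for a metrics dataset such as JM1/KC2.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical module-level data: size in lines of code, plus a noisy
# binary fault-proneness label that is only loosely tied to size.
loc = rng.integers(10, 2000, size=1000)
defective = (loc + rng.normal(0, 400, size=1000) > 900).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    loc.reshape(-1, 1), defective, test_size=0.3, random_state=0)

# LOC-only classifier: pick the training threshold that maximizes accuracy,
# then label any module larger than it as fault-prone.
thresholds = np.unique(X_train)
best_t = max(thresholds,
             key=lambda t: np.mean((X_train.ravel() > t) == y_train))
loc_acc = accuracy_score(y_test, (X_test.ravel() > best_t).astype(int))

# A more elaborate learner trained on the same single feature.
tree = DecisionTreeClassifier(max_depth=5, random_state=0)
tree.fit(X_train, y_train)
tree_acc = accuracy_score(y_test, tree.predict(X_test))

print(f"LOC threshold ({best_t}): accuracy = {loc_acc:.3f}")
print(f"Decision tree:           accuracy = {tree_acc:.3f}")
```

On a single noisy size measure like this, the simple threshold and the tree typically land within a few accuracy points of each other: when label noise dominates, no amount of modeling sophistication recovers the lost signal, which is the abstract's point about data quality being the binding constraint.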
