Building software quality classification trees: approach, experimentation, evaluation

A methodology is proposed for generating an optimum software quality classification tree that uses software complexity metrics to discriminate between high-quality and low-quality modules. Tree generation is an application of AIC (Akaike Information Criterion) procedures to the binomial distribution. AIC procedures combine maximum likelihood estimation with a preference for the smallest number of complexity metrics. The methodology improves on the software quality classification tree generation method proposed by Porter and Selby (1990) in that the number of complexity metrics is minimized. Their method has two problems: the software quality prediction model is unstable because it reflects observational errors in real data too strongly, and there is no objective criterion for deciding whether a discrimination is appropriate at a deep nesting level of the classification tree, where the number of sample modules becomes small. To solve these problems, a new metric is introduced and its validity is verified theoretically and experimentally. In our examples, complexity metrics for software written in C are investigated, such as lines of source code, Halstead's (1977) software science, McCabe's (1976) cyclomatic number, Henry and Kafura's (1981) fan-in/out, and Howatt and Baker's (1989) scope number. Our experiments with a medium-sized piece of software (85 thousand lines of source code; 562 samples) show that, compared with the conventional classification tree, the tree generated with our new metric identifies the target class of the observed modules more efficiently, using the minimum number of complexity metrics, without any significant decrease in the correct classification ratio (76% to 72%).
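As a rough illustration of how an AIC-based criterion can decide whether a split in a classification tree is justified, the following minimal sketch compares the AIC of an unsplit node (one estimated proportion of low-quality modules) with the AIC of its two children (two estimated proportions). It uses the standard definition AIC = -2 log L + 2k with a per-module Bernoulli likelihood. The function names and the split rule are assumptions made for illustration; the abstract does not give the authors' exact formulation or their new metric.

```python
import math


def bernoulli_log_likelihood(n_low, n_total):
    """Maximized log-likelihood of a node with n_low low-quality modules
    out of n_total, using the maximum likelihood estimate p = n_low / n_total.
    Each module is treated as an independent Bernoulli trial."""
    if n_total == 0:
        return 0.0
    p = n_low / n_total
    ll = 0.0
    if n_low > 0:
        ll += n_low * math.log(p)
    if n_low < n_total:
        ll += (n_total - n_low) * math.log(1.0 - p)
    return ll


def aic(log_likelihood, n_params):
    """Standard AIC: -2 * log-likelihood + 2 * number of free parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params


def split_improves_aic(left, right):
    """Hypothetical split rule (an assumption, not the paper's procedure).

    left and right are (n_low, n_total) pairs for the two child nodes.
    Returns True when the two-parameter split model has a lower AIC
    than the one-parameter unsplit model."""
    parent = (left[0] + right[0], left[1] + right[1])
    aic_parent = aic(bernoulli_log_likelihood(*parent), 1)
    aic_split = aic(
        bernoulli_log_likelihood(*left) + bernoulli_log_likelihood(*right), 2
    )
    return aic_split < aic_parent
```

With many modules and clearly different proportions, for example `split_improves_aic((40, 50), (5, 50))`, the split lowers the AIC. With only a handful of modules, for example `split_improves_aic((2, 3), (1, 3))`, the penalty for the extra parameter outweighs the gain in likelihood and the split is rejected, which is the kind of objective stopping rule the abstract calls for at deep nesting levels.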

[1] Claude E. Shannon, et al. Prediction and Entropy of Printed English, 1951.

[2] J. Kmenta, et al. Multivariate Statistical Methods for Business and Economics, 1974.

[3] Anas N. Al-Rabadi, et al. A comparison of modified reconstructability analysis and Ashenhurst-Curtis decomposition of Boolean functions, 2004.

[4] Maurice H. Halstead, et al. Elements of software science (Operating and programming systems series), 1977.

[5] Sallie M. Henry, et al. Software Structure Metrics Based on Information Flow, 1981, IEEE Transactions on Software Engineering.

[6] Wei-Tek Tsai, et al. A tool for discriminant analysis and classification of software metrics, 1987.

[7] H. Akaike. Factor analysis and AIC, 1987.

[8] S. Henry, et al. A methodology for integrating maintainability using software metrics, 1989, Proceedings. Conference on Software Maintenance - 1989.

[9] Rigorous definition and analysis of program complexity measures: An example using nesting, 1989, J. Syst. Softw.

[10] Adam A. Porter, et al. Empirically guided software development using metric-based classification trees, 1990, IEEE Software.

[11] Carma McClure, et al. The three Rs of software automation: re-engineering, repository, reusability, 1992.

[12] Taghi M. Khoshgoftaar, et al. The Detection of Fault-Prone Programs, 1992, IEEE Trans. Software Eng.

[13] Abhijit S. Pandya, et al. A neural network modeling methodology for the detection of high-risk programs, 1993, Proceedings of 1993 IEEE International Symposium on Software Reliability Engineering.

[14] Taghi M. Khoshgoftaar, et al. Dynamic system complexity, 1993, [1993] Proceedings First International Software Metrics Symposium.

[15] Paul W. Oman, et al. Using metrics to evaluate software system maintainability, 1994, Computer.

[16] Norman F. Schneidewind, et al. Validating metrics for ensuring Space Shuttle flight software quality, 1994, Computer.

[17] Ryouei Takahashi, et al. Discriminative efficiency methodology for validating software quality classification models, 1995, Systems and Computers in Japan.

[18] Marvin V. Zelkowitz, et al. Complexity Measure Evaluation and Selection, 1995, IEEE Trans. Software Eng.

[19] William M. Evanco. Poisson Models for Subprogram Defect Analyses, 1996.

[20] Yukihiro Nakamura, et al. The effect of interface complexity on program error density, 1996, 1996 Proceedings of International Conference on Software Maintenance.

[21] Ryouei Takahashi, et al. Software quality classification model based on McCabe's complexity measure, 1997, J. Syst. Softw.

[22] William M. Evanco, et al. Poisson analyses of defects for small software components, 1997, J. Syst. Softw.