A Critique of Software Defect Prediction Models

Many organizations want to predict the number of defects (faults) in software systems before they are deployed, to gauge the likely delivered quality and maintenance effort. To help with this, numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state of the art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the "quality" of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the as yet unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly, many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points, the "Goldilocks Conjecture," that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of "software decomposition" in order to test hypotheses about defect introduction and help construct a better science of software engineering.
