Testing the theory of relative defect proneness for closed-source software

Recent studies on open-source software (OSS) products report that smaller modules are proportionally more defect-prone than larger ones. This phenomenon, referred to as the Theory of Relative Defect Proneness (RDP), challenges traditional QA approaches that give higher priority to larger modules, and it is attracting growing interest from closed-source software (CSS) practitioners. In this paper, we report the findings of a study in which we tested the theory of RDP using ten CSS products. The results clearly confirm the theory of RDP. We also demonstrate the practical implications of this theory in terms of defect-detection effectiveness. This study therefore not only makes a research contribution by rigorously testing a scientific theory on a different category of software products, but also provides useful insights and evidence to practitioners for revising their existing QA practices.
