Evaluation of Several Nonparametric Bootstrap Methods to Estimate Confidence Intervals for Software Metrics

In typical data-analytic situations, sample statistics and model parameters are used to infer the properties, or characteristics, of the underlying population. Confidence intervals provide an estimate of the range within which the true value of the statistic lies. A narrow confidence interval implies low variability of the statistic, justifying a strong conclusion from the analysis. Many statistics used in software metrics analysis do not come with theoretical formulas that allow such an accuracy assessment. The Efron bootstrap appears to address this weakness. In this paper, we present an empirical analysis of the reliability of several Efron nonparametric bootstrap methods in assessing the accuracy of sample statistics in the context of software metrics. A brief review of the basic concepts behind the various methods available for estimating statistical errors is provided, and the stated advantages of the Efron bootstrap are discussed. Several different bootstrap algorithms are validated across basic software metrics in both simulated and industrial software engineering contexts. The 90 percent confidence intervals for the mean, median, and Spearman correlation coefficients were accurately predicted. The 90 percent confidence intervals for the variance and Pearson correlation coefficients were typically underestimated (behaving as 60-70 percent confidence intervals), and those for skewness and kurtosis were overestimated (behaving as 98-100 percent confidence intervals). The bias-corrected and accelerated (BCa) bootstrap approach gave the most consistent confidence intervals, but its accuracy depended on the metric examined. A method for correcting the under-/overestimation of bootstrap confidence intervals for small data sets is suggested, but its success was found to be inconsistent across the tested metrics.
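
To make the idea of a nonparametric bootstrap confidence interval concrete, the following is a minimal sketch of the simplest variant, the percentile interval. It is not the authors' implementation, and it is not the bias-corrected and accelerated method discussed above. The function name `percentile_bootstrap_ci`, its parameters, and the sample values are all assumptions made for illustration.

```python
import random
import statistics


def percentile_bootstrap_ci(data, stat=statistics.mean, level=0.90,
                            n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Illustrative sketch only: resample the data with replacement,
    recompute the statistic on each resample, and read the interval
    limits off the sorted bootstrap estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    estimates = sorted(
        stat(rng.choices(data, k=n))  # one resample of size n, with replacement
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0  # tail probability on each side
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(int((1.0 - alpha) * n_resamples), n_resamples - 1)
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical metric values (e.g. a size measure per module); made-up data.
sample = [12, 15, 9, 30, 22, 18, 41, 7, 16, 25]
mean_ci = percentile_bootstrap_ci(sample, statistics.mean)
median_ci = percentile_bootstrap_ci(sample, statistics.median)
print("90% CI for the mean:  ", mean_ci)
print("90% CI for the median:", median_ci)
```

Any statistic that accepts a list of values can be passed as `stat`, which is the practical appeal of the approach for metrics that lack a theoretical interval formula. As the abstract notes, how well such intervals achieve their nominal coverage depends on the statistic, so the output of a sketch like this should not be assumed accurate for variance, skewness, or kurtosis on small data sets.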
