Evaluation and Application of Bounded Generalized Pareto Analysis to Fault Distributions in Open Source Software

Predicting and assessing the quality and reliability of developed products is one of the most important aspects of software development and project management. Project data are usually collected and analyzed systematically during development. In practice, developers benefit from identifying the most error-prone modules early, so that they can allocate testing resources better and detect faults more effectively. Many previous studies have shown that the Pareto principle applies to software systems, and some have reported that the Pareto distribution (PD) model can be used to predict the fault distribution of software. In this paper, a special form of the Generalized PD model, named the Bounded Generalized Pareto distribution (BGPD) model, is proposed to investigate the fault distributions of Open Source Software (OSS). The BGPD model avoids a problem that arises with the classical PD model. Three methods of parameter estimation are presented, and experiments are performed on real OSS failure data. The experimental results show that the BGPD model fits the actual OSS failure data well. Finally, the possibility of using limited early fault data to predict the later software fault distribution is also studied. Numerical results indicate that the BGPD model consistently produces accurate fault predictions from data gathered during the early stages of development. These findings can provide an effective foundation for managing software development and testing activities.
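As a rough illustration of the Pareto principle mentioned above, the following sketch computes the share of all faults that falls in the most fault-prone fraction of modules. It does not implement the PD or BGPD model from the paper. The function name `top_share`, the 20% cutoff, and the per-module fault counts are hypothetical and chosen only for illustration.

```python
def top_share(fault_counts, top_fraction=0.2):
    """Return the fraction of all faults located in the most fault-prone modules.

    fault_counts: one non-negative fault count per module (hypothetical input).
    top_fraction: fraction of modules to treat as the "top" group (assumed 20%).
    """
    if not fault_counts:
        raise ValueError("fault_counts must not be empty")
    total = sum(fault_counts)
    if total == 0:
        return 0.0  # no faults recorded, so no share to report
    # Rank modules from most to least faulty.
    ranked = sorted(fault_counts, reverse=True)
    # Keep at least one module in the top group.
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


# Made-up fault counts for ten modules, for illustration only.
counts = [40, 25, 10, 8, 5, 4, 3, 2, 2, 1]
print(f"Top 20% of modules hold {top_share(counts):.0%} of faults")
```

With these made-up counts, two of the ten modules account for 65 of 100 faults. A skew of this kind is what motivates fitting a Pareto-type model to per-module fault data.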
