Temporal Patterns of Software Evolution Defects: A Comparative Analysis of Open Source and Closed Source Projects

This study examines temporal patterns of software system defects using the Autoregressive Integrated Moving Average (ARIMA) approach. Defect reports from ten software application projects are analyzed; five of these projects are open source and five are closed source from two software vendors. Across all sampled projects, the ARIMA time series modeling technique provides accurate estimates of reported defects during software maintenance, although the model parameterization depends on the organization. In contrast to causal models that require extraction of source-code level metrics, this approach is based on readily available defect report data and is less computationally intensive. It can be used to improve resource allocation decisions for software maintenance and evolution and to identify outlier projects, that is, to provide evidence of unexpected defect reporting patterns that may indicate troubled projects.
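
To make the idea concrete, the following is a minimal sketch of one-step-ahead forecasting with an ARIMA(1,1,0) model fitted by least squares on a series of periodic defect counts. The model order, the function names, and the sample counts are illustrative assumptions; they are not taken from the study, which reports that the appropriate parameterization varies by organization.

```python
from statistics import mean


def difference(series):
    """First-order differencing: the 'I' step of ARIMA with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares fit of d_t = c + phi * d_{t-1} on the differenced series."""
    x, y = diffs[:-1], diffs[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    # If the lagged differences are constant, the slope is undefined; use 0.
    phi = 0.0 if sxx == 0 else sum(
        (xi - mx) * (yi - my) for xi, yi in zip(x, y)
    ) / sxx
    c = my - phi * mx
    return c, phi


def forecast_next(series):
    """One-step-ahead ARIMA(1,1,0) forecast of the next defect count."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    # Predict the next difference, then undo the differencing.
    return series[-1] + c + phi * diffs[-1]


# Hypothetical monthly defect report counts (made-up data for illustration).
monthly_defects = [12, 15, 14, 18, 21, 19, 24, 26, 25, 29, 31, 30]
print(round(forecast_next(monthly_defects), 1))
```

A forecast of this kind needs only the defect report counts themselves, which is the practical advantage the abstract describes over models that depend on source-code metrics. A reported count that deviates strongly from the forecast could then be examined as a possible sign of an outlier project.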
