A statistical framework for analyzing the duration of software projects

The duration of a software project is an important attribute, closely related to its cost. Various methods and models have been proposed to predict not only the cost of a software project but also its duration. Since duration is essentially the random length of the time interval between a starting event and a terminating event, in this paper we present a framework of statistical tools appropriate for studying and modeling its distribution. The idea for our approach comes from the analogy between project duration and the lifetime of an entity, which is frequently studied in biostatistics with a methodology known as survival analysis. This type of analysis offers great flexibility in modeling the duration and in computing various statistics useful for inference and estimation. As with any other statistical methodology, the approach is based on datasets of measurements on projects. However, one of its most important advantages is that the data can include information not only from completed projects but also from ongoing ones. We present the general principles of the methodology for a comprehensive duration analysis and illustrate it with applications to known data sets. The analysis showed that duration is affected by various factors, such as customer participation, use of tools, software logical complexity, user requirements volatility and staff tool skills.
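To make the point about ongoing projects concrete, the following is a minimal sketch of the product-limit (Kaplan-Meier) estimator, a standard survival-analysis tool for data in which some observations are censored. The abstract does not state which estimators the authors apply, so this is only an illustration under that assumption. The function name `kaplan_meier` and the project durations are hypothetical and are not taken from the paper's data sets. An ongoing project is treated as censored: it contributes to the number of projects "at risk" up to its current age, but not to the count of completions.

```python
from collections import Counter


def kaplan_meier(durations, completed):
    """Estimate S(t) = P(duration > t) with the product-limit estimator.

    durations: observed time for each project (e.g. in months)
    completed: True if the project finished at that time,
               False if it was still ongoing (censored observation)
    Returns a list of (t, S(t)) pairs, one per distinct completion time.
    """
    # Number of completions observed at each time point.
    events = Counter(t for t, done in zip(durations, completed) if done)
    # Number of projects leaving the risk set at each time point
    # (completed or censored).
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the conditional probability of not finishing at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Completed and censored projects both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical data: six projects, two of them still ongoing.
durations = [6, 8, 8, 12, 15, 20]
completed = [True, True, False, True, False, True]

curve = kaplan_meier(durations, completed)
for t, s in curve:
    print(f"t = {t:>2} months: estimated P(duration > t) = {s:.3f}")
```

In this sketch the two ongoing projects (observed at 8 and 15 months) do not produce steps in the curve, yet they enlarge the risk set at earlier times, which is how partial information from unfinished projects enters the estimate.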
