Software costs are a significant concern for the Department of Defense (DoD) and other government agencies. To better understand and control these costs, agencies often use parametric cost models for software development cost and schedule estimation. However, the accuracy of these models is poor when the default values embedded in the models are used [1]. Even after the models are calibrated to DoD databases, most have been shown to be accurate to within only 25 percent of actual cost or schedule about half the time. For example, Robert Thibodeau [2] reported the accuracy of early versions of the PRICE S and SLIM models to be within 25 and 30 percent, respectively, on military ground programs. The IIT Research Institute [3] reported similar results on eight Ada programs, with the most accurate model within only 30 percent of actual cost or schedule 62 percent of the time.

Furthermore, the level of accuracy reported by these studies is likely overstated because most failed to use holdout samples to validate the calibrated models. Instead of reserving part of the database for validation, the same data used to calibrate the models were used to assess their accuracy [4]. In a study using 28 military ground software data points, Gerald Ourada [5] showed that failure to use a holdout sample overstates a model's accuracy. Half of the data was used to calibrate the Air Force's REVIC model; the remaining half was used to validate the calibrated model. REVIC was accurate to within 30 percent, 57 percent of the time on the calibration subset, but only 28 percent of the time on the validation subset. Validating on a holdout sample is clearly more relevant because new programs being estimated are, by definition, not in the calibration database.

The purpose of this project was to calibrate and properly evaluate the accuracy of selected software cost estimation models using holdout samples. The expectation is that calibration improves the estimating accuracy of a model [6].
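The accuracy figures quoted above are PRED-style metrics: the fraction of programs whose estimate falls within a given percentage of actual cost. The calibration-versus-holdout comparison can be sketched as below. This is a minimal illustration, not any of the models studied: it assumes a simple power-law form (effort = a * size^b, the general shape used by models such as COCOMO) and uses synthetic data, not the SMC/ESC databases.

```python
import math
import random

def fit_power_law(points):
    """Least-squares fit of log(effort) = log(a) + b*log(size)."""
    xs = [math.log(s) for s, _ in points]
    ys = [math.log(e) for _, e in points]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    a = math.exp(ybar - b * xbar)
    return a, b

def pred(points, a, b, tolerance=0.25):
    """PRED(tolerance): fraction of estimates within `tolerance` of actual."""
    hits = sum(1 for s, e in points if abs(a * s ** b - e) / e <= tolerance)
    return hits / len(points)

random.seed(1)
# Synthetic (size in KSLOC, effort in person-months) observations;
# 28 points, echoing the size of Ourada's sample.
sizes = [random.uniform(5, 200) for _ in range(28)]
data = [(s, 3.0 * s ** 1.1 * random.uniform(0.7, 1.3)) for s in sizes]

random.shuffle(data)
calibration, holdout = data[:14], data[14:]   # 50/50 split, as in Ourada's study
a, b = fit_power_law(calibration)

# Accuracy measured on the calibration subset is typically optimistic
# relative to the holdout subset -- the overstatement the paper warns about.
print(f"PRED(25) on calibration subset: {pred(calibration, a, b):.2f}")
print(f"PRED(25) on holdout subset:     {pred(holdout, a, b):.2f}")
```

Because the model's parameters are tuned to the calibration subset, its in-sample PRED score tends to exceed the holdout score, which is why validation on reserved data is the relevant measure for new programs.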
This paper describes the results of a long-term project at the Air Force Institute of Technology to calibrate and validate selected software cost estimation models. Two Air Force product centers provided software databases: the Space and Missile Systems Center (SMC) and the Electronic Systems Center (ESC). The project has been nicknamed the "Decalogue project" because 10 master's theses extensively document the procedures and results of calibrating each software cost estimation model. The Decalogue project is organized into three …
[1] Wayne A. Bernheisel. Calibration and Validation of the COCOMO II.1997.0 Cost/Schedule Estimating Model to the Space and Missile Systems Center Database, 1997.
[2] Karen R. Mertes, et al. Calibration of the Checkpoint Model to the Space and Missile Systems Center (SMC) Software Database (SWDB), 1996.
[3] David S. Christensen, et al. Software Cost Model Calibration and Validation—An Air Force Case Study, 1997.
[4] Daniel V. Ferens, et al. Software Cost Estimating Models: A Calibration, Validation, and Comparison, 1992.
[5] Thomas C. Shrum. Calibration and Validation of the Checkpoint Model to the Air Force Electronic Systems Center Software Database, 2012.
[6] Robert T. Hughes, et al. Expert judgement as an estimating method, Inf. Softw. Technol., 1996.
[7] Robert K. Kressin. Calibration of the Software Life Cycle Model (SLIM) to the Space and Missile Systems Center (SMC) Software Database (SWDB), 1995.
[8] Steven V. Southwell. Calibration of the Softcost-R Software Cost Model to the Space and Missile Systems Center (SMC) Software Database (SWDB), 1996.
[9] Betty G. Webber, et al. A Calibration of the REVIC Software Cost Estimating Model, 1995.
[10] Carl D. Vegas, et al. Calibration of the Software Architecture Sizing and Estimation Tool (SASET), 1995.
[11] Chris F. Kemerer, et al. An empirical validation of software cost estimation models, CACM, 1987.
[12] David B. Marzo. Calibration and Validation of the SAGE Software Cost/Schedule Estimating System to United States Air Force Databases, 2012.