Using Support Vector Regression for Web Development Effort Estimation

The objective of this paper is to investigate the use of Support Vector Regression (SVR) for Web development effort estimation using a cross-company data set. Four SVR kernels were used (linear, polynomial, Gaussian, and sigmoid), and two preprocessing strategies were applied to the variables, namely normalization and logarithmic transformation. Hold-out validation was carried out for all eight configurations, using a training set and a validation set drawn from the Tukutuku data set. Our results suggest that the predictions obtained with the linear kernel after a logarithmic transformation of the variables (LinLog) are significantly better than those obtained with the other configurations. In addition, SVR was compared with traditional estimation techniques, such as Manual StepWise Regression, Case-Based Reasoning, and Bayesian Networks. Our results suggest that SVR with the LinLog configuration can provide significantly better prediction accuracy than the other techniques.
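As a rough illustration of the eight configurations described above, the following sketch defines the four kernel functions and the two preprocessing strategies in plain Python and enumerates their combinations. It is not the authors' implementation and does not train an SVR model. The hyperparameters (gamma, coef0, degree), the min-max form of normalization, the use of ln(v + 1), and every configuration label other than LinLog are assumptions made for illustration only.

```python
import math
from itertools import product


def linear(x, y):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial(x, y, gamma=1.0, coef0=1.0, degree=2):
    """Polynomial kernel; gamma, coef0 and degree are assumed values."""
    return (gamma * linear(x, y) + coef0) ** degree


def gaussian(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid(x, y, gamma=1.0, coef0=0.0):
    """Sigmoid kernel; gamma and coef0 are assumed values."""
    return math.tanh(gamma * linear(x, y) + coef0)


def normalize(column):
    """Min-max scaling to [0, 1] (assumed form of normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def log_transform(column):
    """Natural log; ln(v + 1) is assumed here so that zeros are allowed."""
    return [math.log(v + 1) for v in column]


# Short labels are hypothetical, chosen so that linear + log gives "LinLog".
KERNELS = {"Lin": linear, "Pol": polynomial, "Gau": gaussian, "Sig": sigmoid}
PREPROCESSORS = {"Norm": normalize, "Log": log_transform}

# Four kernels x two preprocessing strategies = eight configurations.
CONFIGURATIONS = [k + p for k, p in product(KERNELS, PREPROCESSORS)]

if __name__ == "__main__":
    print(CONFIGURATIONS)
    print(normalize([2, 4, 6]))
    print(KERNELS["Lin"]([1, 2], [3, 4]))
```

In a hold-out design, the scaling bounds would normally be computed on the training set and then reused unchanged on the validation set, so that no information from the validation projects leaks into the model.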
