Single and Multi Objective Genetic Programming for software development effort estimation

The idea of exploiting Genetic Programming (GP) to estimate software development effort is based on the observation that the effort estimation problem can be formulated as an optimization problem. Indeed, among the possible models, we have to identify the one providing the most accurate estimates. To this end a suitable measure to evaluate and compare different models is needed. However, in the context of effort estimation there does not exist a unique measure that allows us to compare different models but several different criteria (e.g., MMRE, Pred(25), MdMRE) have been proposed. Aiming at getting an insight on the effects of using different measures as fitness function, in this paper we analyzed the performance of GP using each of the five most used evaluation criteria. Moreover, we designed a Multi-Objective Genetic Programming (MOGP) based on Pareto optimality to simultaneously optimize the five evaluation measures and analyzed whether MOGP is able to build estimation models more accurate than those obtained using GP. The results of the empirical analysis, carried out using three publicly available datasets, showed that the choice of the fitness function significantly affects the estimation accuracy of the models built with GP and the use of some fitness functions allowed GP to get estimation accuracy comparable with the ones provided by MOGP.

[1]  Sun-Jen Huang,et al.  Optimization of analogy weights by genetic algorithm for software effort estimation , 2006, Inf. Softw. Technol..

[2]  B. Kitchenham,et al.  Case Studies for Method and Tool Evaluation , 1995, IEEE Softw..

[3]  Isabella Wieczorek,et al.  Resource Estimation in Software Engineering , 2002 .

[4]  Stephen G. MacDonell,et al.  What accuracy statistics really measure , 2001, IEE Proc. Softw..

[5]  Thomas J. Ostrand,et al.  \{PROMISE\} Repository of empirical software engineering data , 2007 .

[6]  Kalyanmoy Deb,et al.  A fast and elitist multiobjective genetic algorithm: NSGA-II , 2002, IEEE Trans. Evol. Comput..

[7]  Rajarshi Das,et al.  A Study of Control Parameters Affecting Online Performance of Genetic Algorithms for Function Optimization , 1989, ICGA.

[8]  John R. Koza,et al.  Genetic programming (videotape): the movie , 1992 .

[9]  John A. Clark,et al.  Metrics are fitness functions too , 2004, 10th International Symposium on Software Metrics, 2004. Proceedings..

[10]  Mark Harman,et al.  The Current State and Future of Search Based Software Engineering , 2007, Future of Software Engineering (FOSE '07).

[11]  Vidroha Debroy,et al.  Genetic Programming , 1998, Lecture Notes in Computer Science.

[12]  Lionel C. Briand,et al.  Solving the Class Responsibility Assignment Problem in Object-Oriented Analysis with Multi-Objective Genetic Algorithms , 2010, IEEE Transactions on Software Engineering.

[13]  Martin J. Shepperd,et al.  Using Genetic Programming to Improve Software Effort Estimation Based on General Data Sets , 2003, GECCO.

[14]  Emilia Mendes,et al.  How effective is Tabu search to configure support vector regression for effort estimation? , 2010, PROMISE '10.

[15]  José Javier Dolado,et al.  On the problem of the software cost function , 2001, Inf. Softw. Technol..

[16]  Daryl Essam,et al.  Software project effort estimation using genetic programming , 2002, IEEE 2002 International Conference on Communications, Circuits and Systems and West Sino Expositions.

[17]  Martin J. Shepperd,et al.  Estimating Software Project Effort Using Analogies , 1997, IEEE Trans. Software Eng..

[18]  Y. Miyazaki,et al.  Robust regression for developing software estimation models , 1994, J. Syst. Softw..

[19]  Gary B. Lamont,et al.  Evolutionary Algorithms for Solving Multi-Objective Problems (Genetic and Evolutionary Computation) , 2006 .

[20]  Lionel C. Briand,et al.  An assessment and comparison of common software cost estimation modeling techniques , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[21]  H. E. Dunsmore,et al.  Software engineering metrics and models , 1986 .

[22]  Jacob Cohen Statistical Power Analysis for the Behavioral Sciences , 1969, The SAGE Encyclopedia of Research Design.

[23]  Yuanyuan Zhang,et al.  The multi-objective next release problem , 2007, GECCO '07.

[24]  Karen T. Lum,et al.  Selecting Best Practices for Effort Estimation , 2006, IEEE Transactions on Software Engineering.

[25]  Emilia Mendes,et al.  Bayesian Network Models for Web Effort Prediction: A Comparative Study , 2008, IEEE Transactions on Software Engineering.

[26]  Colin J Burgess,et al.  Can genetic programming improve software effort estimation? A comparative evaluation , 2001, Inf. Softw. Technol..

[27]  Filomena Ferrucci,et al.  Genetic Programming for Effort Estimation: An Analysis of the Impact of Different Fitness Functions , 2010, 2nd International Symposium on Search Based Software Engineering.

[28]  John J. Marciniak,et al.  Encyclopedia of Software Engineering , 1994, Encyclopedia of Software Engineering.

[29]  Lionel C. Briand,et al.  Modeling Development Effort in Object-Oriented Systems Using Design Properties , 2001, IEEE Trans. Software Eng..