Evolving Simple Symbolic Regression Models by Multi-Objective Genetic Programming

In this chapter we examine how multi-objective genetic programming can be used to perform symbolic regression and compare its performance to that of single-objective genetic programming. Multi-objective optimization is implemented with a slightly adapted version of NSGA-II, where the optimization objectives are the model’s prediction accuracy and its complexity. Because model complexity is an explicit objective, the evolved symbolic regression models are simpler and more parsimonious than models generated by a single-objective algorithm. Furthermore, we define a new complexity measure that incorporates both syntactic and semantic information about the model while remaining efficient to compute, and we demonstrate its performance on several benchmark problems. As a result of the multi-objective approach, the appropriate model length and the functions included in the models are determined automatically, without the need to specify them a priori.
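To illustrate the two-objective setup described above, the following minimal sketch ranks candidate models into Pareto fronts over (prediction error, complexity), both minimized. It is an illustration under stated assumptions, not the chapter's implementation: the function names and candidate values are hypothetical, the complexity numbers are placeholders rather than the complexity measure proposed in the chapter, and the sort is a simple quadratic-per-front procedure rather than the fast non-dominated sort and crowding-distance selection of NSGA-II.

```python
from typing import List, Tuple

# Each candidate model is summarised by two objectives to minimise:
# (prediction_error, complexity). All values below are made up for illustration.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: List[Objectives]) -> List[List[int]]:
    """Group point indices into successive Pareto fronts (front 0 is the best)."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point belongs to the current front if no other remaining point dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical candidates: (error, complexity)
candidates = [(0.10, 25.0), (0.12, 9.0), (0.30, 3.0), (0.12, 14.0), (0.35, 7.0)]
fronts = non_dominated_sort(candidates)
print(fronts)
```

In this toy population the first front contains the most accurate model, the simplest model, and one compromise between them, which is the kind of accuracy/complexity trade-off set a multi-objective run exposes instead of a single best model.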
