Evolving multidimensional transformations for symbolic regression with M3GP

Multidimensional Multiclass Genetic Programming with Multidimensional Populations (M3GP) was originally proposed as a wrapper approach for supervised classification. M3GP searches for transformations of the form $k:\mathbb{R}^p \rightarrow \mathbb{R}^d$, where $p$ is the number of dimensions of the problem data and $d$ is the dimensionality of the transformed data, as determined by the search. This work extends M3GP to symbolic regression, building models that are linear in the parameters on top of the transformed data. The proposal implements a sequential memetic structure with Lamarckian inheritance, combining two local search methods: a greedy pruning algorithm and least squares parameter estimation. Experimental results on several synthetic and real-world problems show that M3GP outperforms several standard and state-of-the-art regression techniques, as well as other GP approaches: it achieves a lower RMSE than most of the compared methods and generates more parsimonious models. The performance of M3GP can be explained by the fact that it increases the maximal mutual information in the new feature space.
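To make the idea of a model that is linear in the parameters over transformed data more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hand-written transformation (the list `features`, standing in for an evolved individual), fits the coefficients by ordinary least squares with NumPy, and applies a simple greedy pruning rule that drops a dimension whenever doing so does not increase the training RMSE. The function names, the pruning criterion, the tolerance, and the synthetic target are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def transform(X, features):
    """Map X (n x p) to Z (n x d) by applying each feature function."""
    return np.column_stack([f(X) for f in features])


def fit_linear(Z, y):
    """Least squares estimate of y ~ b0 + Z @ b (intercept stored first)."""
    A = np.column_stack([np.ones(len(Z)), Z])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(Z, beta):
    return beta[0] + Z @ beta[1:]


def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def training_error(X, y, features):
    Z = transform(X, features)
    return rmse(y, predict(Z, fit_linear(Z, y)))


def greedy_prune(X, y, features, tol=1e-12):
    """Assumed pruning rule: remove a dimension if training RMSE does not grow."""
    kept = list(features)
    best = training_error(X, y, kept)
    i = 0
    while i < len(kept) and len(kept) > 1:
        trial = kept[:i] + kept[i + 1:]
        err = training_error(X, y, trial)
        if err <= best + tol:
            kept, best = trial, err  # dimension i was redundant, drop it
        else:
            i += 1  # dimension i is needed, keep it and move on
    return kept, best


# Synthetic, noise-free data with p = 2 input dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] * X[:, 1] - np.sin(X[:, 0])

# Hypothetical transformation with d = 3; the third dimension duplicates the first.
features = [
    lambda X: X[:, 0] * X[:, 1],
    lambda X: np.sin(X[:, 0]),
    lambda X: X[:, 0] * X[:, 1],
]

kept, err = greedy_prune(X, y, features)
beta = fit_linear(transform(X, kept), y)
print(len(kept), err, beta)
```

In this toy case the duplicated dimension is pruned, so the final model uses two transformed features plus an intercept. In M3GP the transformation itself is evolved by the GP search; here it is fixed by hand only to show how the two local search steps could operate on a given individual.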
