Structural Risk Minimization-Driven Genetic Programming for Enhancing Generalization in Symbolic Regression

Generalization ability, which reflects how well a learned model predicts unseen data, is an important property in genetic programming (GP) for symbolic regression. Structural risk minimization (SRM) is a framework that provides a reliable estimate of the generalization performance of prediction models. Introducing this framework into GP has the potential to drive the evolutionary process toward models that generalize well. However, doing so is challenging because the Vapnik–Chervonenkis (VC) dimension of nonlinear models is difficult to obtain. To address this difficulty, this paper proposes an SRM-driven GP approach which, for the first time, uses an experimental method (instead of theoretical estimation) to measure the VC dimension of a mixture of linear and nonlinear regression models. The experimental method is carried out under uniform and nonuniform settings. The results show that the proposed method achieves substantial generalization gains over standard GP and GP with the 0.632 bootstrap, and that the nonuniform setting improves further on the uniform setting. Further analyses reveal that the proposed method evolves more compact models, and that the behavioral difference between these compact models and the target models is much smaller than that of the models evolved by the other GP methods.
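To make the role of the VC dimension concrete, the sketch below computes a VC-penalized risk estimate using one commonly cited practical form of the VC generalization bound for regression, in which the empirical risk is divided by (1 - sqrt(p - p ln p + ln n / (2n))) with p = h/n. The abstract does not state which form of the bound the paper uses, so the formula, the function name `vc_penalized_risk`, and its parameters are assumptions made purely for illustration.

```python
import math


def vc_penalized_risk(empirical_risk: float, vc_dim: float, n_samples: int) -> float:
    """Estimate the generalization risk from the training error (illustrative sketch).

    Assumed form of the bound:
        R_est = R_emp / (1 - sqrt(p - p*ln(p) + ln(n) / (2n))),  p = h / n
    where h is the (estimated) VC dimension and n the number of training samples.
    Returns infinity when the bound is vacuous (denominator not positive).
    """
    if vc_dim <= 0 or n_samples <= 0:
        raise ValueError("vc_dim and n_samples must be positive")
    p = vc_dim / n_samples
    if p >= 1.0:
        # Model capacity is at least the sample size: the bound gives no guarantee.
        return math.inf
    inner = p - p * math.log(p) + math.log(n_samples) / (2 * n_samples)
    denom = 1.0 - math.sqrt(inner)
    if denom <= 0.0:
        return math.inf
    return empirical_risk / denom


# Hypothetical fitness comparison: same training error, different capacity.
simple_model = vc_penalized_risk(empirical_risk=1.0, vc_dim=5, n_samples=100)
complex_model = vc_penalized_risk(empirical_risk=1.0, vc_dim=30, n_samples=100)
print(simple_model, complex_model)
```

Under this assumed bound, two models with equal training error are ranked by their measured VC dimension, which is why an SRM-driven fitness function would tend to favor compact models.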
