Adaptive weighted splines: a new representation to genetic programming for symbolic regression

Genetic Programming for Symbolic Regression is prone to overfitting the training data, resulting in poor generalization on unseen data. To address this issue, much research has been devoted to regularization by controlling model complexity. However, because individuals use an unstructured tree-based representation, model complexity cannot be computed directly and must instead be approximated. This paper proposes a new representation called Adaptive Weighted Splines, which uses splines to enable explicit control over the complexity of individuals. The experimental results confirm that the new representation is significantly better than the tree-based representation at avoiding overfitting, showing notably better and far more consistent generalization performance on all the benchmark problems. Further analysis shows that, in most cases, the new Genetic Programming method outperforms classical regression techniques such as Linear Regression, Support Vector Regression, K-Nearest Neighbour and Decision Tree Regression, and performs competitively with the state-of-the-art ensemble regression methods Random Forests and Gradient Boosting.
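To make the idea of explicit complexity control concrete, the following is a minimal sketch and not the paper's method: it assumes piecewise-linear splines, a crude bin-averaging fit, and one spline per input feature combined by a weighted sum. The names `linear_spline`, `fit_feature_spline`, `weighted_spline_model` and `train_mse` are hypothetical. The point it illustrates is that the knot count is a complexity measure that is set directly, rather than estimated from a tree.

```python
from bisect import bisect_right


def linear_spline(knots_x, knots_y):
    """Piecewise-linear function through sorted knots, constant outside them."""
    def f(x):
        if x <= knots_x[0]:
            return knots_y[0]
        if x >= knots_x[-1]:
            return knots_y[-1]
        i = bisect_right(knots_x, x) - 1  # index of the knot at or left of x
        x0, x1 = knots_x[i], knots_x[i + 1]
        y0, y1 = knots_y[i], knots_y[i + 1]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return f


def fit_feature_spline(xs, ys, n_knots):
    """Assumed fitting rule: split the x-sorted data into n_knots groups of
    near-equal size and place one knot at each group's mean (x, y).
    Requires distinct x values and n_knots <= len(xs)."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    kx, ky = [], []
    for k in range(n_knots):
        group = pairs[k * n // n_knots:(k + 1) * n // n_knots]
        kx.append(sum(p[0] for p in group) / len(group))
        ky.append(sum(p[1] for p in group) / len(group))
    return linear_spline(kx, ky)


def weighted_spline_model(splines, weights):
    """Prediction is the weighted sum of one spline per input feature."""
    def predict(row):
        return sum(w * s(v) for w, s, v in zip(weights, splines, row))
    return predict


# Toy data: a single feature with y = x^2 on [0, 2].
xs = [i / 10 for i in range(21)]
ys = [x * x for x in xs]


def train_mse(n_knots):
    """Training error of a one-feature model whose complexity is n_knots."""
    model = weighted_spline_model([fit_feature_spline(xs, ys, n_knots)], [1.0])
    return sum((model([x]) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


for n_knots in (2, 7):
    print(n_knots, train_mse(n_knots))
```

In this toy setting, raising the knot count lowers the training error, so the knot count acts as a single dial between underfitting and overfitting. An evolutionary search could tune such a dial per feature, although how the paper encodes and evolves its individuals is not stated in the abstract.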
