Symbolic regression driven by training data and prior knowledge

In symbolic regression, the search for analytic models is typically driven purely by the prediction error observed on the training data samples. However, when the data samples do not sufficiently cover the input space, the prediction error alone does not provide enough guidance toward the desired models. Standard symbolic regression techniques then yield models that are partially incorrect, for instance in terms of their steady-state characteristics or local behavior. If these properties were taken into account during the search itself, more accurate and relevant models could be produced. We propose a multi-objective symbolic regression approach that is driven by both the training data and prior knowledge of the properties the desired model should exhibit. The properties, given in the form of formal constraints, are internally represented by a set of discrete data samples on which candidate models are exactly checked. The proposed approach was experimentally evaluated on three test problems. The results clearly demonstrate its capability to evolve realistic models that fit the training data well while also complying with the prior knowledge of the desired model characteristics. It outperforms standard symbolic regression by several orders of magnitude in terms of the mean squared deviation from a reference model.
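
The two-objective idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reference function, the three constraints (zero output at zero input, an upper bound of 1, and monotonicity), the constraint grid, and both candidate models are hypothetical choices made only for illustration. The sketch scores each candidate by its training error and by its total constraint violation on a set of discrete constraint samples, then compares candidates by Pareto dominance.

```python
def mse(model, samples):
    """Mean squared prediction error on (x, y) training samples."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)


def constraint_violation(model, grid):
    """Total violation of the (hypothetical) prior-knowledge constraints,
    checked on a discrete grid of constraint samples."""
    # Constraint 1: the model passes through the origin, f(0) = 0.
    violation = abs(model(0.0))
    # Constraint 2: the output never exceeds 1 on the grid.
    violation += sum(max(0.0, model(x) - 1.0) for x in grid)
    # Constraint 3: the model is non-decreasing between neighbouring grid points.
    violation += sum(max(0.0, model(a) - model(b)) for a, b in zip(grid, grid[1:]))
    return violation


def objectives(model, train, grid):
    """Objective vector (training error, constraint violation); both are minimised."""
    return (mse(model, train), constraint_violation(model, grid))


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Training data cover only [0, 1]; the constraint grid covers [0, 5].
train = [(x, x / (1.0 + x)) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
grid = [0.5 * i for i in range(11)]


def candidate_a(x):
    # Close to the data on [0, 1], but decreasing for x > 1.
    return x - 0.5 * x * x


def candidate_b(x):
    # Satisfies all three constraints on the whole grid.
    return x / (1.0 + x)


obj_a = objectives(candidate_a, train, grid)
obj_b = objectives(candidate_b, train, grid)
print("candidate_a:", obj_a)
print("candidate_b:", obj_b)
print("b dominates a:", dominates(obj_b, obj_a))
```

In this toy setting the first candidate fits the sparse training data closely yet violates the monotonicity constraint outside the sampled region, which is the kind of defect that the training error alone cannot expose. A multi-objective evolutionary algorithm would use such objective vectors to rank a whole population of candidates rather than a single pair.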
