Localized active learning of Gaussian process state space models

The performance of learning-based control techniques depends crucially on how effectively the system is explored. Most exploration techniques aim to achieve a globally accurate model, but such approaches are generally unsuited for systems with unbounded state spaces. Furthermore, many common control applications, e.g., local stabilization tasks, do not require a globally accurate model to achieve good performance. In this paper, we propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space. Our approach seeks to maximize the mutual information of the exploration trajectories with respect to a discretization of the region of interest. By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy. To keep the computation tractable, we decouple the choice of the most informative data points from the model predictive control optimization step, which yields two optimization problems that can be solved in parallel. We apply the proposed method to explore the state space of various dynamical systems and compare it to a commonly used entropy-based exploration strategy. In all experiments, our method yields a better model within the region of interest than the entropy-based method.
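
To make the idea of selecting data points that are informative about a discretized region of interest concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a zero-mean Gaussian process with a squared-exponential kernel, a one-dimensional input, and a plain greedy selection rule; the function names (`rbf_kernel`, `mutual_information`, `greedy_select`), the hyperparameters, and the example grids are all hypothetical. The model predictive control step that would steer the system to the selected points is omitted.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def mutual_information(x, z, noise=1e-2, jitter=1e-8):
    """I(y_x; f_z): information that noisy observations at x carry about
    the latent function values at the reference points z (assumed GP prior)."""
    k_xx = rbf_kernel(x, x)
    k_xz = rbf_kernel(x, z)
    k_zz = rbf_kernel(z, z) + jitter * np.eye(len(z))
    # Covariance of f_x after conditioning on f_z.
    cond = k_xx - k_xz @ np.linalg.solve(k_zz, k_xz.T)
    noise_cov = noise * np.eye(len(x))
    _, logdet_prior = np.linalg.slogdet(k_xx + noise_cov)
    _, logdet_cond = np.linalg.slogdet(cond + noise_cov)
    return 0.5 * (logdet_prior - logdet_cond)


def greedy_select(candidates, z, n_points, noise=1e-2):
    """Greedily pick candidate indices that maximize I(y_x; f_z)."""
    chosen = []
    for _ in range(n_points):
        best_idx, best_mi = None, -np.inf
        for i in range(len(candidates)):
            if i in chosen:
                continue
            mi = mutual_information(candidates[chosen + [i]], z, noise)
            if mi > best_mi:
                best_idx, best_mi = i, mi
        chosen.append(best_idx)
    return chosen


# Hypothetical setup: reachable inputs on [-5, 5], region of interest [-1, 1].
candidates = np.linspace(-5.0, 5.0, 21)[:, None]
region = np.linspace(-1.0, 1.0, 5)[:, None]
chosen = greedy_select(candidates, region, n_points=3)
```

Under these assumptions, a candidate far from the region of interest is nearly uncorrelated with the reference points and therefore scores low, which is the behavior that distinguishes a localized criterion from a purely entropy-based one.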
