Sample-and-computation-efficient Probabilistic Model Predictive Control with Random Features

Gaussian process (GP) based Reinforcement Learning (RL) methods combined with Model Predictive Control (MPC) have demonstrated excellent sample efficiency. However, because the computational cost of GP prediction grows with the training sample size, learning accurate dynamics with GPs results in a low control frequency in MPC. To alleviate this trade-off and achieve both sample and computation efficiency, we propose a novel model-based RL method with MPC. Our approach employs a linear Gaussian model with randomized features, generated using Fastfood, as an approximation of the GP dynamics. We then derive an analytic moment-matching scheme for state prediction with this model under uncertain inputs. As a result, the computational cost of the MPC in our RL method does not depend on the training sample size, and the control frequency can be improved over previous methods. Experiments on simulated and real robot control tasks demonstrate both the sample efficiency and the computation efficiency of our model-based RL method.
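To make the idea of analytic prediction under uncertain inputs concrete, the following is a minimal sketch and not the paper's implementation. It assumes plain random Fourier features of the form sqrt(2/D) cos(w·x + b) rather than Fastfood, a Gaussian input with diagonal covariance, and randomly drawn placeholder weights standing in for a learned posterior mean; all function and variable names are hypothetical. It relies on the standard identity E[cos(w·x + b)] = exp(-0.5 wᵀΣw) cos(w·μ + b) for x ~ N(μ, Σ), and compares the closed-form predictive mean against a Monte Carlo estimate. Only the predictive mean is shown; the predictive variance is omitted.

```python
import math
import random


def expected_cos_feature(w, b, mu, var):
    """Closed-form E[cos(w.x + b)] for x ~ N(mu, diag(var))."""
    proj_mean = sum(wi * mi for wi, mi in zip(w, mu)) + b
    proj_var = sum(wi * wi * vi for wi, vi in zip(w, var))
    return math.exp(-0.5 * proj_var) * math.cos(proj_mean)


def predictive_mean(weights_mean, W, b, mu, var):
    """Mean of f(x) = sum_j m_j * sqrt(2/D) * cos(w_j.x + b_j) under uncertain x.

    The cost is O(D * dim): it depends on the number of features D,
    not on how many training samples were used to fit weights_mean.
    """
    scale = math.sqrt(2.0 / len(W))
    return sum(m * scale * expected_cos_feature(w, bj, mu, var)
               for m, w, bj in zip(weights_mean, W, b))


def monte_carlo_mean(weights_mean, W, b, mu, var, n, rng):
    """Sampling-based estimate of the same quantity, used only as a check."""
    scale = math.sqrt(2.0 / len(W))
    total = 0.0
    for _ in range(n):
        x = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
        total += sum(
            mj * scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bj)
            for mj, w, bj in zip(weights_mean, W, b))
    return total / n


rng = random.Random(0)
D, dim = 20, 2                                    # features, input dimension
W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(D)]
b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
weights_mean = [rng.gauss(0.0, 1.0) for _ in range(D)]  # placeholder, not learned

mu, var = [0.5, -1.0], [0.3, 0.2]                 # uncertain input N(mu, diag(var))
analytic = predictive_mean(weights_mean, W, b, mu, var)
sampled = monte_carlo_mean(weights_mean, W, b, mu, var, 20000, rng)
print("analytic:", analytic, "monte carlo:", sampled)
```

The closed-form expression replaces the sampling loop with a single pass over the D features, which illustrates why prediction cost in such a model is decoupled from the training sample size.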
