Dual Control with Active Learning using Gaussian Process Regression

In many real-world problems, control decisions have to be made with limited information. The controller may have no a priori (or even a posteriori) data on the nonlinear system, except for a limited number of points that are obtained over time. This is due either to the high cost of observation or to the highly non-stationary nature of the system. The resulting conflict between information collection (identification, exploration) and control (optimization, exploitation) necessitates an active learning approach that iteratively selects control actions which concurrently provide the data points for system identification. This paper presents a dual control approach in which the information acquired at each control step is quantified using the entropy measure from information theory and serves as the training input to a state-of-the-art Gaussian process regression (Bayesian learning) method. The explicit quantification of the information obtained from each data point allows for iterative optimization of both identification and control objectives. The approach is illustrated with two examples: control of the logistic map as a chaotic system and position control of a cart with an inverted pendulum.
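The trade-off described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the paper's algorithm. It assumes a logistic-map plant with an additive control input, clipped to [0, 1] to keep the state bounded, and a zero-mean Gaussian process with a squared-exponential kernel and fixed hyperparameters. It also assumes a one-step cost that subtracts a weighted predictive entropy from the squared tracking error. All function names, parameter values, and the cost form are hypothetical choices made for illustration.

```python
import numpy as np

LENGTH_SCALE = 0.3   # assumed kernel length scale
SIGNAL_VAR = 1.0     # assumed prior signal variance
NOISE_VAR = 1e-4     # assumed observation noise variance


def rbf_kernel(A, B):
    """Squared-exponential kernel between row-wise inputs A (n, d) and B (m, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return SIGNAL_VAR * np.exp(-0.5 * sq_dist / LENGTH_SCALE**2)


def gp_predict(X_train, y_train, X_query):
    """Posterior mean and variance of a zero-mean GP at each row of X_query."""
    K = rbf_kernel(X_train, X_train) + NOISE_VAR * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = SIGNAL_VAR - (v ** 2).sum(axis=0)
    return mean, np.maximum(var, 1e-12)


def gaussian_entropy(var):
    """Differential entropy of a univariate Gaussian with variance var."""
    return 0.5 * np.log(2.0 * np.pi * np.e * var)


def logistic_plant(x, u, r=3.9):
    """Assumed plant: logistic map plus additive control, clipped to [0, 1]."""
    return float(np.clip(r * x * (1.0 - x) + u, 0.0, 1.0))


def choose_control(X_train, y_train, x, target, candidates, info_weight):
    """Pick the candidate input that trades tracking error against information."""
    queries = np.column_stack([np.full_like(candidates, x), candidates])
    mean, var = gp_predict(X_train, y_train, queries)
    # Low cost = predicted state near the target AND an uncertain (informative) query.
    cost = (mean - target) ** 2 - info_weight * gaussian_entropy(var)
    return candidates[np.argmin(cost)]


def run_dual_control(steps=30, target=0.5, info_weight=0.01, seed=0):
    """Closed loop: every control step also yields one GP training point."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(-0.5, 0.5, 41)
    x = 0.2
    u = rng.choice(candidates)            # one random action to seed the model
    X_train = np.array([[x, u]])
    y_train = np.array([logistic_plant(x, u)])
    x = y_train[0]
    history = [x]
    for _ in range(steps):
        u = choose_control(X_train, y_train, x, target, candidates, info_weight)
        x_next = logistic_plant(x, u)
        X_train = np.vstack([X_train, [x, u]])
        y_train = np.append(y_train, x_next)
        x = x_next
        history.append(x)
    return np.array(history)


if __name__ == "__main__":
    print(run_dual_control()[-5:])
```

Setting `info_weight` to zero reduces the rule to pure exploitation of the current model, while larger values push the controller toward inputs where the model's predictive entropy is high.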
