Learning nonlinear state-space models for control

This paper studies the learning of nonlinear state-space models for a control task. Compared with traditional methods, variational Bayesian learning provides a framework in which uncertainty is taken into account explicitly and system identification can be combined with model-predictive control. Three different control schemes are used. One of them, optimistic inference control, is a novel method based directly on the probabilistic modelling. Simulations with a cart-pole swing-up task confirm that the latent state space provides a representation that is easier to predict and control than the original observation space.
