Paper information: Learning nonlinear state-space models for control

Learning nonlinear state-space models for control

This paper studies the learning of nonlinear state-space models for a control task. This has some advantages over traditional methods. Variational Bayesian learning provides a framework where uncertainty is explicitly taken into account and system identification can be combined with model-predictive control. Three different control schemes are used. One of them, optimistic inference control, is a novel method based directly on the probabilistic modelling. Simulations with a cart-pole swing-up task confirm that the latent state space provides a representation that is easier to predict and control than the original observation space.

T. Raiko | M. Tornio

[1] G. Sutton, et al. The variation of hand tremor with force in healthy subjects, 1967, The Journal of Physiology.

[2] Y. Bar-Shalom. Stochastic dynamic programming: Caution and probing, 1981.

[3] Sebastian Thrun, et al. The role of exploration in learning control, 1992.

[4] Petros G. Voulgaris, et al. On optimal ℓ∞ to ℓ∞ filtering, 1995, Automatica.

[5] Peter J. Gawthrop, et al. Neural Networks for Modelling and Control, 1997.

[6] Simon Haykin, et al. Neural Networks: A Comprehensive Foundation, 1998.

[7] Shigenobu Kobayashi, et al. Efficient Non-Linear Control by Combining Q-learning with Local Linear Controllers, 1999, ICML.

[8] Stephen J. Wright, et al. Numerical Optimization, 1999.

[9] Kenji Doya, et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex?, 1999, Neural Networks.

[10] Niels Kjølstad Poulsen, et al. Neural Networks for Modelling and Control of Dynamic Systems: A Practitioner’s Handbook, 2000.

[11] J. Nazuno. Haykin, Simon. Neural networks: A comprehensive foundation, Prentice Hall, Inc. Second edition, 1999, 2000.

[12] Karl Johan Åström, et al. Control of complex systems, 2001.

[13] Zoubin Ghahramani, et al. The variational Kalman smoother, 2001.

[14] Stephen P. Boyd, et al. Future directions in control in an information-rich world, 2003.

[15] T. Başar, et al. A New Approach to Linear Filtering and Prediction Problems, 2001.

[16] Juha Karhunen, et al. An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models, 2002, Neural Computation.

[17] J. Kocijan, et al. Predictive control with Gaussian process models, 2003, The IEEE Region 8 EUROCON 2003. Computer as a Tool.

[18] A. Pacut, et al. Model-free off-policy reinforcement learning in continuous environment, 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).

[19] F. Rosenqvist, et al. Realisation and estimation of piecewise-linear output-error models, 2005, Automatica.

[20] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.