Paper information - Efficient reinforcement learning using Gaussian processes

Efficient reinforcement learning using Gaussian processes

This book examines Gaussian processes in both model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO takes model uncertainties consistently into account during long-term planning to reduce model bias. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems.
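To make the role of model uncertainty concrete, the following is a minimal sketch of Gaussian process regression on a toy one-dimensional transition function. It is not the PILCO implementation: the squared-exponential kernel, the fixed hyperparameters, the toy data, and the names `se_kernel` and `gp_predict` are all assumptions chosen for illustration. It only shows that the predictive variance stays small near observed data and grows back toward the prior variance far from it, which is the quantity a planner can take into account instead of trusting a single point estimate.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_predict(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = se_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)                        # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.diag(se_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, var

# Hypothetical toy data: transitions observed only for states in [-1, 1].
x_train = np.linspace(-1.0, 1.0, 8)
y_train = np.sin(2.0 * x_train)

# One query inside the observed region, one far outside it.
x_test = np.array([0.0, 5.0])
mean, var = gp_predict(x_train, y_train, x_test)
print("predictive mean:", mean)
print("predictive variance:", var)
```

The second query lies far from every training input, so its variance is close to the prior signal variance, while the first query is tightly constrained by nearby observations.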

Marc Peter Deisenroth
