Fast Particle Filters and Their Applications to Adaptive Control in Change-Point ARX Models and Robotics

The Kalman filter has provided an efficient and elegant solution to control problems in linear stochastic systems. For nonlinear stochastic systems, control problems become much more difficult, and a large part of the literature resorts to linear approximations so that an "extended Kalman filter" or a "mixture of Kalman filters" can be used in place of the Kalman filter for linear systems. Because these linear approximations are local expansions around the estimated states, they may perform poorly when the true state differs substantially from its estimate. Substantial progress on the filtering problem was made in the past decade with the development of particle filters. This development offers promise for solving some long-standing control problems, which we consider in this chapter.

As noted by Ljung & Gunnarsson (1990), a parameterized description of a dynamic system that is convenient for identification is to specify the model's prediction of the output y_t as a function of the parameter vector θ and of the past inputs u_s and outputs y_s, for s < t. When the function is linear in θ, this yields the regression model y_t = θ^T φ_t + ε_t, which includes as a special case the ARX model (autoregressive model with exogenous inputs) that is widely used in control and signal processing. Here the regressor vector is
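The regression form of an ARX model can be illustrated with a minimal sketch. It assumes the conventional ARX regressor, which stacks the k most recent outputs and the h most recent inputs; the lag orders k and h, the function names, and the numerical values are illustrative assumptions and are not taken from the chapter.

```python
def arx_regressor(y, u, t, k, h):
    """Assumed regressor: (y[t-1], ..., y[t-k], u[t-1], ..., u[t-h])."""
    if t < max(k, h):
        raise ValueError("need at least max(k, h) past samples")
    past_outputs = [y[t - i] for i in range(1, k + 1)]
    past_inputs = [u[t - j] for j in range(1, h + 1)]
    return past_outputs + past_inputs


def predict(theta, phi):
    """Model prediction: inner product of parameter and regressor vectors."""
    return sum(a * b for a, b in zip(theta, phi))


def simulate_arx(theta, u, k, h, noise=None):
    """Generate outputs from the regression model; noise defaults to zero."""
    start = max(k, h)
    y = [0.0] * len(u)  # initial outputs are set to zero (an assumption)
    for t in range(start, len(u)):
        eps = noise[t] if noise is not None else 0.0
        y[t] = predict(theta, arx_regressor(y, u, t, k, h)) + eps
    return y


# Hypothetical example: two output lags, one input lag, constant input.
theta = [0.5, -0.2, 1.0]
u = [1.0] * 6
y = simulate_arx(theta, u, k=2, h=1)
phi_4 = arx_regressor(y, u, 4, 2, 1)
```

With zero noise, each output equals the model's prediction exactly, so the gap between `y[t]` and `predict(theta, arx_regressor(y, u, t, k, h))` isolates the noise term once one is supplied.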

[1] Graham C. Goodwin, et al. Adaptive filtering prediction and control, 1984.

[2] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.

[3] Jan Peters. Policy gradient methods, 2010, Scholarpedia.

[4] Wolfram Burgard, et al. Fast and accurate SLAM with Rao-Blackwellized particle filters, 2007, Robotics Auton. Syst.

[5] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.

[6] N. Gordon, et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation, 1993.

[7] Yuguo Chen, et al. Identification and Adaptive Control of Change-Point ARX Models Via Rao-Blackwellized Particle Filters, 2007, IEEE Transactions on Automatic Control.

[8] Lei Guo, et al. Adaptive Control with Recursive Identification for Stochastic Linear Systems, 1987.

[9] Han-Fu Chen, et al. The Åström-Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers, 1991.

[10] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.

[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[12] John N. Tsitsiklis, et al. Regression methods for pricing complex American-style options, 2001, IEEE Trans. Neural Networks.

[13] Sebastian Thrun, et al. Probabilistic robotics, 2002, CACM.

[14] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.

[15] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[16] P. Ramadge, et al. Discrete time stochastic adaptive control, 1979, 1979 18th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.

[17] Tze Leung Lai, et al. Approximate Policy Optimization and Adaptive Control in Regression Models, 2005.

[18] T. Lai, et al. Asymptotically efficient self-tuning regulators, 1987.

[19] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[20] L. Ljung, et al. Exponential stability of general tracking algorithms, 1995, IEEE Trans. Autom. Control.

[21] Wolfram Burgard, et al. Improving Grid-based SLAM with Rao-Blackwellized Particle Filters by Adaptive Proposals and Selective Resampling, 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.

[22] Jun S. Liu, et al. Sequential Imputations and Bayesian Missing Data Problems, 1994.

[23] Lennart Ljung, et al. System Identification: Theory for the User, 1987.

[24] Lennart Ljung, et al. Adaptation and tracking in system identification - A survey, 1990, Autom.

[25] Yi-Ching Yao. Estimation of a Noisy Discrete-Time Step Function: Bayes and Empirical Bayes Approaches, 1984.

[26] Sean P. Meyn, et al. Model reference adaptive control of time varying and stochastic systems, 1993, IEEE Trans. Autom. Control.

[27] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.

[28] D. S. Bayard. A forward method for optimal stochastic nonlinear and adaptive control, 1991.

[29] Lennart Ljung, et al. Performance analysis of general tracking algorithms, 1994, Proceedings of 1994 33rd IEEE Conference on Decision and Control.

[30] Lennart Ljung, et al. Theory and Practice of Recursive Identification, 1983.