Implicit dual control based on particle filtering and forward dynamic programming

This paper develops a sampling-based approach to implicit dual control. Implicit dual control methods synthesize stochastic control policies by systematically approximating Bellman's stochastic dynamic programming equations. Explicit dual control methods, by contrast, artificially induce probing in the control law by adding a term to the cost function that rewards learning. The proposed approach is novel in that it combines a particle filter with a policy-iteration method for forward dynamic programming, and integrating the two yields a complete sampling-based approach to the problem. Implementation is simplified by a specific architecture denoted the H-block, and practical suggestions are given for reducing the computational load within the H-block for real-time applications. As an example, the method is applied to the control of a stochastic pendulum model with unknown mass, length, initial position and velocity, and unknown sign of its DC gain. Simulation results indicate that active controllers based on the described method can systematically improve closed-loop performance relative to other, more common stochastic control approaches.
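To make the particle-filter ingredient concrete, the following is a minimal sketch of a generic bootstrap particle filter that jointly tracks a state and an unknown gain whose sign is not known in advance. It is not the paper's algorithm or its H-block architecture: the scalar model x' = a*x + b*u + w, y = x' + v, the noise levels, the alternating probing input, and the names `bootstrap_pf_step` and `run_demo` are all assumptions chosen for illustration.

```python
import math
import random


def bootstrap_pf_step(particles, u, y, rng, a=0.9, q_std=0.1, r_std=0.1):
    """One predict/weight/resample step of a bootstrap particle filter.

    Each particle is a pair (x, b): a state sample and a sample of the
    unknown input gain b. The model (an assumption for this sketch) is
    x' = a*x + b*u + w and y = x' + v with Gaussian w and v.
    """
    # Predict: propagate each state sample; the gain b is static.
    predicted = [(a * x + b * u + rng.gauss(0.0, q_std), b)
                 for x, b in particles]
    # Weight: Gaussian likelihood of the measurement y for each particle.
    weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x, _ in predicted]
    if sum(weights) == 0.0:
        # Guard against numerical underflow: fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: multinomial resampling in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def run_demo(true_b=-1.0, n_particles=500, n_steps=30, seed=0):
    """Simulate the assumed plant and filter it; return the final particles."""
    rng = random.Random(seed)
    # Prior: state ~ N(0, 1); gain uniform on [-2, 2], so its sign is unknown.
    particles = [(rng.gauss(0.0, 1.0), rng.uniform(-2.0, 2.0))
                 for _ in range(n_particles)]
    x = 0.5  # true initial state, hidden from the filter
    for k in range(n_steps):
        u = 1.0 if k % 2 == 0 else -1.0  # alternating input excites the gain
        x = 0.9 * x + true_b * u + rng.gauss(0.0, 0.1)
        y = x + rng.gauss(0.0, 0.1)
        particles = bootstrap_pf_step(particles, u, y, rng)
    return particles


if __name__ == "__main__":
    final = run_demo()
    b_mean = sum(b for _, b in final) / len(final)
    print("posterior mean of gain b:", b_mean)
```

In this sketch the input sequence is fixed in advance, so any learning about the gain is incidental. A dual controller would instead choose each input by weighing its control cost against the information it is expected to yield about the unknown parameters.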
