Adaptive control of Markov chains, I: Finite parameter set

Consider a controlled Markov chain whose transition probabilities depend upon an unknown parameter α taking values in a finite set A. To each α is associated a prespecified stationary control law φ(α). The adaptive control law selects at each time t the control action indicated by φ(α_t), where α_t is the maximum likelihood estimate of α. It is shown that α_t converges to a parameter α* such that the "closed-loop" transition probabilities corresponding to α* and φ(α*) are the same as those corresponding to α^0 and φ(α*), where α^0 is the true parameter. The situation when α^0 does not belong to the model set A is briefly discussed.
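
The adaptive scheme described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, not taken from the paper: the two-state, two-action chain, the parameter labels "a" and "b", the transition tables in P, the control laws in PHI, and the helper names mle and run. At each step the controller computes the maximum likelihood estimate over the finite set, applies the action prescribed by the control law for that estimate, observes the next state, and updates every candidate's log-likelihood.

```python
import math
import random

# Hypothetical two-state, two-action chain (illustrative numbers only).
# P[alpha][u][x][y] = probability of moving from state x to state y
# under action u when the parameter equals alpha.
P = {
    "a": [[[0.9, 0.1], [0.2, 0.8]],   # action 0
          [[0.5, 0.5], [0.5, 0.5]]],  # action 1
    "b": [[[0.6, 0.4], [0.3, 0.7]],   # action 0
          [[0.1, 0.9], [0.7, 0.3]]],  # action 1
}

# Prespecified stationary control laws phi(alpha): state -> action.
PHI = {"a": [0, 0], "b": [1, 1]}


def mle(loglik):
    """Return the parameter with the largest log-likelihood.

    Ties are broken in favour of the first key in the dictionary.
    """
    return max(loglik, key=loglik.get)


def run(true_alpha, steps, seed=0):
    """Simulate the adaptive law and return the sequence of estimates."""
    rng = random.Random(seed)
    loglik = {alpha: 0.0 for alpha in P}  # log-likelihood of each candidate
    x = 0                                 # initial state (assumed)
    estimates = []
    for _ in range(steps):
        alpha_t = mle(loglik)             # current maximum likelihood estimate
        u = PHI[alpha_t][x]               # action indicated by phi(alpha_t)
        # Sample the next state from the true transition probabilities.
        y = 0 if rng.random() < P[true_alpha][u][x][0] else 1
        # Update each candidate with the observed transition (x, u) -> y.
        for alpha in P:
            loglik[alpha] += math.log(P[alpha][u][x][y])
        estimates.append(alpha_t)
        x = y
    return estimates


if __name__ == "__main__":
    history = run(true_alpha="b", steps=200, seed=1)
    print("final estimate:", history[-1])
```

Note that the limit α* need not equal α^0: the likelihood only sees transitions generated under the actions actually applied, so two parameters that agree under φ(α*) cannot be told apart by this scheme.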
