
Adaptive control of a partially observed controlled Markov chain

We consider an adaptive finite state controlled Markov chain with partial state information, motivated by a class of replacement problems. We present parameter estimation techniques based on the information available after actions that reset the state to a known value are taken. We prove that the parameter estimates converge w.p.1 to the true (unknown) parameter, under the feedback structure induced by a certainty equivalent adaptive policy. We also show that the adaptive policy is self-optimizing, in a long-run average sense, for any (measurable) sequence of parameter estimates converging w.p.1 to the true parameter.
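
The idea of estimating a parameter only from what is seen after a reset can be illustrated with a minimal sketch. It is not the estimator from the paper: it assumes a hypothetical two-state replacement model in which a reset puts the chain in a known "good" state, the state one step later is observed exactly, and the unknown parameter `true_theta` is the one-step probability of deterioration. The function name and all parameters are illustrative assumptions.

```python
import random


def estimate_theta(true_theta, n_resets, seed=0):
    """Running empirical estimate of a deterioration probability.

    Hypothetical model (an assumption, not taken from the paper): each
    reset returns the chain to a known "good" state, and one step later
    the state is "bad" with probability true_theta and is observed exactly.
    """
    rng = random.Random(seed)
    failures = 0
    estimates = []
    for k in range(1, n_resets + 1):
        # Outcome of the single transition that follows the k-th reset.
        failed = rng.random() < true_theta
        failures += int(failed)
        # Fraction of post-reset transitions that ended in the "bad" state.
        estimates.append(failures / k)
    return estimates


estimates = estimate_theta(true_theta=0.3, n_resets=20000)
print(estimates[-1])
```

Because every sample starts from the same known state, the post-reset outcomes in this toy model are independent and identically distributed, so the running average converges to `true_theta` with probability 1 by the strong law of large numbers. The setting studied in the paper is harder, since resets are triggered by the adaptive policy itself and observations are only partial.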

Steven I. Marcus | Aristotle Arapostathis | Emmanuel Fernández-Gaucherand

[1] A. Arapostathis,et al. On the adaptive control of a partially observable binary Markov decision process , 2022 .

[2] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .

[3] P. Mandl,et al. Estimation and control in Markov chains , 1974, Advances in Applied Probability.

[4] O. Hernández-Lerma. Adaptive Markov Control Processes , 1989 .

[5] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .

[6] A. Arapostathis,et al. On the adaptive control of a partially observable Markov decision process , 1988, Proceedings of the 27th IEEE Conference on Decision and Control.

[7] O. Hernández-Lerma,et al. Adaptive Markov Control Processes , 1989 .

[8] Ari Arapostathis,et al. On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes , 1991, Ann. Oper. Res..

[9] Chelsea C. White,et al. A Markov Quality Control Process Subject to Partial Observation , 1977 .

[10] Ari Arapostathis,et al. Analysis of an identification algorithm arising in the adaptive estimation of Markov chains , 1985, 1985 24th IEEE Conference on Decision and Control.

[11] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .

[12] A. Arapostathis,et al. Analysis of an adaptive control scheme for a partially observed controlled Markov chain , 1990, 29th IEEE Conference on Decision and Control.

[13] Armand M. Makowski,et al. Comparing Policies in Markov Decision Processes: Mandl's Lemma Revisited , 1990, Math. Oper. Res..

[14] H. Mine,et al. An Optimal Inspection and Replacement Policy under Incomplete State Information: Average Cost Criterion , 1984 .

[15] M. K. Ghosh,et al. Discrete-time controlled Markov processes with average cost criterion: a survey , 1993 .

[16] Ari Arapostathis,et al. Analysis of an adaptive control scheme for a partially observed controlled Markov chain , 1990 .

[17] Subhash Kak,et al. Advances in Computing and Control , 1989 .

[18] Shunji Osaki,et al. Stochastic Models in Reliability Theory , 1984 .

[19] V. Nollau. Kushner, H. J./Clark, D. S., Stochastic Approximation Methods for Constrained and Unconstrained Systems. (Applied Mathematical Sciences 26). Berlin‐Heidelberg‐New York, Springer‐Verlag 1978. X, 261 S., 4 Abb., DM 26,40. US $ 13.20 , 1980 .

[20] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[21] Harold J. Kushner. An averaging method for stochastic approximations with discontinuous dynamics, constraints, and state dependent noise , 1983 .