Paper information - Analysis of an adaptive control scheme for a partially observed controlled Markov chain

Analysis of an adaptive control scheme for a partially observed controlled Markov chain

The authors consider an adaptive finite state controlled Markov chain with partial state information, motivated by a class of replacement problems. They present parameter estimation techniques based on the information available after actions that reset the state to a known value are taken. It is proved that the parameter estimates converge w.p.1 to the true (unknown) parameter, under the feedback structure induced by a certainty equivalent adaptive policy. It is shown that the adaptive policy is self-optimizing in a long-run average sense, for any (measurable) sequence of parameter estimates converging w.p.1 to the true parameter.
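The idea of estimating a parameter from what is observed right after a reset can be illustrated with a toy model. The sketch below is not the paper's model or estimator; it assumes a hypothetical two-state machine (0 = good, 1 = bad) that, one step after a reset to the good state, is bad with unknown probability `theta`, and a noisy observation that reports the true state with known accuracy `q`. The names `simulate_post_reset_observations` and `estimate_theta` are illustrative only, and the estimator is a plain method-of-moments inversion.

```python
import random


def simulate_post_reset_observations(theta, q, n_resets, seed=0):
    """Simulate the observation seen one step after each reset (toy model).

    Assumptions (not from the paper): a reset puts the state at 0 (good);
    after one transition the state is 1 (bad) with probability theta;
    the observation equals the state with probability q, else it is flipped.
    """
    rng = random.Random(seed)
    observations = []
    for _ in range(n_resets):
        state = 1 if rng.random() < theta else 0
        obs = state if rng.random() < q else 1 - state
        observations.append(obs)
    return observations


def estimate_theta(post_reset_obs, q):
    """Method-of-moments estimate of theta from post-reset observations.

    Under the toy model, P(obs = 1) = theta * q + (1 - theta) * (1 - q),
    so theta = (P(obs = 1) - (1 - q)) / (2 * q - 1) whenever q != 0.5.
    Returns None when no post-reset data is available yet.
    """
    if q == 0.5:
        raise ValueError("q = 0.5 makes observations uninformative")
    if not post_reset_obs:
        return None
    freq = sum(post_reset_obs) / len(post_reset_obs)
    theta_hat = (freq - (1.0 - q)) / (2.0 * q - 1.0)
    # Clip to [0, 1] because sampling noise can push the raw value outside.
    return min(1.0, max(0.0, theta_hat))


if __name__ == "__main__":
    data = simulate_post_reset_observations(theta=0.2, q=0.9, n_resets=50000)
    print(estimate_theta(data, q=0.9))
```

In this toy setting the estimate approaches `theta` as the number of resets grows, by the law of large numbers. That holds only as long as resets keep occurring, which is why the feedback structure induced by the policy matters for convergence.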

Ari Arapostathis | Steven I. Marcus | E. Fernandez-Gaucherand

[1] A. Arapostathis, et al. On the adaptive control of a partially observable binary Markov decision process, 2022.

[2] Armand M. Makowski, et al. Comparing Policies in Markov Decision Processes: Mandl's Lemma Revisited, 1990, Math. Oper. Res.

[3] A. Arapostathis, et al. On the adaptive control of a partially observable Markov decision process, 1988, Proceedings of the 27th IEEE Conference on Decision and Control.

[4] O. Hernández-Lerma, et al. Adaptive Markov Control Processes, 1989.

[5] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.

[6] Chelsea C. White, et al. A Markov Quality Control Process Subject to Partial Observation, 1977.

[7] Ari Arapostathis, et al. Analysis of an identification algorithm arising in the adaptive estimation of Markov chains, 1985, 1985 24th IEEE Conference on Decision and Control.

[8] H. Mine, et al. An Optimal Inspection and Replacement Policy under Incomplete State Information: Average Cost Criterion, 1984.

[9] Ari Arapostathis, et al. On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes, 1991, Ann. Oper. Res.

[10] A. Arapostathis, et al. Analysis of an adaptive control scheme for a partially observed controlled Markov chain, 1990, 29th IEEE Conference on Decision and Control.

[11] A. Arapostathis, et al. On partially observable Markov decision processes with an average cost criterion, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.

[12] Harold J. Kushner, et al. Stochastic approximation methods for constrained and unconstrained systems, 1978.

[13] J.. An averaging method for stochastic approximations with discontinuous dynamics, constraints, and state dependent noise, 2022.

[14] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information, 1965.

[15] Pravin Varaiya, et al. Stochastic Systems: Estimation, Identification, and Adaptive Control, 1986.

[16] P. Mandl, et al. Estimation and control in Markov chains, 1974, Advances in Applied Probability.