On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
Ari Arapostathis | Steven I. Marcus | Emmanuel Fernández-Gaucherand
[1] R. Bellman. A Markovian Decision Process, 1957.
[2] Richard Bellman, et al. Adaptive Control Processes: A Guided Tour, 1961, The Mathematical Gazette.
[3] Robert Bartle, et al. The Elements of Real Analysis, 1977, The Mathematical Gazette.
[4] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information, 1965.
[5] H. M. Taylor. Markovian sequential replacement processes, 1965.
[6] Onésimo Hernández-Lerma, et al. Controlled Markov Processes, 1965.
[7] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information II. The convexity of the loss function, 1969.
[8] S. Ross. Arbitrary State Markovian Decision Processes, 1968.
[9] T. Yoshikawa, et al. Discrete-Time Markovian Decision Processes with Incomplete State Observation, 1970.
[10] Edward J. Sondik, et al. The optimal control of partially observable Markov processes, 1971.
[11] S. Ross. Quality Control under Markovian Deterioration, 1971.
[12] L. G. Gubenko, et al. On discrete time Markov decision processes, 1972.
[13] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[14] Evan L. Porteus. On the Optimality of Structured Policies in Countable Stage Decision Processes, 1975.
[15] Robert C. Wang. Computing optimal quality control policies — two actions, 1976.
[16] Jacob Wijngaard, et al. Stationary Markovian Decision Problems and Perturbation Theory of Quasi-Compact Linear Operators, 1977, Math. Oper. Res.
[17] Chelsea C. White, et al. A Markov Quality Control Process Subject to Partial Observation, 1977.
[18] Evan L. Porteus, et al. On the Optimality of Structured Policies in Countable Stage Decision Processes. II: Positive and Negative Problems, 1977.
[19] Robert C. Wang, et al. Optimal Replacement Policy with Unobservable States, 1977.
[20] J. P. Georgin, et al. Estimation et contrôle des chaînes de Markov sur des espaces arbitraires, 1978.
[21] K. M. van Hee, et al. Bayesian control of Markov chains, 1978.
[22] C. White. Optimal Inspection and Repair of a Production Process Subject to Deterioration, 1978.
[23] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[24] C. White. Optimal control-limit strategies for a partially observed replacement problem, 1979.
[25] S. Christian Albright, et al. Structural Results for Partially Observable Markov Decision Processes, 1979, Oper. Res.
[26] C. White. Bounds on optimal cost for a replacement problem with partial observations, 1979.
[27] L. Thomas. Connectedness conditions used in finite state Markov Decision Processes, 1979.
[28] C. White. Monotone control laws for noisy, countable-state Markov chains, 1980.
[29] Sheldon M. Ross. Stochastic Processes, 1983.
[30] Daniel P. Heyman, et al. Stochastic models in operations research, 1982.
[31] Evan L. Porteus. Conditions for characterizing the structure of optimal strategies in infinite-horizon dynamic programs, 1982.
[32] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982.
[33] H. Mine, et al. An Optimal Inspection and Replacement Policy under Incomplete State Information: Average Cost Criterion, 1984.
[34] D. J. White, et al. Real Applications of Markov Decision Processes, 1985.
[35] Ari Arapostathis, et al. Analysis of an identification algorithm arising in the adaptive estimation of Markov chains, 1985, 24th IEEE Conference on Decision and Control.
[36] Hajime Kawai, et al. An optimal inspection and replacement policy under incomplete state information, 1986.
[37] Pravin Varaiya, et al. Stochastic Systems: Estimation, Identification, and Adaptive Control, 1986.
[38] Masami Kurano. Markov Decision Processes with a Borel Measurable Cost Function - The Average Case, 1986, Math. Oper. Res.
[39] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[40] William S. Lovejoy. Technical Note - On the Convexity of Policy Regions in Partially Observed Systems, 1987, Oper. Res.
[41] William S. Lovejoy, et al. Some Monotonicity Results for Partially Observed Markov Decision Processes, 1987, Oper. Res.
[42] S. Marcus, et al. Adaptive control of Markov processes with incomplete state information and unknown parameters, 1987.
[43] D. J. White, et al. Further Real Applications of Markov Decision Processes, 1988.
[44] Shaler Stidham, et al. Scheduling, Routing, and Flow Control in Stochastic Networks, 1988.
[45] Charles H. Fine. A Quality Control Model with Learning Effects, 1988, Oper. Res.
[46] W. Hopp, et al. Multiaction maintenance under Markovian deterioration and incomplete state information, 1988.
[47] R. Cavazos-Cadena. Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains, 1988.
[48] A. Arapostathis, et al. On the adaptive control of a partially observable Markov decision process, 1988, Proceedings of the 27th IEEE Conference on Decision and Control.
[49] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.
[50] R. Cavazos-Cadena. Necessary conditions for the optimality equation in average-reward Markov decision processes, 1989.
[51] Linn I. Sennott, et al. Average Cost Optimal Stationary Policies in Infinite State Markov Decision Processes with Unbounded Costs, 1989, Oper. Res.
[52] A. Arapostathis, et al. On partially observable Markov decision processes with an average cost criterion, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[53] M. Kurano. The existence of a minimum pair of state and policy for Markov decision processes under the hypothesis of Doeblin, 1989.
[54] V. Borkar. Control of Markov chains with long-run average cost criterion: the dynamic programming equations, 1989.
[55] Steven I. Marcus, et al. Ergodic control of Markov chains, 1990, 29th IEEE Conference on Decision and Control.
[56] O. Hernández-Lerma, et al. Average cost optimal policies for Markov control processes with Borel state space and unbounded costs, 1990.
[57] A. Arapostathis, et al. Remarks on the existence of solutions to the average cost optimality equation in Markov decision processes, 1991.
[58] Masami Kurano. Average cost Markov decision processes under the hypothesis of Doeblin, 1991, Ann. Oper. Res.
[59] O. Hernández-Lerma, et al. Recurrence conditions for Markov decision processes with Borel state space: A survey, 1991.
[60] A. Arapostathis, et al. On the adaptive control of a partially observable binary Markov decision process, 2022.