Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes

This paper studies several average-cost criteria for Markov control processes on Borel spaces with possibly unbounded costs. Under suitable hypotheses we show (i) the existence of a sample-path average cost (SPAC-) optimal stationary policy; (ii) a stationary policy is SPAC-optimal if and only if it is expected average cost (EAC-) optimal; and (iii) within the class of stationary SPAC-optimal (equivalently, EAC-optimal) policies there exists one with a minimal limiting average variance.
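For orientation, the three criteria named in the abstract can be written in a standard form. The notation below (state x_t, action a_t, one-stage cost c, policy \pi, initial distribution \nu) is an assumption chosen for illustration and is not taken from this page; the paper's own definitions may differ in detail.

```latex
% Illustrative notation only (assumed, not quoted from the paper):
% x_t = state, a_t = action, c = one-stage cost,
% \pi = policy, \nu = initial distribution.
\begin{align*}
J(\pi,\nu) &:= \limsup_{n\to\infty} \frac{1}{n}\,
  E_\nu^\pi\!\left[\sum_{t=0}^{n-1} c(x_t,a_t)\right]
  && \text{(expected average cost, EAC)} \\
J_0(\pi,\nu) &:= \limsup_{n\to\infty} \frac{1}{n}
  \sum_{t=0}^{n-1} c(x_t,a_t)
  && \text{(sample-path average cost, SPAC; a random variable)} \\
V(\pi,\nu) &:= \limsup_{n\to\infty} \frac{1}{n}\,
  E_\nu^\pi\!\left[\left(\sum_{t=0}^{n-1} c(x_t,a_t)
  - E_\nu^\pi\!\left[\sum_{t=0}^{n-1} c(x_t,a_t)\right]\right)^{2}\right]
  && \text{(limiting average variance)}
\end{align*}
```

Under this reading, a policy \pi^* would be SPAC-optimal if there is a constant \rho^* such that J_0(\pi^*,\nu) = \rho^* almost surely for every \nu, while J_0(\pi,\nu) \ge \rho^* almost surely for every \pi and \nu. Result (iii) then concerns minimizing V over the stationary policies that already attain the optimal average cost.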

Onésimo Hernández-Lerma | Oscar Vega-Amaya | Guadalupe Carrasco
