Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes

This paper studies several average-cost criteria for Markov control processes on Borel spaces with possibly unbounded costs. Under suitable hypotheses we show that (i) there exists a sample-path average cost (SPAC-) optimal stationary policy; (ii) a stationary policy is SPAC-optimal if and only if it is expected average cost (EAC-) optimal; and (iii) within the class of stationary SPAC-optimal (equivalently, EAC-optimal) policies there exists one with minimal limiting average variance.
