On the Relation Between Recurrence and Ergodicity Properties in Denumerable Markov Decision Chains

This paper studies two properties of the set of Markov chains induced by the deterministic stationary policies in a Markov decision chain: µ-uniform geometric ergodicity and µ-uniform geometric recurrence. µ-uniform geometric ergodicity generalises a quasi-compactness condition and can be interpreted as a strong form of stability, since it implies that the Markov chains generated by the deterministic stationary policies are uniformly stable. µ-uniform geometric recurrence is equivalent to the simultaneous Doeblin condition if µ is bounded. Both properties imply the existence of deterministic average and sensitive optimal policies. Numerous recurrence conditions have been used in the literature; the first key theorem of this paper derives the relations between several of these conditions, which, interestingly, turn out to be equivalent in most cases. The second key theorem shows the equivalence of µ-uniform geometric ergodicity and weak µ-uniform geometric recurrence under appropriate continuity conditions.
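
For orientation, the two properties can be sketched in the weighted supremum norm notation that is standard in this literature; this is an informal sketch, and the paper's precise definitions may differ in detail. Here µ : E → [1, ∞) is a bounding function on the state space E, P(f) is the transition matrix of the Markov chain induced by a deterministic policy f, Π(f) its stationary matrix, and the µ-norm of a matrix A is taken to be \|A\|_\mu = \sup_i \mu(i)^{-1} \sum_j |A(i,j)|\,\mu(j).

µ-uniform geometric ergodicity (sketch): there exist c < ∞ and β < 1 such that
  \| P^n(f) - \Pi(f) \|_\mu \le c\,\beta^n \qquad \text{for all deterministic policies } f \text{ and all } n \ge 0.

µ-uniform geometric recurrence (sketch): there exist a finite set M, c < ∞ and β < 1 such that, writing {}_{M}P(f) for the taboo transition matrix (transitions along paths that avoid M),
  \| {}_{M}P^n(f) \|_\mu \le c\,\beta^n \qquad \text{for all deterministic policies } f \text{ and all } n \ge 0.

On this reading, when µ is bounded the second condition amounts to the finite set M being reached geometrically fast, uniformly in the initial state and in the policy, which is consistent with the stated equivalence to the simultaneous Doeblin condition.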
