The structure of persistently nearly-optimal strategies in stochastic dynamic programming problems

This paper deals with the structure of persistently nearly optimal strategies in discrete time countable state stochastic dynamic programming models. For models with finite state and action sets the problem has been completely solved by Blackwell [1] and Krylov [2]. Using different methods, they both proved the existence of stationary optimal strategies. However, stationary optimal (or even nearly optimal) strategies may not exist for models with infinite action sets [3,4]. In this connection the existence of stationary nearly optimal strategies has been proved for certain classes of models (for example, positive models [5-7], models with finite or compact action sets [8,9], strongly convergent models [9-11], and contracting models [9,12]).

[1] D. Blackwell. Discrete Dynamic Programming, 1962.

[2] Onésimo Hernández-Lerma, et al. Controlled Markov Processes, 1965.

[3] N. Krylov. The Construction of an Optimal Strategy for a Finite Controlled Chain, 1965.

[4] David Blackwell, et al. Positive dynamic programming, 1967.

[5] D. Ornstein. On the existence of stationary optimal strategies, 1969.

[6] E. B. Frid. On a Problem of D. Blackwell from the Theory of Dynamic Programming, 1970.

[7] J. A. E. E. van Nunen. Contracting Markov decision processes, 1976.

[8] Jan van der Wal, et al. Successive approximations for convergent dynamic programming, 1977.

[9] Alʹbert Nikolaevich Shiri︠a︡ev, et al. Optimal stopping rules, 1977.

[10] Kees M. van Hee, et al. Markov Strategies in Dynamic Programming, 1978, Math. Oper. Res.

[11] Martin L. Puterman, et al. Contracting Markov Decision Processes (Mathematical Centre Tract 71), 1978.

[12] Jaap Wessels, et al. Markov Decision Theory, 1979.

[13] van der, et al. On uniformly nearly optimal stationary strategies, 1981.

[14] A. Yushkevich, et al. Controlled random sequences and Markov chains, 1982.

[15] E. Fainberg. Non-Randomized Markov and Semi-Markov Strategies in Dynamic Programming, 1982.

[16] Manfred Schäl, et al. Stationary Policies in Dynamic Programming Models Under Compactness Assumptions, 1983, Math. Oper. Res.

[17] E. Fainberg. Controlled Markov Processes with Arbitrary Numerical Criteria, 1983.

[18] E. Fainberg, et al. Stationary and Markov policies in countable state dynamic programming, 1983.

[19] van der, et al. On Uniformly Nearly-Optimal Markov Strategies, 1983.

[20] Jan van der Wal, et al. On the Use of Information in Markov Decision Processes, 1984.

[21] Rolf van Dawen. Stationäre Politiken in stochastischen Entscheidungsmodellen, 1984.

[22] Jan van der Wal, et al. On Stationary Strategies in Countable State Total Reward Markov Decision Processes, 1984, Math. Oper. Res.

[23] E. Fainberg. Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models, 1987.