Convergence of probability measures and Markov decision models with incomplete information

This paper deals with three major types of convergence of probability measures on metric spaces: weak convergence, setwise convergence, and convergence in total variation. First, it describes and compares necessary and sufficient conditions for these types of convergence, some of which are well known, in terms of convergence of probabilities of open and closed sets and, for probabilities on the real line, in terms of convergence of distribution functions. Second, it provides criteria for weak and setwise convergence of probability measures and for continuity of stochastic kernels in terms of convergence of probabilities defined on the base of the topology generated by the metric. Third, it provides applications to control of partially observable Markov decision processes and, in particular, to Markov decision models with incomplete information.
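For orientation, the following minimal sketch recalls the standard textbook definitions of the three modes of convergence for Borel probability measures on a metric space, together with the classical open/closed-set (portmanteau) characterization of weak convergence; this is background material, not a formulation taken from the paper itself.

```latex
% A minimal sketch (standard textbook formulations, not verbatim from the paper):
% the three modes of convergence for Borel probability measures \mu_n, \mu on a
% metric space (S, \rho), plus the open/closed-set criteria for weak convergence.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
\begin{align*}
  % Weak convergence: test against bounded continuous functions.
  \mu_n \Rightarrow \mu
    &\iff \int_S f\,d\mu_n \to \int_S f\,d\mu
      \quad\text{for all bounded continuous } f\colon S\to\mathbb{R},\\
  % Setwise convergence: convergence on each Borel set.
  \mu_n \to \mu \ \text{(setwise)}
    &\iff \mu_n(B)\to\mu(B)
      \quad\text{for every Borel set } B\subseteq S,\\
  % Total variation: convergence uniform over all Borel sets.
  \mu_n \to \mu \ \text{(total variation)}
    &\iff \sup_{B\in\mathcal{B}(S)}\bigl|\mu_n(B)-\mu(B)\bigr|\to 0.
\end{align*}
% Open/closed-set characterizations of weak convergence (portmanteau theorem):
\begin{align*}
  \mu_n \Rightarrow \mu
    &\iff \limsup_{n\to\infty}\mu_n(F)\le\mu(F)
      \quad\text{for every closed } F\subseteq S\\
    &\iff \liminf_{n\to\infty}\mu_n(G)\ge\mu(G)
      \quad\text{for every open } G\subseteq S.
\end{align*}
% On the real line, weak convergence is equivalent to convergence of the
% distribution functions at every continuity point of the limit.
\end{document}
```

Convergence in total variation implies setwise convergence, which in turn implies weak convergence; the reverse implications fail in general.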
