A survey of recent results on continuous-time Markov decision processes

[1] Martin L. Puterman et al. Sensitive Discount Optimality, 2008.

[2] T. Prieto-Rumeau. Blackwell Optimality in the Class of Markov Policies for Continuous-Time Controlled Markov Chains, 2006.

[3] Xianping Guo et al. Average optimality for continuous-time Markov decision processes in Polish spaces, 2006, math/0607098.

[4] Gautam Choudhury et al. Optimal design and control of queues, 2005.

[5] O. Hernández-Lerma et al. Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates, 2005.

[6] V. Borkar. Controlled diffusion processes, 2005, math/0511077.

[7] Xi-Ren Cao et al. Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards, 2005, SIAM J. Control Optim.

[8] Onésimo Hernández-Lerma et al. Bias and overtaking equilibria for zero-sum continuous-time Markov games, 2005, Math. Methods Oper. Res.

[9] Xianping Guo et al. Nonzero-sum games for continuous-time Markov chains with unbounded discounted payoffs, 2005, Journal of Applied Probability.

[10] Xi-Ren Cao et al. Basic Ideas for Event-Based Optimization of Markov Systems, 2005, Discrete Event Dyn. Syst.

[11] R. M. Feldman et al. Correction: Transient analysis of state-dependent queueing networks via cumulant functions, 2005.

[12] Onésimo Hernández-Lerma et al. The Laurent series, sensitive discount and Blackwell optimality for continuous-time controlled Markov chains, 2005, Math. Methods Oper. Res.

[13] Xi-Ren Cao et al. The potential structure of sample paths and performance sensitivities of Markov systems, 2004, IEEE Transactions on Automatic Control.

[14] Onésimo Hernández-Lerma et al. The Scalarization Approach to Multiobjective Markov Control Problems: Why Does It Work?, 2004.

[15] Alexei B. Piunovskiy et al. Multicriteria impulsive control of jump Markov processes, 2004, Math. Methods Oper. Res.

[16] Anna Jaskiewicz et al. On the Equivalence of Two Expected Average Cost Criteria for Semi-Markov Control Processes, 2004, Math. Oper. Res.

[17] Xianping Guo et al. Continuous-Time Controlled Markov Chains with Discounted Rewards, 2003.

[18] Onésimo Hernández-Lerma et al. Bias Optimality versus Strong 0-Discount Optimality in Markov Control Processes with Unbounded Costs, 2003.

[19] Xianping Guo et al. Nonzero-sum games for continuous-time Markov chains with unbounded transition and average payoff rates, 2003.

[20] Xi-Ren Cao et al. Semi-Markov decision problems and performance sensitivity analysis, 2003, IEEE Trans. Autom. Control.

[21] Xianping Guo et al. Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion, 2003, IEEE Trans. Autom. Control.

[22] Weiping Zhu et al. Denumerable-state continuous-time Markov decision processes with unbounded transition and reward rates under the discounted criterion, 2002, Journal of Applied Probability.

[23] Weiping Zhu et al. Denumerable state continuous time Markov decision processes with unbounded cost and transition rates under average criterion, 2002, The ANZIAM Journal.

[24] Hayriye Ayhan et al. Bias optimal admission control policies for a multiclass nonstationary queueing system, 2002, Journal of Applied Probability.

[25] Xianping Guo et al. A note on optimality conditions for continuous-time Markov decision processes with average cost criterion, 2001, IEEE Trans. Autom. Control.

[26] Onésimo Hernández-Lerma et al. Nonstationary Continuous-Time Markov Control Processes with Discounted Costs on Infinite Horizon, 2001.

[27] Martin L. Puterman et al. A note on bias optimality in controlled queueing systems, 2000, Journal of Applied Probability.

[28] Arie Hordijk et al. Blackwell optimality in the class of all policies in Markov decision chains with a Borel state space and unbounded rewards, 1999, Math. Methods Oper. Res.

[29] Qinru Qiu et al. Stochastic modeling of a power-managed system: construction and optimization, 1999, Proceedings of the 1999 International Symposium on Low Power Electronics and Design.

[30] Hayriye Ayhan et al. Bias Optimality in a Queue with Admission Control, 1999, Probability in the Engineering and Informational Sciences.

[31] O. Hernández-Lerma et al. Discrete-time Markov control processes, 1999.

[32] M. Puterman et al. Bias Optimality in Controlled Queueing Systems, 1998, Journal of Applied Probability.

[33] Alexander A. Yushkevich et al. Blackwell Optimality in Borelian Continuous-in-Action Markov Decision Processes, 1997.

[34] Qiying Hu et al. Continuous Time Markov Decision Processes with Discounted Moment Criterion, 1996.

[35] Andrew W. Moore et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[36] S. Meyn et al. Computable exponential convergence rates for stochastically ordered Markov processes, 1996.

[37] Arie Leizarowitz et al. Overtaking and Almost-Sure Optimality for Infinite Horizon Markov Decision Processes, 1996, Math. Oper. Res.

[38] A. A. Yushkevich et al. Blackwell optimal policies in a Markov decision process with a Borel state space, 1994, Math. Methods Oper. Res.

[39] Martin L. Puterman et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[40] S. Meyn et al. Stability of Markovian processes III: Foster–Lyapunov criteria for continuous-time processes, 1993, Advances in Applied Probability.

[41] Q. Hu et al. Discounted and average Markov decision processes with unbounded rewards: New conditions, 1992.

[42] Manfred Schäl et al. On the Second Optimality Equation for Semi-Markov Decision Models, 1992, Math. Oper. Res.

[43] Rommert Dekker et al. Recurrence Conditions for Average and Blackwell Optimality in Denumerable State Markov Decision Chains, 1992, Math. Oper. Res.

[44] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.

[45] J. B. Lasserre et al. Conditions for Existence of Average and Blackwell Optimal Stationary Policies in Denumerable Markov Decision Processes, 1988.

[46] Arie Hordijk et al. Average, Sensitive and Blackwell Optimal Policies in Denumerable Markov Decision Chains with Unbounded Rewards, 1988, Math. Oper. Res.

[47] M. Y. Kitayev. Semi-Markov and Jump Markov Controlled Models: Average Cost Criterion, 1986.

[48] Claude Lefèvre et al. Optimal Control of a Birth and Death Epidemic Process, 1981, Oper. Res.

[49] Richard F. Serfozo et al. Technical Note: An Equivalence Between Continuous and Discrete Time Markov Decision Processes, 1979, Oper. Res.

[50] Claude Lefèvre et al. Optimal control of the simple stochastic epidemic with variable recovery rates, 1979.

[51] E. Fainberg et al. On Homogeneous Markov Models with Continuous Time and Finite or Countable State Space, 1979.

[52] Wayne L. Winston et al. A birth–death model of advertising and pricing, 1979, Advances in Applied Probability.

[53] N. Ling. The Mathematical Theory of Infectious Diseases and its Applications, 1978.

[54] A. A. Yushkevich et al. Controlled Markov Models with Countable State Space and Continuous Time, 1978.

[55] P. Kakumanu et al. Relation between continuous and discrete time Markovian decision problems, 1977.

[56] K. Wickwire. Mathematical models for the control of pests and infectious diseases: a survey, 1977, Theoretical Population Biology.

[57] B. Doshi. Continuous Time Control of Markov Processes on an Arbitrary State Space: Discounted Rewards, 1976.

[58] R. Serfozo. An Equivalence between Continuous and Discrete Time Markov Decision Processes, 1976.

[59] J. Bather. Optimal stationary policies for denumerable Markov chains in continuous time, 1976, Advances in Applied Probability.

[60] P. Kakumanu et al. Continuous time Markovian decision processes: average return criterion, 1975.

[61] Steven A. Lippman et al. Applying a New Device in the Optimization of Exponential Queuing Systems, 1975, Oper. Res.

[62] A. A. Yushkevich et al. On a Class of Strategies in General Markov Decision Models, 1974.

[63] M. Puterman. Sensitive Discount Optimality in Controlled One-Dimensional Diffusions, 1974.

[64] David B. Montgomery et al. Stochastic Models of Buying Behavior, 1972.

[65] P. Kakumanu et al. Nondiscounted Continuous Time Markovian Decision Process with Countable State Space, 1972.

[66] P. Kakumanu. Continuously Discounted Markov Decision Model with Countable State and Action Space, 1971.

[67] A. F. Veinott. Discrete Dynamic Programming with Sensitive Discount Optimality Criteria, 1969.

[68] B. L. Miller et al. Discrete Dynamic Programming with a Small Interest Rate, 1969.

[69] B. L. Miller. Finite state continuous time Markov decision processes with an infinite planning horizon, 1968.

[70] L. Fisher et al. On Recurrent Denumerable Decision Processes, 1968.

[71] A. F. Veinott. On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting, 1966.

[72] von Weizsäcker et al. Existence of Optimal Programs of Accumulation for an Infinite Time Horizon, 1965.

[73] D. Blackwell. Discrete Dynamic Programming, 1962.

[74] R. Howard. Dynamic Programming and Markov Processes, 1960.

[75] M. L. Vidale et al. An Operations-Research Study of Sales Response to Advertising, 1957.

[76] W. Feller et al. On the integro-differential equations of purely discontinuous Markoff processes, 1940.

[77] F. Ramsey et al. The Mathematical Theory of Saving, 1928.

[78] W. O. Kermack et al. A contribution to the mathematical theory of epidemics, 1927.

[79] Onésimo Hernández-Lerma et al. Existence and regularity of nonhomogeneous Q(t)-processes under measurability conditions, 2007.

[80] Onésimo Hernández-Lerma et al. Bias Optimality for Continuous-Time Controlled Markov Chains, 2006, SIAM J. Control Optim.

[81] Xi-Ren Cao et al. Partially observable Markov decision processes with reward information, 2004, 43rd IEEE Conference on Decision and Control (CDC).

[82] L. Allen. An introduction to stochastic processes with applications to biology, 2003.

[83] Xi-Ren Cao et al. A Sensitivity View of Markov Decision Processes and Reinforcement Learning, 2003.

[84] O. Hernández-Lerma et al. Continuous-time controlled Markov chains, 2003.

[85] Sylvain Sorin et al. Stochastic Games and Applications, 2003.

[86] Xi-Ren Cao et al. From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning, 2003, Discrete Event Dyn. Syst.

[87] Eugene A. Feinberg et al. Handbook of Markov Decision Processes, 2002.

[88] Martin L. Puterman et al. A probabilistic analysis of bias optimality in unichain Markov decision processes, 2001, IEEE Trans. Autom. Control.

[89] O. Hernández-Lerma et al. Further topics on discrete-time Markov control processes, 1999.

[90] Q. Hu et al. Continuous time Markov decision processes with nonuniformly bounded transition rate: expected total rewards, 1998.

[91] A. B. Piunovskii. A Controlled Jump Discounted Model with Constraints, 1998.

[92] C. Wu. Continuous time Markov decision processes with unbounded rewards and non-uniformly bounded transition rates under discounted criterion, 1997.

[93] Onésimo Hernández-Lerma et al. Markov Control Processes, 1996.

[94] W. J. Anderson. Continuous-Time Markov Chains, 1991.

[95] Q. Hu. CTMDP and its relationship with DTMDP, 1990.

[96] Wayne L. Winston et al. A birth–death model of advertising, 1979.

[97] Howard M. Taylor et al. A Laurent series for the resolvent of a strongly continuous stochastic semi-group, 1976.

[98] Mark R. Lembersky. On Maximal Rewards and $\varepsilon$-Optimal Policies in Continuous Time Markov Decision Chains, 1974.

[99] D. Gale. On Optimal Development in a Multi-Sector Economy, 1967.

[100] V. Rykov. Markov Decision Processes with Finite State and Decision Spaces, 1966.

[101] Onésimo Hernández-Lerma et al. Controlled Markov Processes, 1965.