On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies