Shannon meets Bellman: Feature based Markovian models for detection and optimization