Near Optimal Behavior via Approximate State Abstraction
David Abel | D. Ellis Hershkowitz | Michael L. Littman
[1] Robert L. Smith, et al. Aggregation in Dynamic Programming, 1987, Oper. Res.
[2] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[3] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[4] D. B. Davis, et al. Intel Corp., 1993.
[5] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[6] Leslie Pack Kaelbling, et al. On the Complexity of Solving Markov Decision Problems, 1995, UAI.
[7] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[8] Craig Boutilier, et al. Abstraction and Approximate Decision-Theoretic Planning, 1997, Artif. Intell.
[9] Robert Givan, et al. Model Minimization in Markov Decision Processes, 1997, AAAI/IAAI.
[10] Robert Givan, et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes, 1997, UAI.
[11] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[12] Stuart J. Russell, et al. Bayesian Q-Learning, 1998, AAAI/IAAI.
[13] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[14] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[15] David Andre, et al. State Abstraction for Programmable Reinforcement Learning Agents, 2002, AAAI/IAAI.
[16] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[17] Yishay Mansour, et al. Approximate Equivalence of Markov Decision Processes, 2003, COLT.
[18] Doina Precup, et al. Metrics for Finite Markov Decision Processes, 2004, AAAI.
[19] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[20] Peter Stone, et al. State Abstraction Discovery from Irrelevant State Variables, 2005, IJCAI.
[21] Bram Bakker, et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization, 2003.
[22] Doina Precup, et al. Methods for Computing State Similarity in Markov Decision Processes, 2006, UAI.
[23] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[24] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[25] Thomas J. Walsh. Transferring State Abstractions Between MDPs, 2006.
[26] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[27] Antoine Dutot, et al. GraphStream: A Tool for Bridging the Gap between Complex Systems and Dynamic Graphs, 2008, ArXiv.
[28] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, J. Mach. Learn. Res.
[29] Ronald Ortner. Adaptive Aggregation for Reinforcement Learning in Average Reward Markov Decision Processes, 2013, Ann. Oper. Res.
[30] Thomas G. Dietterich, et al. State Aggregation in Monte Carlo Tree Search, 2014, AAAI.
[31] Marcus Hutter, et al. Extreme State Aggregation beyond MDPs, 2014, ALT.
[32] Nan Jiang, et al. Improving UCT Planning via Approximate Homomorphisms, 2014, AAMAS.
[33] Nan Jiang, et al. Abstraction Selection in Model-based Reinforcement Learning, 2015, ICML.
[34] Stefanie Tellex, et al. Goal-Based Action Priors, 2015, ICAPS.
[35] Marcus Hutter, et al. Extreme State Aggregation beyond Markov Decision Processes, 2016, Theor. Comput. Sci.
[36] Zoran Popovic, et al. Efficient Bayesian Clustering for Reinforcement Learning, 2016, IJCAI.