Reachability and Differential based Heuristics for Solving Markov Decision Processes
[1] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[2] Gene H. Golub, et al. Matrix Computations (3rd ed.), 1996.
[3] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[4] D. J. White, et al. A Survey of Applications of Markov Decision Processes, 1993.
[5] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[6] Groupe PDMIA. Markov Decision Processes in Artificial Intelligence, 2009.
[7] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.
[8] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[9] J. Shanthikumar, et al. First-passage times with PFr densities, 1985, Journal of Applied Probability.
[10] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[11] Olivier Buffet, et al. Markov Decision Processes in Artificial Intelligence, 2013.
[12] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[13] David Andre, et al. Generalized Prioritized Sweeping, 1997, NIPS.
[14] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[15] Gerald L. Thompson, et al. Finite Mathematical Structures, 1960.
[16] Shlomo Zilberstein, et al. LAO*: A heuristic search algorithm that finds solutions with loops, 2001, Artif. Intell.
[17] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[18] Marco Wiering, et al. Reinforcement Learning and Markov Decision Processes, 2012, Reinforcement Learning.
[19] Craig Boutilier, et al. Stochastic dynamic programming with factored representations, 2000, Artif. Intell.
[20] Blai Bonet, et al. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming, 2003, ICAPS.
[21] Ronen I. Brafman, et al. Structured Reachability Analysis for Markov Decision Processes, 1998, UAI.
[22] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 2004, Machine Learning.
[23] John G. Kemeny, et al. Finite Mathematical Structures, 1960.
[24] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[25] Kevin D. Seppi, et al. Prioritization Methods for Accelerating MDP Solvers, 2005, J. Mach. Learn. Res.
[26] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.