[1] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[2] Joelle Pineau, et al. Efficient learning and planning with compressed predictive states, 2013, J. Mach. Learn. Res.
[3] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[4] Nikos A. Vlassis, et al. Perseus: Randomized Point-based Value Iteration for POMDPs, 2005, J. Artif. Intell. Res.
[5] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[6] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[7] Peter Dayan, et al. Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search, 2013, J. Artif. Intell. Res.
[8] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[9] Joelle Pineau, et al. Anytime Point-Based Approximations for Large POMDPs, 2006, J. Artif. Intell. Res.
[10] Dean Alderucci. A Spectral Algorithm for Learning Hidden Markov Models That Have Silent States, 2015.
[11] Yunlong Liu, et al. Predictive State Representations with State Space Partitioning, 2015, AAMAS.
[12] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.
[13] Ariadna Quattoni, et al. Spectral learning of weighted automata, 2014, Machine Learning.
[14] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[15] Michèle Sebag, et al. The grand challenge of computer Go, 2012, Commun. ACM.
[16] Joelle Pineau, et al. Online Planning Algorithms for POMDPs, 2008, J. Artif. Intell. Res.
[17] Joelle Pineau, et al. Theoretical Analysis of Heuristic Search Methods for Online POMDPs, 2007, NIPS.
[18] Frans A. Oliehoek, et al. Bayesian Reinforcement Learning in Factored POMDPs, 2018, AAMAS.
[19] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[20] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[21] David Hsu, et al. DESPOT: Online POMDP Planning with Regularization, 2013, NIPS.
[22] Frans A. Oliehoek, et al. Learning in POMDPs with Monte Carlo Tree Search, 2017, ICML.
[23] Erik Talvitie, et al. Learning to Make Predictions in Partially Observable Environments Without a Generative Model, 2011, J. Artif. Intell. Res.
[24] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[25] Yunlong Liu, et al. Solving partially observable problems with inaccurate PSR models, 2014, Inf. Sci.
[26] Yunlong Liu, et al. Learning Predictive State Representations via Monte-Carlo Tree Search, 2016, IJCAI.
[27] Yunlong Liu, et al. Basis selection in spectral learning of predictive state representations, 2018, Neurocomputing.
[28] Shie Mannor, et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn.
[29] Joelle Pineau, et al. Bayes-Adaptive POMDPs, 2007, NIPS.
[30] Joelle Pineau, et al. A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes, 2011, J. Mach. Learn. Res.
[31] David Silver, et al. Monte-Carlo tree search and rapid action value estimation in computer Go, 2011, Artif. Intell.
[32] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[33] Andrew G. Barto, et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes, 2002.
[34] David Silver, et al. Combining online and offline knowledge in UCT, 2007, ICML.