Kamyar Azizzadenesheli | Anima Anandkumar | Alessandro Lazaric
[1] Edward J. Sondik,et al. The optimal control of partially observable Markov processes, 1971.
[2] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res.
[3] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[4] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[5] Theodore J. Perkins,et al. Reinforcement learning for POMDPs based on action values and stochastic optimization , 2002, AAAI/IAAI.
[6] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res.
[7] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[8] Joelle Pineau,et al. Bayes-Adaptive POMDPs , 2007, NIPS.
[9] Omid Madani. On the Computability of Infinite-Horizon Partially Observable Markov Decision Processes, 2007.
[10] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res.
[11] Pascal Poupart,et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains , 2008, ISAIM.
[12] Byron Boots,et al. Closing the learning-planning loop with predictive state representations , 2009, Int. J. Robotics Res.
[13] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning, 2011.
[14] Anima Anandkumar,et al. A Method of Moments for Mixture Models and Hidden Markov Models , 2012, COLT.
[15] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[16] Alessandro Lazaric,et al. Regret Bounds for Reinforcement Learning with Policy Advice , 2013, ECML/PKDD.
[17] Anima Anandkumar,et al. Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res.
[18] Le Song,et al. Nonparametric Estimation of Multi-View Latent Variable Models , 2013, ICML.
[19] Joelle Pineau,et al. Efficient learning and planning with compressed predictive states , 2013, J. Mach. Learn. Res.
[20] Alexander J. Smola,et al. Fast and Guaranteed Tensor Decomposition via Sketching , 2015, NIPS.
[21] Kamyar Azizzadenesheli,et al. Reinforcement Learning of POMDPs using Spectral Methods , 2016, COLT.
[22] John Langford,et al. Contextual-MDPs for PAC-Reinforcement Learning with Rich Observations , 2016, ArXiv.
[23] Kamyar Azizzadenesheli,et al. Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies , 2016, COLT.
[24] John Langford,et al. PAC Reinforcement Learning with Rich Observations , 2016, NIPS.