Kamyar Azizzadenesheli | Anima Anandkumar | Emma Brunskill | Zachary C. Lipton | Brandon Yang | Weitang Liu
[1] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[2] Fernando Diaz,et al. Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains , 2016, ArXiv.
[3] Jing He,et al. Policy Networks with Two-Stage Training for Dialogue Systems , 2016, SIGDIAL Conference.
[4] Thomas B. Schön,et al. From Pixels to Torques: Policy Learning with Deep Dynamical Models , 2015, ICML 2015.
[5] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[6] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[7] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[8] Kamyar Azizzadenesheli,et al. Reinforcement Learning of POMDPs using Spectral Methods , 2016, COLT.
[9] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[10] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[11] Csaba Szepesvári,et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.
[12] David Vandyke,et al. A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.
[13] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[14] Simon Osindero,et al. Conditional Generative Adversarial Nets , 2014, ArXiv.
[15] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[16] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[17] Csaba Szepesvári,et al. Partial Monitoring - Classification, Regret Bounds, and Algorithms , 2014, Math. Oper. Res..
[18] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[19] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[20] Georg Ostrovski,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[21] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[22] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[23] Heriberto Cuayáhuitl,et al. SimpleDS: A Simple Deep Reinforcement Learning Dialogue System , 2016, IWSDS.
[24] Anima Anandkumar,et al. Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..
[25] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[26] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[27] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Kamyar Azizzadenesheli,et al. Efficient Exploration Through Bayesian Deep Q-Networks , 2018, 2018 Information Theory and Applications Workshop (ITA).
[29] Jianfeng Gao,et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, ArXiv.
[30] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[31] Kamyar Azizzadenesheli,et al. Reinforcement Learning in Rich-Observation MDPs using Spectral Methods , 2016, arXiv:1611.03907.
[32] Zachary Chase Lipton,et al. Efficient Exploration for Dialogue Policy Learning with BBQ Networks & Replay Buffer Spiking , 2016 .
[33] Satinder Singh,et al. Value Prediction Network , 2017, NIPS.
[34] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[35] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..