Active deep Q-learning with demonstration
Masashi Sugiyama | Hsuan-Tien Lin | Voot Tangkaratt | Si-An Chen