Deep Q-learning From Demonstrations
Tom Schaul | Andrew Sendonaris | Joel Z. Leibo | Olivier Pietquin | Gabriel Dulac-Arnold | Ian Osband | Todd Hester | Marc Lanctot | Bilal Piot | Dan Horgan | Matej Vecerík | John Quan | John Agapiou | Audrunas Gruslys