Hoang Minh Le | Nan Jiang | Alekh Agarwal | Miroslav Dudík | Yisong Yue | Hal Daumé
[1] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[2] Doina Precup, et al. Intra-Option Learning about Temporally Abstract Actions, 1998, ICML.
[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[4] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[5] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[6] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[7] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[8] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.
[9] Nicholas Roy, et al. PUMA: Planning Under Uncertainty with Macro-Actions, 2010, AAAI.
[10] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[11] Eyke Hüllermeier, et al. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, 2012, Mach. Learn.
[12] Shai Shalev-Shwartz, et al. Online Learning and Online Convex Optimization, 2012, Found. Trends Mach. Learn.
[13] J. Andrew Bagnell, et al. Reinforcement and Imitation Learning via Interactive No-Regret Learning, 2014, ArXiv.
[14] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[15] David L. Roberts, et al. Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning, 2015, Autonomous Agents and Multi-Agent Systems.
[16] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[17] John Langford, et al. Learning to Search Better than Your Teacher, 2015, ICML.
[18] Yisong Yue, et al. Generating Long-term Trajectories Using Deep Hierarchical Networks, 2016, NIPS.
[19] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[20] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[21] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[22] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[23] Peter Stone, et al. Deep Reinforcement Learning in Parameterized Action Space, 2015, ICLR.
[24] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[25] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[26] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[27] Hannes Schulz, et al. Frames: a corpus for adding memory to goal-oriented dialogue systems, 2017, SIGDIAL Conference.
[28] Dan Klein, et al. Modular Multitask Reinforcement Learning with Policy Sketches, 2016, ICML.
[29] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[30] Byron Boots, et al. Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction, 2017, ICML.
[31] Kam-Fai Wong, et al. Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning, 2017, EMNLP.
[32] Alessandro Lazaric, et al. Exploration-Exploitation in MDPs with Options, 2016.
[33] Stefanie Tellex, et al. Deep Abstract Q-Networks, 2017, AAMAS.
[34] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[35] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.