Hybrid Reward Architecture for Reinforcement Learning
Romain Laroche | Harm van Seijen | Mehdi Fatemi | Joshua Romoff | Jeffrey Tsang | Tavian Barnes
[1] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[2] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[3] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[4] K. Jellinger. Cortex and Mind. Unifying Cognition, 2003.
[5] Dana H. Ballard, et al. Multiple-Goal Reinforcement Learning with Modular Sarsa(0), 2003, IJCAI.
[6] Stuart J. Russell, et al. Q-Decomposition for Reinforcement Learning Agents, 2003, ICML.
[7] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[8] Andrew G. Barto, et al. Intrinsically Motivated Reinforcement Learning: A Promising Framework for Developmental Robot Learning, 2005.
[9] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[10] M. Gluck, et al. Learning and Memory: From Brain to Behavior, 2007.
[11] Andre Cohen, et al. An object-oriented representation for efficient reinforcement learning, 2008, ICML '08.
[12] Shimon Whiteson, et al. A theoretical and empirical analysis of Expected Sarsa, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[13] Jürgen Schmidhuber, et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010), 2010, IEEE Transactions on Autonomous Mental Development.
[14] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[15] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[16] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[17] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[18] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
[19] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[20] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[21] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[22] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[23] David Silver, et al. Learning values across many orders of magnitude, 2016, NIPS.
[24] Alex Graves, et al. Strategic Attentive Writer for Learning Macro-Actions, 2016, NIPS.
[25] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[26] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[27] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[28] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[29] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[30] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.