The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously
Misha Denil | Sergio Gomez Colmenarejo | Matthew W. Hoffman | Nando de Freitas | Ziyu Wang | Serkan Cabi
[1] Terry Winograd, et al. Understanding natural language, 1974.
[2] E. Thelen. Rhythmical stereotypies in normal human infants, 1979, Animal Behaviour.
[3] R. A. Brooks, et al. Intelligence without Representation, 1991, Artif. Intell.
[4] Doina Precup, et al. Intra-Option Learning about Temporally Abstract Actions, 1998, ICML.
[5] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[6] J. W. Sparling, et al. Fetal and neonatal hand movement, 1999, Physical therapy.
[7] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[8] Giulio Sandini, et al. Developmental robotics: a survey, 2003, Connect. Sci.
[9] Stuart J. Russell, et al. Q-Decomposition for Reinforcement Learning Agents, 2003, ICML.
[10] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[11] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[12] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[13] H. G. Marques, et al. Twitching in Sensorimotor Development from Sleeping Rats to Robots, 2013, Current Biology.
[14] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[15] Wojciech Zaremba, et al. Learning to Execute, 2014, ArXiv.
[16] Eric Eaton, et al. Online Multi-Task Learning for Policy Gradient Methods, 2014, ICML.
[17] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[18] K. Adolph, et al. Motor Development, 2015.
[19] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[21] Jianfeng Gao, et al. Recurrent Reinforcement Learning: A Hybrid Approach, 2015, ArXiv.
[22] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[23] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[24] Yulia Tsvetkov, et al. Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning, 2016, ACL.
[25] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[26] Nando de Freitas, et al. Neural Programmer-Interpreters, 2015, ICLR.
[27] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[28] Demis Hassabis, et al. Grounded Language Learning in a Simulated 3D World, 2017, ArXiv.
[29] Misha Denil, et al. Programmable Agents, 2017, ArXiv.
[30] Wei Xu, et al. A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment, 2017, ArXiv.
[31] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[32] Romain Laroche, et al. Hybrid Reward Architecture for Reinforcement Learning, 2017, NIPS.
[33] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[34] Vladlen Koltun, et al. Learning to Act by Predicting the Future, 2016, ICLR.
[35] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[36] Guillaume Lample, et al. Playing FPS Games with Deep Reinforcement Learning, 2016, AAAI.
[37] Ufuk Topcu, et al. Environment-Independent Task Specifications via GLTL, 2017, ArXiv.