Universal Value Function Approximators
Tom Schaul | Daniel Horgan | Karol Gregor | David Silver
[1] Peter Englert, et al. Multi-task policy search for robotics, 2014, IEEE International Conference on Robotics and Automation (ICRA).
[2] Sanjoy Dasgupta, et al. Off-Policy Temporal Difference Learning with Function Approximation, 2001, ICML.
[3] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[4] Andrea Montanari, et al. Matrix completion from a few entries, 2009, IEEE International Symposium on Information Theory (ISIT).
[5] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.
[6] Martijn van Otterlo, et al. The Logic of Adaptive Behavior - Knowledge Representation and Algorithms for Adaptive Sequential Decision Making under Uncertainty in First-Order and Relational Domains, 2009, Frontiers in Artificial Intelligence and Applications.
[7] Peter Dayan, et al. Structure in the Space of Value Functions, 2002, Machine Learning.
[8] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[9] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[10] Mark B. Ring. Continual learning in reinforcement environments, 1995, GMD-Bericht.
[11] Tom Schaul, et al. Better Generalization with Forecasts, 2013, IJCAI.
[12] Richard S. Sutton, et al. Multi-timescale nexting in a reinforcement learning robot, 2011, Adapt. Behav.
[13] Clément Farabet, et al. Torch7: A Matlab-like Environment for Machine Learning, 2011, NIPS.
[14] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[15] Bruno Castro da Silva, et al. Learning Parameterized Skills, 2012, ICML.
[16] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[17] Andrew G. Barto, et al. Transfer in Reinforcement Learning via Shared Features, 2012, J. Mach. Learn. Res.
[18] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[19] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[20] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[21] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[22] Jan Peters, et al. Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations, 2011.
[23] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[26] Shalabh Bhatnagar, et al. Universal Option Models, 2014, NIPS.