Value Iteration Networks
Aviv Tamar | Yi Wu | Garrett Thomas | Sergey Levine | Pieter Abbeel
[1] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[2] Thomas Hofmann, et al. Predicting Structured Data (Neural Information Processing), 2007.
[3] Jürgen Schmidhuber, et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments, 1990, IJCNN International Joint Conference on Neural Networks.
[4] Shie Mannor, et al. Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations, 2014, ICML.
[5] John Langford, et al. Learning to Search Better than Your Teacher, 2015, ICML.
[6] R. Bellman. Dynamic Programming, 1957, Science.
[7] Richard L. Lewis, et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective, 2010, IEEE Transactions on Autonomous Mental Development.
[8] Razvan Pascanu, et al. On the difficulty of training recurrent neural networks, 2012, ICML.
[9] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[10] Kunihiko Fukushima, et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.
[11] Razvan Pascanu, et al. Theano: new features and speed improvements, 2012, ArXiv.
[12] Csaba Szepesvári, et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods, 2007, UAI.
[13] Jürgen Schmidhuber, et al. Multi-column deep neural networks for image classification, 2012, IEEE Conference on Computer Vision and Pattern Recognition.
[14] Sergey Levine, et al. Nonlinear Inverse Reinforcement Learning with Gaussian Processes, 2011, NIPS.
[15] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[17] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[18] Aude Billard, et al. Dynamical System Modulation for Robot Learning via Kinesthetic Demonstrations, 2008, IEEE Transactions on Robotics.
[19] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[20] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[21] Kyunghyun Cho, et al. End-to-End Goal-Driven Web Navigation, 2016, NIPS.
[22] J. Andrew Bagnell, et al. Reinforcement Planning: RL for optimal planners, 2012, IEEE International Conference on Robotics and Automation.
[23] Martial Hebert, et al. Activity Forecasting, 2012, ECCV.
[24] Camille Couprie, et al. Learning Hierarchical Features for Scene Labeling, 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Navdeep Jaitly, et al. Pointer Networks, 2015, NIPS.
[26] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[27] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[28] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[29] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[30] Yann LeCun, et al. Learning Hierarchical Features for Scene Labeling, 2013.
[31] Leslie Pack Kaelbling, et al. Hierarchical task and motion planning in the now, 2011, IEEE International Conference on Robotics and Automation.
[32] Andrew Zisserman, et al. Spatial Transformer Networks, 2015, NIPS.
[33] David Silver, et al. Learning to search: Functional gradient techniques for imitation learning, 2009, Auton. Robots.
[34] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[35] Wojciech Zaremba, et al. Learning Simple Algorithms from Examples, 2015, ICML.
[36] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[37] Doina Precup, et al. Bounding Performance Loss in Approximate MDP Homomorphisms, 2008, NIPS.
[38] Ashutosh Saxena, et al. Robobarista: Object Part Based Transfer of Manipulation Trajectories from Crowd-Sourcing in 3D Pointclouds, 2015, ISRR.
[39] Honglak Lee, et al. Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games, 2016, IJCAI.
[40] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.
[41] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[42] John Salvatier, et al. Theano: A Python framework for fast computation of mathematical expressions, 2016, ArXiv.
[43] Alborz Geramifard, et al. Reinforcement learning with misspecified model classes, 2013, IEEE International Conference on Robotics and Automation.
[44] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[45] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[46] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[47] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[48] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[49] Kyunghyun Cho, et al. WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making, 2016, ArXiv.
[50] Jürgen Schmidhuber, et al. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots, 2016, IEEE Robotics and Automation Letters.
[51] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[52] Alex Graves, et al. Neural Turing Machines, 2014, ArXiv.
[53] Sebastian Nowozin, et al. Structured Learning and Prediction in Computer Vision, 2011, Found. Trends Comput. Graph. Vis.
[54] P. J. Werbos, et al. Efficient Learning in Cellular Simultaneous Recurrent Neural Networks: The Case of Maze Navigation Problem, 2007, IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[55] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[56] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[57] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[58] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[59] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[60] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[61] Ming Yang, et al. 3D Convolutional Neural Networks for Human Action Recognition, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.