Model-Based Planning with Discrete and Continuous Actions
[1] S. Dreyfus. The numerical solution of variational problems, 1962.
[2] Jürgen Schmidhuber, et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[3] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SGAR.
[4] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[5] Michael I. Jordan, et al. Forward Models: Supervised Learning with a Distal Teacher, 1992, Cogn. Sci.
[6] D. Signorini, et al. Neural networks, 1995, The Lancet.
[7] Pat Langley, et al. Crafting Papers on Machine Learning, 2000, ICML.
[8] Sean R Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[9] E. Todorov, et al. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, 2005, Proceedings of the 2005 American Control Conference.
[10] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[11] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[12] Flavien Balbo, et al. Using a monte-carlo approach for bus regulation, 2009, 2009 12th International IEEE Conference on Intelligent Transportation Systems.
[13] Richard B. Segal, et al. On the Scalability of Parallel UCT, 2010, Computers and Games.
[14] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Ashish Sabharwal, et al. Guiding Combinatorial Optimization with UCT, 2012, CPAIOR.
[16] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[17] Jordan L. Boyd-Graber, et al. Don't Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation, 2014, EMNLP.
[18] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[19] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[21] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[22] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[23] Sergey Levine, et al. Optimal control with learned local models: Application to dexterous manipulation, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[24] Honglak Lee, et al. Control of Memory, Active Perception, and Action in Minecraft, 2016, ICML.
[25] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[26] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[27] Sergey Levine, et al. Deep Reinforcement Learning for Robotic Manipulation, 2016, ArXiv.
[28] Razvan Pascanu, et al. Metacontrol for Adaptive Imagination-Based Optimization, 2017, ICLR.
[29] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[30] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[31] Razvan Pascanu, et al. Learning model-based planning from scratch, 2017, ArXiv.
[32] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.