LoOP: Iterative learning for optimistic planning on robots
[1] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[2] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[3] Razvan Pascanu,et al. Sim-to-Real Robot Learning from Pixels with Progressive Nets , 2016, CoRL.
[4] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[5] Marc D. Killpack,et al. A Versatile Multi-Robot Monte Carlo Tree Search Planner for On-Line Coverage Path Planning , 2020, ArXiv.
[6] George Konidaris,et al. Constructing Abstraction Hierarchies Using a Skill-Symbol Loop , 2015, IJCAI.
[7] Daniele Nardi,et al. Q-CP: Learning Action Values for Cooperative Planning , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[8] Sergey Levine,et al. Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[10] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[11] George Konidaris,et al. An Analysis of Monte Carlo Tree Search , 2017, AAAI.
[12] Zhiyong Liu,et al. Learning Individual Features to Decompose State Space for Robotic Skill Learning , 2020, 2020 Chinese Control And Decision Conference (CCDC).
[13] Richard S. Sutton,et al. Temporal-difference search in computer Go , 2012, Machine Learning.
[14] Aleksander Czechowski,et al. Decentralized MCTS via Learned Teammate Models , 2020, IJCAI.
[15] Sylvain Gelly,et al. Modifications of UCT and sequence-like simulations for Monte-Carlo Go , 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[16] Martial Hebert,et al. Improved Learning of Dynamics Models for Control , 2016, ISER.
[17] Alejandro Agostini,et al. Reinforcement Learning with a Gaussian mixture model , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[18] Yevgen Chebotar,et al. Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[19] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[20] Peter Stone,et al. iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots , 2020, ArXiv.
[21] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation.
[22] Yoshihiko Nakamura,et al. Sequential Monte Carlo controller that integrates physical consistency and motion knowledge , 2018, Auton. Robots.
[23] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[24] Girish Chowdhary,et al. Off-policy reinforcement learning with Gaussian processes , 2014, IEEE/CAA Journal of Automatica Sinica.
[25] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[26] Stefanos Nikolaidis,et al. Efficient Model Learning from Joint-Action Demonstrations for Human-Robot Collaborative Tasks , 2015, 2015 10th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[27] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[28] Larry P. Heck,et al. Robot Learning by Collaborative Network Training: A Self-Supervised Method using Ranking , 2019, AAMAS.
[29] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[30] Darwin G. Caldwell,et al. Learning and Reproduction of Gestures by Imitation , 2010, IEEE Robotics & Automation Magazine.
[31] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[32] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res.
[33] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[34] David Silver,et al. Monte-Carlo tree search and rapid action value estimation in computer Go , 2011, Artif. Intell.
[35] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation (ICRA '04).
[36] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[37] Prasad Tadepalli,et al. Solving multiagent assignment Markov decision processes , 2009, AAMAS.
[38] Jessica B. Hamrick,et al. Combining Q-Learning and Search with Amortized Value Estimates , 2020, ICLR.
[39] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[40] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[41] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[42] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2022 .
[43] Sergey Levine,et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res.
[44] Shlomo Zilberstein,et al. Policy Iteration for Decentralized Control of Markov Decision Processes , 2009, J. Artif. Intell. Res.
[45] Jun Morimoto,et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning , 2000, Robotics Auton. Syst.
[46] Yuchen Cui,et al. Uncertainty-Aware Data Aggregation for Deep Imitation Learning , 2019, 2019 International Conference on Robotics and Automation (ICRA).
[47] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res.
[48] Sergey Levine,et al. Learning modular neural network policies for multi-task and multi-robot transfer , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[49] M.R. Meybodi,et al. Solving Multi-Agent Markov Decision Processes using learning automata , 2008, 2008 6th International Symposium on Intelligent Systems and Informatics.
[50] Lynne E. Parker,et al. A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains , 2005, J. Intell. Robotic Syst.
[51] Jorge Cortés,et al. Exploiting Bias for Cooperative Planning in Multi-Agent Tree Search , 2020, IEEE Robotics and Automation Letters.
[52] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[53] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[54] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res.
[55] Ciro Potena,et al. Automatic model based dataset generation for fast and accurate crop and weeds detection , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[56] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[57] Tom Schaul,et al. Better Generalization with Forecasts , 2013, IJCAI.
[58] Steven M. LaValle,et al. Rapidly-Exploring Random Trees: Progress and Prospects , 2000 .
[59] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[60] Bo He,et al. Improving Interactive Reinforcement Agent Planning with Human Demonstration , 2019, ArXiv.
[61] Peter Stone,et al. A synthesis of automated planning and reinforcement learning for efficient, robust decision-making , 2016, Artif. Intell.
[62] J. Andrew Bagnell,et al. Reinforcement and Imitation Learning via Interactive No-Regret Learning , 2014, ArXiv.
[63] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .