Action Selection for Composable Modular Deep Reinforcement Learning
Akshat Kumar | Praveen Paruchuri | Daksh Anand | Vaibhav Gupta
[1] R. Mazo. On the theory of Brownian motion , 1973 .
[2] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[3] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[4] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[5] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[6] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[7] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[8] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[9] Stuart J. Russell,et al. Q-Decomposition for Reinforcement Learning Agents , 2003, ICML.
[10] Christopher L. Simpkins,et al. Composable Modular Reinforcement Learning , 2019, AAAI.
[11] Alessandro Saffiotti,et al. A Multivalued Logic Approach to Integrating Planning and Control , 1995, Artif. Intell..
[12] Mark Humphreys,et al. Action selection methods using reinforcement learning , 1997 .
[13] Gregor Schöner,et al. A dynamical systems approach to task-level system integration used to plan and control autonomous vehicle motion , 1992, Robotics Auton. Syst..
[14] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[15] Peter Dayan. Feudal Q-learning , 1995 .
[16] Sergey Levine,et al. Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? , 2019, ArXiv.
[17] Taeseok Jin,et al. Command Fusion Based Fuzzy Controller Design for Moving Obstacle Avoidance of Mobile Robot , 2013 .
[18] Rodney A. Brooks,et al. Achieving Artificial Intelligence through Building Robots , 1986 .
[19] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[20] Kagan Tumer,et al. Unifying temporal and structural credit assignment problems , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[21] Mark Humphrys. W-learning: Competition among selfish Q-learners , 1995 .
[22] Balaraman Ravindran,et al. Advice Replay Approach for Richer Knowledge Transfer in Teacher Student Framework , 2019, AAMAS.
[23] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[24] Jonas Karlsson,et al. Learning to Solve Multiple Goals , 1997 .
[25] Yoram Koren,et al. Potential field methods and their inherent limitations for mobile robot navigation , 1991, Proceedings. 1991 IEEE International Conference on Robotics and Automation.
[26] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .
[27] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[28] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[29] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[30] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[31] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[32] Ashutosh Saxena,et al. High speed obstacle avoidance using monocular vision and reinforcement learning , 2005, ICML.
[33] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[34] Craig Boutilier,et al. Data center cooling using model-predictive control , 2018, NeurIPS.
[35] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[36] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[37] Michael Mateas,et al. Towards adaptive programming: integrating reinforcement learning into a programming language , 2008, OOPSLA.
[38] Dana H. Ballard,et al. Multiple-Goal Reinforcement Learning with Modular Sarsa(0) , 2003, IJCAI.
[39] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res..
[40] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML 2000.
[41] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[42] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[43] Rodney A. Brooks,et al. MIT mobile robots-what's next? , 1988, Proceedings. 1988 IEEE International Conference on Robotics and Automation.
[44] Balaraman Ravindran,et al. An Enhanced Advising Model in Teacher-Student Framework using State Categorization , 2021, AAAI.
[45] Martin A. Riedmiller,et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch , 2018, ICML.
[46] Vinny Cahill,et al. Distributed W-Learning: Multi-Policy Optimization in Self-Organizing Systems , 2009, 2009 Third IEEE International Conference on Self-Adaptive and Self-Organizing Systems.