Cooperative Multi-Agent Reinforcement Learning for Multi-Component Robotic Systems: guidelines for future research
暂无分享,去创建一个
Manuel Graña | José Manuel López-Guede | Borja Fernández-Gauna | J. M. López-Guede | M. Graña | B. Fernández-Gauna
[1] Fengyu Zhou,et al. Mobile Robot Path Planning Based on Q-ANN , 2007, 2007 IEEE International Conference on Automation and Logistics.
[2] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[3] Ramón Moreno,et al. Experiments on Robotic Multi-agent System for Hose Deployment and Transportation , 2010, PAAMS.
[4] Ming Tan,et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.
[5] Jing Shen,et al. Multi-Agent Hierarchical Reinforcement Learning by Integrating Options into MAXQ , 2006, First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS'06).
[6] Sridhar Mahadevan,et al. Decision-Theoretic Planning with Concurrent Temporally Extended Actions , 2001, UAI.
[7] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[8] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[9] Carlos Sagüés,et al. Distributed consensus algorithms for merging feature-based maps with limited communication , 2011, Robotics Auton. Syst..
[10] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[11] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[12] Sridhar Mahadevan,et al. Learning to communicate and act using hierarchical reinforcement learning , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[13] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[14] Jingyuan Zhang,et al. Application of Artificial Neural Network Based on Q-learning for Mobile Robot Path Planning , 2006, 2006 IEEE International Conference on Information Acquisition.
[15] Randal W. Beard,et al. Distributed Consensus in Multi-vehicle Cooperative Control - Theory and Applications , 2007, Communications and Control Engineering.
[16] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[17] Norihiko Ono,et al. A Modular Approach to Multi-Agent Reinforcement Learning , 1996, ECAI Workshop LDAIS / ICMAS Workshop LIOME.
[18] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[19] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[20] Richard J. Duro,et al. On the potential contributions of hybrid intelligent approaches to Multicomponent Robotic System development , 2010, Inf. Sci..
[21] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[22] Robert Fitch,et al. Structural Abstraction Experiments in Reinforcement Learning , 2005, Australian Conference on Artificial Intelligence.
[23] Abdelhamid Mellouk,et al. Advances in Reinforcement Learning , 2011 .
[24] Manuela Veloso,et al. Scalable Learning in Stochastic Games , 2002 .
[25] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[26] Laurent Jeanpierre,et al. Distributed value functions for multi-robot exploration , 2012, 2012 IEEE International Conference on Robotics and Automation.
[27] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[28] Nikos A. Vlassis,et al. Anytime algorithms for multiagent decision making using coordination graphs , 2004, 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583).
[29] Daniel Kudenko,et al. Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.
[30] M. Asada,et al. Modular Learning Systems for Soccer Robot , 2004 .
[31] Sridhar Mahadevan,et al. Robot Learning , 1993 .
[32] Daphne Koller,et al. Computing Factored Value Functions for Policies in Structured MDPs , 1999, IJCAI.
[33] Manuel Graña,et al. Linked multi-component mobile robots: Modeling, simulation and control , 2010, Robotics Auton. Syst..
[34] Mark Humphrys,et al. Action Selection methods using Reinforcement Learning , 1996 .
[35] Andrew G. Barto,et al. Causal Graph Based Decomposition of Factored MDPs , 2006, J. Mach. Learn. Res..
[36] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[37] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[38] Alicia P. Wolfe,et al. Identifying useful subgoals in reinforcement learning by local graph partitioning , 2005, ICML.
[39] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[40] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[41] Robert Givan,et al. Equivalence notions and model minimization in Markov decision processes , 2003, Artif. Intell..
[42] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[43] Shie Mannor,et al. Dynamic abstraction in reinforcement learning via clustering , 2004, ICML.
[44] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[45] Jonas Karlsson,et al. Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging , 1993 .
[46] Prasad Tadepalli,et al. Scaling Up Average Reward Reinforcement Learning by Approximating the Domain Models and the Value Function , 1996, ICML.
[47] Ronald E. Parr,et al. Hierarchical control and learning for markov decision processes , 1998 .
[48] Yong Duan,et al. Fuzzy reinforcement learning and its application in robot navigation , 2005, 2005 International Conference on Machine Learning and Cybernetics.
[49] Bingrong Hong,et al. A Modular On-line Profit Sharing Approach in Multiagent Domains , 2008 .
[50] Michael O. Duff,et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems , 1994, NIPS.
[51] Steffen Udluft,et al. Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks , 2012, Neural Networks: Tricks of the Trade.
[52] Bruce L. Digney,et al. Learning hierarchical control structures for multiple tasks and changing environments , 1998 .
[53] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[54] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[55] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[56] Andrew G. Barto,et al. A causal approach to hierarchical decomposition of factored MDPs , 2005, ICML.
[57] Karl Tuyls,et al. An Overview of Cooperative and Competitive Multiagent Learning , 2005, LAMAS.
[58] Von-Wun Soo,et al. Subgoal Identification for Reinforcement Learning and Planning in Multiagent Problem Solving , 2007, MATES.
[59] Mark Humphreys,et al. Action selection methods using reinforcement learning , 1997 .
[60] Nikos A. Vlassis,et al. Sparse cooperative Q-learning , 2004, ICML.
[61] Von-Wun Soo,et al. AUTOMATIC COMPLEXITY REDUCTION IN REINFORCEMENT LEARNING , 2010, Comput. Intell..
[62] Matthew E. Taylor,et al. Abstraction and Generalization in Reinforcement Learning: A Summary and Framework , 2009, ALA.
[63] Nikos A. Vlassis,et al. Collaborative Multiagent Reinforcement Learning by Payoff Propagation , 2006, J. Mach. Learn. Res..
[64] Mitsuo Kawato,et al. Inter-module credit assignment in modular reinforcement learning , 2003, Neural Networks.
[65] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[66] Thomas G. Dietterich. An Overview of MAXQ Hierarchical Reinforcement Learning , 2000, SARA.
[67] Abhijit Gosavi,et al. Self-Improving Factory Simulation using Continuous-time Average-Reward Reinforcement Learning , 2007 .
[68] David W. Aha,et al. Instance-Based Learning Algorithms , 1991, Machine Learning.
[69] Andrew G. Barto,et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning , 2004, ICML.
[70] William D. Smart. Explicit Manifold Representations for Value-Function Approximation in Reinforcement Learning , 2004, ISAIM.
[71] Carlos Guestrin,et al. Multiagent Planning with Factored MDPs , 2001, NIPS.
[72] Hamid R. Berenji,et al. Fuzzy Reinforcement Learning and Dynamic Programming , 1993, Fuzzy Logic in Artificial Intelligence.
[73] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[74] H. R. Berenji,et al. Fuzzy Q-learning for generalization of reinforcement learning , 1996, Proceedings of IEEE 5th International Fuzzy Systems.
[75] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[76] Francisco S. Melo,et al. Coordinated learning in multiagent MDPs with infinite state-space , 2010, Autonomous Agents and Multi-Agent Systems.
[77] Peter Stone,et al. State Abstraction Discovery from Irrelevant State Variables , 2005, IJCAI.
[78] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[79] Von-Wun Soo,et al. Subgoal Identifications in Reinforcement Learning: A Survey , 2011 .