Jonathan P. How | Christopher Amato | Jason Pazis | Shayegan Omidshafiei | John Vian
[1] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[2] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[3] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[4] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[5] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.
[6] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[7] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[8] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.
[9] Kee-Eung Kim, et al. Learning to Cooperate via Policy Search, 2000, UAI.
[10] Martin Lauer, et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems, 2000, ICML.
[11] Olivier Buffet, et al. Multi-Agent Systems by Incremental Gradient Reinforcement Learning, 2001, IJCAI.
[12] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[13] Daniel Kudenko, et al. Reinforcement learning of coordination in cooperative multi-agent systems, 2002, AAAI/IAAI.
[14] Masayuki Yamamura, et al. Multitask reinforcement learning on the distribution of MDPs, 2003, Proceedings 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation.
[15] Fumihide Tanaka, et al. Multitask Reinforcement Learning on the Distribution of MDPs, 2003.
[16] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[17] Manuela M. Veloso,et al. Probabilistic policy reuse in a reinforcement learning agent , 2006, AAMAS '06.
[18] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[19] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[20] Dan Ventura,et al. Predicting and Preventing Coordination Problems in Cooperative Q-learning Systems , 2007, IJCAI.
[21] Guillaume J. Laurent, et al. Hysteretic Q-learning: an algorithm for decentralized reinforcement learning in cooperative multi-agent teams, 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Nikos A. Vlassis, et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs, 2008, J. Artif. Intell. Res.
[23] Alan Fern, et al. Learning and transferring roles in multi-agent MDPs, 2008, AAAI 2008.
[24] Alan Fern, et al. Learning and Transferring Roles in Multi-Agent Reinforcement, 2008.
[25] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[26] Shlomo Zilberstein, et al. Incremental Policy Generation for Finite-Horizon DEC-POMDPs, 2009, ICAPS.
[27] Feng Wu, et al. Rollout Sampling Policy Iteration for Decentralized POMDPs, 2010, UAI.
[28] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.
[29] Bart De Schutter, et al. Multi-agent Reinforcement Learning: An Overview, 2010.
[30] Lakhmi C. Jain, et al. Innovations in Multi-Agent Systems and Applications - 1, 2010.
[31] N. Le Fort-Piat, et al. The world of independent learners is not Markovian, 2011, Int. J. Knowl. Based Intell. Eng. Syst.
[32] Bikramjit Banerjee, et al. Sample Bounded Distributed Reinforcement Learning for Decentralized POMDPs, 2012, AAAI.
[33] Wei Zhang, et al. Multiagent-Based Reinforcement Learning for Optimal Reactive Power Dispatch, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[34] Yoshua Bengio, et al. Practical Recommendations for Gradient-Based Training of Deep Architectures, 2012, Neural Networks: Tricks of the Trade.
[35] Guillaume J. Laurent, et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems, 2012, The Knowledge Engineering Review.
[36] Feng Wu, et al. Monte-Carlo Expectation Maximization for Decentralized POMDPs, 2013, IJCAI.
[37] Siobhán Clarke, et al. Transfer learning in multi-agent systems through parallel transfer, 2013.
[38] Lihong Li, et al. Sample Complexity of Multi-task Reinforcement Learning, 2013, UAI.
[39] Panagiotis Tzionas, et al. A robust approach for multi-agent natural resource allocation based on stochastic optimization algorithms, 2014, Appl. Soft Comput.
[40] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[41] Jonathan P. How, et al. Stick-Breaking Policy Learning in Dec-POMDPs, 2015, IJCAI.
[42] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[43] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[44] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[45] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[46] Shimon Whiteson, et al. Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, 2016, ArXiv.
[47] Frans A. Oliehoek, et al. A Concise Introduction to Decentralized POMDPs, 2016, SpringerBriefs in Intelligent Systems.
[48] Jonathan P. How, et al. Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments, 2016, AAAI.
[49] Shimon Whiteson, et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning, 2017, ICML.