Deep Reinforcement Learning Variants of Multi-Agent Learning Algorithms
[1] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[2] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[3] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[4] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[5] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[6] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[7] Larry D. Pyeatt,et al. Decision Tree Function Approximation in Reinforcement Learning , 1999 .
[8] Martin A. Riedmiller. Concepts and Facilities of a Neural Reinforcement Learning Control Architecture for Technical Process Control , 1999, Neural Computing & Applications.
[9] Sebastian Thrun,et al. Coastal Navigation with Mobile Robots , 1999, NIPS.
[10] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[11] Milind Tambe,et al. Multiagent teamwork: analyzing the optimality and complexity of key theories and models , 2002, AAMAS '02.
[12] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[13] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[14] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[15] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[16] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[17] Nikos A. Vlassis,et al. Utile Coordination: Learning Interdependencies Among Cooperative Agents , 2005, CIG.
[18] Kagan Tumer,et al. Quicker Q-Learning in Multi-Agent Systems , 2005 .
[19] Geoffrey J. Gordon,et al. Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..
[20] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[21] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[22] Daniel Kudenko,et al. Reinforcement Learning of Coordination in Heterogeneous Cooperative Multi-agent Systems , 2005, Adaptive Agents and Multi-Agent Systems.
[23] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[24] Manuela M. Veloso,et al. Exploiting factored representations for decentralized execution in multiagent teams , 2007, AAMAS '07.
[25] Mirco Hering. Cooperative Multi-Agent Systems in Automobiles , 2008 .
[26] Peter Vrancx,et al. Learning multi-agent state space representations , 2010, AAMAS.
[27] Karl Tuyls,et al. Frequency adjusted multi-agent Q-learning , 2010, AAMAS.
[28] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[29] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[30] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[31] Gerhard Weiss,et al. Multiagent Learning: Basics, Challenges, and Prospects , 2012, AI Mag..
[32] Warren B. Powell,et al. An Intelligent Battery Controller Using Bias-Corrected Q-learning , 2012, AAAI.
[33] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[34] Alborz Geramifard,et al. A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning , 2013, Found. Trends Mach. Learn..
[35] Sherief Abdallah,et al. Addressing the policy-bias of q-learning by repeating updates , 2013, AAMAS.
[36] ImageNet Classification with Deep Convolutional Neural , 2013 .
[37] Minjie Zhang,et al. Multiagent Learning of Coordination in Loosely Coupled , 2014 .
[38] Jason Weston,et al. Question Answering with Subgraph Embeddings , 2014, EMNLP.
[39] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[40] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[41] Risto Miikkulainen,et al. A Neuroevolution Approach to General Atari Game Playing , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[42] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[43] Mikhail Pavlov,et al. Deep Attention Recurrent Q-Network , 2015, ArXiv.
[44] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[45] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[46] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[47] Benjamin Van Roy,et al. Generalization and Exploration via Randomized Value Functions , 2014, ICML.
[48] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[49] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[50] Sherief Abdallah,et al. Addressing Environment Non-Stationarity by Repeating Q-learning Updates , 2016, J. Mach. Learn. Res..
[51] Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
[52] Nando de Freitas,et al. Learning to Perform Physics Experiments via Deep Reinforcement Learning , 2016, ICLR.