Cooperation and communication in multiagent deep reinforcement learning
[1] Richard L. Lewis,et al. Optimal rewards in multiagent teams , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[2] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[3] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[4] Sergey Levine,et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models , 2015, ArXiv.
[5] Sergey Levine,et al. Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[6] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[7] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[8] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[9] Peter Stone, Patrick Riley, Manuela Veloso. Defining and Using Ideal Teammate and Opponent Models , 2000 .
[10] Feng Wu,et al. Online planning for large MDPs with MAXQ decomposition , 2012, AAMAS.
[11] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[12] Victor R. Lesser,et al. Coordinating multi-agent reinforcement learning with limited communication , 2013, AAMAS.
[13] Sarit Kraus,et al. Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination , 2010, AAAI.
[14] Sean P. Meyn,et al. An analysis of reinforcement learning with function approximation , 2008, ICML.
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Peter Stone,et al. Learning Powerful Kicks on the Aibo ERS-7: The Quest for a Striker , 2010, RoboCup.
[17] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[18] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[19] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[21] Tucker R. Balch,et al. Distributed sensor fusion for object position estimation by multi-robot systems , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation.
[22] Astro Teller,et al. Evolving Team Darwin United , 1998, RoboCup.
[23] Sarit Kraus,et al. Learning Teammate Models for Ad Hoc Teamwork , 2012, AAMAS.
[24] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[25] Fei-Fei Li,et al. Visualizing and Understanding Recurrent Networks , 2015, ArXiv.
[26] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[27] Shimon Whiteson,et al. Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks , 2016, ArXiv.
[28] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[29] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[30] Peter Stone,et al. Intrinsically motivated model learning for developing curious robots , 2017, Artif. Intell.
[31] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[33] Samuel Barrett,et al. Making Friends on the Fly: Advances in Ad Hoc Teamwork , 2015, Studies in Computational Intelligence.
[34] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[35] Sergey Levine,et al. Learning Visual Feature Spaces for Robotic Manipulation with Deep Spatial Autoencoders , 2015, ArXiv.
[36] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[37] Patrick MacAlpine,et al. UT Austin Villa 2014: RoboCup 3D Simulation League Champion via Overlapping Layered Learning , 2015, AAAI.
[38] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[39] Risto Miikkulainen,et al. A Neuroevolution Approach to General Atari Game Playing , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[40] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[41] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition , 2012 . DOI: 10.1109/MSP.2012.2205597.
[42] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[43] Richard L. Lewis,et al. Optimal Rewards for Cooperative Agents , 2014, IEEE Transactions on Autonomous Mental Development.
[44] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[45] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[46] Risto Miikkulainen,et al. Efficient Reinforcement Learning Through Evolving Neural Network Topologies , 2002, GECCO.
[47] Manuela M. Veloso,et al. Layered Learning , 2000, ECML.
[48] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[49] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[50] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[51] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[52] Feng Wu,et al. Towards a Principled Solution to Simulated Robot Soccer , 2012, RoboCup.
[53] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[54] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[55] Peter Stone,et al. Cooperating with Unknown Teammates in Complex Domains: A Robot Soccer Case Study of Ad Hoc Teamwork , 2015, AAAI.
[56] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[57] Honglak Lee,et al. Control of Memory, Active Perception, and Action in Minecraft , 2016, ICML.
[58] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS.
[59] Richard L. Lewis,et al. Reward Design via Online Gradient Ascent , 2010, NIPS.
[60] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[61] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[62] Matthew Hausknecht, Peter Stone. On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning , 2016 .
[63] Bruno Castro da Silva,et al. Learning parameterized motor skills on a humanoid robot , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[64] Martin A. Riedmiller,et al. Reinforcement learning in feedback control , 2011, Machine Learning.
[65] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[66] Regina Barzilay,et al. Language Understanding for Text-based Games using Deep Reinforcement Learning , 2015, EMNLP.
[67] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res.
[68] Shimon Whiteson,et al. Concurrent layered learning , 2003, AAMAS.
[69] Peter Stone,et al. Source Task Creation for Curriculum Learning , 2016, AAMAS.
[70] Martín Abadi,et al. Learning to Protect Communications with Adversarial Neural Cryptography , 2016, ArXiv.
[71] Peter Stone,et al. Machine Learning for Fast Quadrupedal Locomotion , 2004, AAAI.
[72] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[73] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[74] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[75] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[76] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[77] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[78] Stefan Schaal,et al. Robot Learning From Demonstration , 1997, ICML.
[79] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst.
[80] Jelle R. Kok,et al. Mutual Modeling of Teammate Behavior , 2002 .
[81] Francisco S. Melo,et al. Q-Learning with Linear Function Approximation , 2007, COLT.
[82] Peter Stone,et al. Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study , 2006, RoboCup.
[83] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[84] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[85] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[86] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[87] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[88] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[89] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[90] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[91] Milind Tambe,et al. Towards Flexible Teamwork , 1997, J. Artif. Intell. Res.
[92] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[93] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[94] Richard L. Lewis,et al. Reward Mapping for Transfer in Long-Lived Agents , 2013, NIPS.
[95] Martin A. Riedmiller,et al. On Experiences in a Complex and Competitive Gaming Domain: Reinforcement Learning Meets RoboCup , 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[96] Shimon Whiteson,et al. Adaptive Representations for Reinforcement Learning , 2010, Studies in Computational Intelligence.
[97] Pucheng Zhou,et al. Multi-agent cooperation by reinforcement learning with teammate modeling and reward allotment , 2011, 2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD).
[98] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[99] Pravesh Ranchod,et al. Reinforcement Learning with Parameterized Actions , 2015, AAAI.
[100] Martin A. Riedmiller,et al. Autonomous reinforcement learning on raw visual input data in a real world application , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[101] Trevor Darrell,et al. End-to-end training of deep visuomotor policies , 2016 .
[102] Bram Bakker,et al. Reinforcement Learning with Long Short-Term Memory , 2001, NIPS.
[103] Markus Wulfmeier,et al. Deep Inverse Reinforcement Learning , 2015, ArXiv.