[1] S. Shankar Sastry, et al. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning, 2017, ArXiv.
[2] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[3] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[4] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[5] Geoffrey E. Hinton, et al. Reinforcement Learning with Factored States and Actions, 2004, J. Mach. Learn. Res.
[6] Qingfeng Lan, et al. Maxmin Q-learning: Controlling the Estimation Bias of Q-learning, 2020, ICLR.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Guy Lever, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, 2018, AAMAS.
[9] Yan Zheng, et al. Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments, 2018, PRICAI.
[10] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[11] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[12] A. Cardoso, et al. Modeling Forms of Surprise in Artificial Agents: Empirical and Theoretical Study of Surprise Functions, 2004.
[13] Tom Schaul, et al. StarCraft II: A New Challenge for Reinforcement Learning, 2017, ArXiv.
[14] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[15] Yuhong Yang, et al. Information Theory, Inference, and Learning Algorithms, 2005.
[16] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[17] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[18] Drew Wicke, et al. Multiagent Soft Q-Learning, 2018, AAAI Spring Symposia.
[19] E. M. Atkins, et al. A survey of consensus problems in multi-agent coordination, 2005, Proceedings of the 2005 American Control Conference.
[20] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[21] Christopher Amato, et al. Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning, 2018, AAMAS.
[22] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[23] Bart De Schutter, et al. Multi-Agent Reinforcement Learning: A Survey, 2006, 9th International Conference on Control, Automation, Robotics and Vision.
[24] Koray Kavukcuoglu, et al. PGQ: Combining policy gradient and Q-learning, 2016, ArXiv.
[25] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[26] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[27] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[28] Shimon Whiteson, et al. Weighted QMIX: Expanding Monotonic Value Function Factorisation, 2020, ArXiv.
[29] Kavosh Asadi, et al. An Alternative Softmax Operator for Reinforcement Learning, 2016, ICML.
[30] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[31] Alexei A. Efros, et al. Large-Scale Study of Curiosity-Driven Learning, 2018, ICLR.
[32] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[33] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[34] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[35] Sergey Levine, et al. SMiRL: Surprise Minimizing RL in Entropic Environments, 2019.
[36] A. Cardoso, et al. The Role of Surprise, Curiosity and Hunger on Exploration of Unknown Environments Populated with Entities, 2005, Portuguese Conference on Artificial Intelligence.
[37] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[38] David J. C. MacKay, et al. Information Theory, Inference, and Learning Algorithms, 2004, IEEE Transactions on Information Theory.
[39] Jun Wang, et al. Multi-Agent Reinforcement Learning, 2020, Deep Reinforcement Learning.
[40] Sergey Levine, et al. Efficient Exploration via State Marginal Matching, 2019, ArXiv.
[41] Masashi Sugiyama, et al. Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics, 2019, ArXiv.
[42] Marc'Aurelio Ranzato, et al. Energy-Based Models in Document Recognition and Computer Vision, 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).
[43] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[44] Koray Kavukcuoglu, et al. Combining policy gradient and Q-learning, 2016, ICLR.
[45] Saeid Nahavandi, et al. Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications, 2018, IEEE Transactions on Cybernetics.
[46] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[47] Bikramjit Banerjee, et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning, 2016, Neurocomputing.
[48] Haitham Bou-Ammar, et al. Balancing Two-Player Stochastic Games with Soft Q-Learning, 2018, IJCAI.
[49] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[50] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[51] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[52] Fu Jie Huang, et al. A Tutorial on Energy-Based Learning, 2006.
[53] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[54] John Langford, et al. Efficient Exploration in Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.
[55] Jerry Zikun Chen. Reinforcement Learning Generalization with Surprise Minimization, 2020, ArXiv.
[56] Yee Whye Teh, et al. Energy-Based Models for Sparse Overcomplete Representations, 2003, J. Mach. Learn. Res.
[57] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[58] J. Nash. Equilibrium Points in N-Person Games, 1950, Proceedings of the National Academy of Sciences of the United States of America.
[59] Sergey Levine, et al. Latent Space Policies for Hierarchical Reinforcement Learning, 2018, ICML.
[60] Bart De Schutter, et al. Multi-agent Reinforcement Learning: An Overview, 2010.
[61] Shimon Whiteson, et al. The StarCraft Multi-Agent Challenge, 2019, AAMAS.
[62] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.
[63] Tuomas Haarnoja, et al. Acquiring Diverse Robot Skills via Maximum Entropy Deep Reinforcement Learning, 2018.
[64] Manuela M. Veloso, et al. Multiagent Systems: A Survey from a Machine Learning Perspective, 2000, Auton. Robots.
[65] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.