Radha Poovendran | Baicen Xiao | Bhaskar Ramasubramanian
[1] Tiejun Huang, et al. Graph Convolutional Reinforcement Learning, 2020, ICLR.
[2] Hayong Shin, et al. Multi-Agent Actor-Critic with Hierarchical Graph Attention Network, 2020, AAAI.
[3] Yung Yi, et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, 2019, ICML.
[4] Yuan Zhou, et al. Learning Guidance Rewards with Trajectory-space Smoothing, 2020, NeurIPS.
[5] Thomas Blaschke, et al. Molecular de-novo design through deep reinforcement learning, 2017, Journal of Cheminformatics.
[6] Yuk Ying Chung, et al. Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning, 2020, NeurIPS.
[7] Yujing Hu, et al. Multi-Agent Game Abstraction via Graph Attention Neural Network, 2019, AAAI.
[8] G. A. Young, et al. High-dimensional Statistics: A Non-asymptotic Viewpoint, by Martin J. Wainwright, Cambridge University Press, 2019, xvii + 552 pages, £57.99, hardback, ISBN 978-1-1084-9802-9, 2020, International Statistical Review.
[9] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[10] Kagan Tumer, et al. QUICR-Learning for Multi-Agent Coordination, 2006, AAAI.
[11] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[12] Zongqing Lu, et al. Learning Attentional Communication for Multi-Agent Cooperation, 2018, NeurIPS.
[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[14] Zhen Xiao, et al. Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG, 2018, AAMAS.
[15] Sidney Nascimento Givigi, et al. Policy Invariance under Reward Transformations for General-Sum Stochastic Games, 2011, J. Artif. Intell. Res.
[16] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.
[17] Yujing Hu, et al. Q-value Path Decomposition for Deep Multiagent Reinforcement Learning, 2020, ICML.
[18] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[19] Etienne Perot, et al. Deep Reinforcement Learning framework for Autonomous Driving, 2017, Autonomous Vehicles and Machines.
[20] Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.
[21] Alexander J. Smola, et al. Deep Sets, 2017, arXiv:1703.06114.
[22] Pedro M. Domingos. A Unified Bias-Variance Decomposition for Zero-One and Squared Loss, 2000, AAAI/IAAI.
[23] Frans A. Oliehoek, et al. A Concise Introduction to Decentralized POMDPs, 2016, SpringerBriefs in Intelligent Systems.
[24] Xi Chen, et al. Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning, 2019, arXiv.
[25] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[26] Ioannis Mitliagkas, et al. A Modern Take on the Bias-Variance Tradeoff in Neural Networks, 2018, arXiv.
[27] Guy Lever, et al. Value-Decomposition Networks for Cooperative Multi-Agent Learning Based on Team Reward, 2018, AAMAS.
[28] Dorian Kodelja, et al. Multiagent cooperation and competition with deep reinforcement learning, 2015, PLoS ONE.
[29] Sam Devlin, et al. An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems, 2011, Adv. Complex Syst.
[30] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[31] Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.
[32] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[33] Mikhail Belkin, et al. Reconciling modern machine-learning practice and the classical bias-variance trade-off, 2018, Proceedings of the National Academy of Sciences.
[34] Shimon Whiteson, et al. MAVEN: Multi-Agent Variational Exploration, 2019, NeurIPS.
[35] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[36] Katherine Rose Driggs-Campbell, et al. HG-DAgger: Interactive Imitation Learning with Human Experts, 2019, International Conference on Robotics and Automation (ICRA).
[37] Alexander G. Schwing, et al. PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning, 2019, CoRL.
[38] Sepp Hochreiter, et al. RUDDER: Return Decomposition for Delayed Rewards, 2018, NeurIPS.
[39] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[40] Fei Sha, et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning, 2018, ICML.
[41] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[42] Shimon Whiteson, et al. The StarCraft Multi-Agent Challenge, 2019, AAMAS.
[43] Bart De Schutter, et al. A Comprehensive Survey of Multiagent Reinforcement Learning, 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[44] Prabhat Nagarajan, et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations, 2019, ICML.