暂无分享,去创建一个
[1] Stefano Ermon,et al. Multi-Agent Generative Adversarial Imitation Learning , 2018, NeurIPS.
[2] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[3] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[4] Junwei Lu,et al. Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation , 2020, NeurIPS.
[5] Victor Talpaert,et al. Deep Reinforcement Learning for Autonomous Driving: A Survey , 2020, IEEE Transactions on Intelligent Transportation Systems.
[6] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[7] Sonia Chernova,et al. Integrating reinforcement learning with human demonstrations of varying ability , 2011, AAMAS.
[8] Julian Togelius,et al. A hybrid search agent in pommerman , 2018, FDG.
[9] Robert E. Schapire,et al. A Reduction from Apprenticeship Learning to Classification , 2010, NIPS.
[10] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[11] Prashant Doshi,et al. Multi-robot inverse reinforcement learning under occlusion with interactions , 2014, AAMAS.
[12] Sime Curkovic,et al. Sustainable Development : Authoritative and Leading Edge Content for Environmental Management , 2012 .
[13] Sarit Kraus,et al. Making friends on the fly: Cooperating with new teammates , 2017, Artif. Intell..
[14] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[15] Eder Santana,et al. Learning a Driving Simulator , 2016, ArXiv.
[16] Jürgen Schmidhuber,et al. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots , 2016, IEEE Robotics and Automation Letters.
[17] Ofir Marom,et al. Belief Reward Shaping in Reinforcement Learning , 2018, AAAI.
[18] J. Neumann. Zur Theorie der Gesellschaftsspiele , 1928 .
[19] Vincent Conitzer,et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.
[20] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[21] Miguel G. Cruz,et al. Assessing crown fire potential in coniferous forests of western North America: a critique of current approaches and recent simulation studies. , 2010 .
[22] Rob Fergus,et al. Modeling Others using Oneself in Multi-Agent Reinforcement Learning , 2018, ICML.
[23] Stefan Schaal,et al. Robot Learning From Demonstration , 1997, ICML.
[24] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[25] Andrew Y. Ng,et al. Pharmacokinetics of a novel formulation of ivermectin after administration to goats , 2000, ICML.
[26] Nir Levine,et al. Challenges of real-world reinforcement learning: definitions, benchmarks and analysis , 2021, Machine Learning.
[27] Stephen C. Adams,et al. Multi-agent Inverse Reinforcement Learning for Certain General-sum Stochastic Games , 2019, J. Artif. Intell. Res..
[28] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[29] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[30] Diogo Carvalho,et al. A new convergent variant of Q-learning with linear function approximation , 2020, NeurIPS.
[31] A. M. Fink,et al. Equilibrium in a stochastic $n$-person game , 1964 .
[32] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[33] Eder Santana,et al. Exploring the Limitations of Behavior Cloning for Autonomous Driving , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[34] Yuan Zhou,et al. Learning Guidance Rewards with Trajectory-space Smoothing , 2020, NeurIPS.
[35] Sergey Levine,et al. Nonlinear Inverse Reinforcement Learning with Gaussian Processes , 2011, NIPS.
[36] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[37] Zhang-Wei Hong,et al. A Deep Policy Inference Q-Network for Multi-Agent Systems , 2017, AAMAS.
[38] Nando de Freitas,et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills , 2018, Robotics: Science and Systems.
[39] Crystal S. Stonesifer,et al. Wildfire Response Performance Measurement: Current and Future Directions , 2018, Fire.
[40] Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
[41] Canan Eryiğit. Marketing Models: A Review of the Literature , 2017 .
[42] Garrison W. Cottrell,et al. Principled Methods for Advising Reinforcement Learning Agents , 2003, ICML.
[43] Siyuan Liu,et al. Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise , 2014, AAAI.
[44] H.H.T. Liu,et al. A cooperative UAV/UGV platform for wildfire detection and fighting , 2008, 2008 Asia Simulation Conference - 7th International Conference on System Simulation and Scientific Computing.
[45] Jordan L. Boyd-Graber,et al. Opponent Modeling in Deep Reinforcement Learning , 2016, ICML.
[46] Stefan Schaal,et al. Learning from Demonstration , 1996, NIPS.
[47] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[48] Junliang Xing,et al. Hybrid Learning for Multi-agent Cooperation with Sub-optimal Demonstrations , 2020, IJCAI.
[49] Alexey Dosovitskiy,et al. End-to-End Driving Via Conditional Imitation Learning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[50] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[51] D. Roberts,et al. Evaluating the Ability of FARSITE to Simulate Wildfires Influenced by Extreme, Downslope Winds in Santa Barbara, California , 2020, Fire.
[52] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[53] Felipe Leno da Silva,et al. Simultaneously Learning and Advising in Multiagent Reinforcement Learning , 2017, AAMAS.
[54] R. Rothermel. A Mathematical Model for Predicting Fire Spread in Wildland Fuels , 2017 .
[55] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[56] Mykel J. Kochenderfer,et al. Imitating driver behavior with generative adversarial networks , 2017, 2017 IEEE Intelligent Vehicles Symposium (IV).
[57] Stefan Schaal,et al. Reinforcement learning of motor skills in high dimensions: A path integral approach , 2010, 2010 IEEE International Conference on Robotics and Automation.
[58] Marek Petrik,et al. Robust Maximum Entropy Behavior Cloning , 2021, ArXiv.
[59] Jan Peters,et al. Relative Entropy Inverse Reinforcement Learning , 2011, AISTATS.
[60] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[61] Yu Wei,et al. Risk Management and Analytics in Wildfire Response , 2019, Current Forestry Reports.
[62] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[63] Peter A. Beling,et al. Multi-agent Inverse Reinforcement Learning for Zero-sum Games , 2014, ArXiv.
[64] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[65] S. G. Ponnambalam,et al. Reinforcement learning: exploration–exploitation dilemma in multi-agent foraging task , 2012 .
[66] Kevin Waugh,et al. Computational Rationalization: The Inverse Equilibrium Problem , 2011, ICML.
[67] Claudio Gentile,et al. On the generalization ability of on-line learning algorithms , 2001, IEEE Transactions on Information Theory.
[68] Philip S. Yu,et al. Differential Advising in Multi-Agent Reinforcement Learning , 2020, ArXiv.
[69] L. Shapley,et al. Stochastic Games* , 1953, Proceedings of the National Academy of Sciences.
[70] Markus Wulfmeier,et al. Maximum Entropy Deep Inverse Reinforcement Learning , 2015, 1507.04888.
[71] Tamer Basar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[72] Yisong Yue,et al. Coordinated Multi-Agent Imitation Learning , 2017, ICML.
[73] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .
[74] J. San-Miguel-Ayanz,et al. Use of Remote Sensing in Wildfire Management , 2012 .
[75] Sergey Levine,et al. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models , 2016, ArXiv.
[76] Julian Togelius,et al. Pommerman: A Multi-Agent Playground , 2018, AIIDE Workshops.
[77] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[78] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[79] M. Finney. FARSITE : Fire Area Simulator : model development and evaluation , 1998 .
[80] Matthew E. Taylor,et al. A conceptual framework for externally-influenced agents: an assisted reinforcement learning review , 2020, Journal of Ambient Intelligence and Humanized Computing.
[81] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[82] Ananth Hari,et al. PettingZoo: Gym for Multi-Agent Reinforcement Learning , 2020, 2009.14471.
[83] V Nikitin,et al. Development of a robotic vehicle complex for wildfire-fighting by means of fire-protection roll screens , 2019 .
[84] Vinicius G. Goecks,et al. Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments , 2020, AAMAS.
[85] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[86] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[87] Yuichi Yoshida,et al. Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.
[88] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[89] Xingyu Wang,et al. Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations , 2018, ICML.
[90] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res..
[91] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[92] Sergey Levine,et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning , 2017, ICLR 2017.
[93] Jonathan P. How,et al. Learning to Teach in Cooperative Multiagent Reinforcement Learning , 2018, AAAI.
[94] Sham M. Kakade,et al. Mind the Duality Gap: Logarithmic regret algorithms for online optimization , 2008, NIPS.
[95] Matthew E. Taylor,et al. Improving Reinforcement Learning with Confidence-Based Demonstrations , 2017, IJCAI.
[96] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[97] Radha Poovendran,et al. Shaping Advice in Deep Multi-Agent Reinforcement Learning , 2021, ArXiv.
[98] Gergely V. Záruba,et al. Inverse reinforcement learning for decentralized non-cooperative multiagent systems , 2012, 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[99] Chao Gao,et al. On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman , 2019, AIIDE.
[100] Anca D. Dragan,et al. DART: Noise Injection for Robust Imitation Learning , 2017, CoRL.
[102] Harshad Khadilkar,et al. Accelerating Training in Pommerman with Imitation and Reinforcement Learning , 2019, ArXiv.
[103] G. DeJong,et al. Theory and Application of Reward Shaping in Reinforcement Learning , 2004 .
[104] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[105] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[106] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[107] Mark Crowley,et al. A review of machine learning applications in wildfire science and management , 2020, Environmental Reviews.
[108] Mao Li,et al. Two-level Q-learning: learning from conflict demonstrations , 2019, The Knowledge Engineering Review.
[109] Fermín J. Alcasena,et al. Evaluating fire modelling systems in recent wildfires of the Golestan National Park, Iran , 2016 .
[110] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[111] Lantao Yu,et al. Multi-Agent Adversarial Inverse Reinforcement Learning , 2019, ICML.
[112] Xinli Cai,et al. Wildfire management in Canada: Review, challenges and opportunities , 2020, Progress in Disaster Science.
[113] Manuela Veloso,et al. Reinforcement Learning for Market Making in a Multi-agent Dealer Market , 2019, ArXiv.
[114] Felipe Leno da Silva,et al. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems , 2019, J. Artif. Intell. Res..
[115] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[116] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[117] Gang Pan,et al. Knowledge-Guided Agent-Tactic-Aware Learning for StarCraft Micromanagement , 2018, IJCAI.
[118] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[119] Matthieu Geist,et al. Boosted Bellman Residual Minimization Handling Expert Demonstrations , 2014, ECML/PKDD.
[120] Alessandro Lazaric,et al. Direct Policy Iteration with Demonstrations , 2015, IJCAI.
[121] Wenbing Huang,et al. Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance , 2019, AAAI.
[122] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[123] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[124] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 2015 .
[125] Eric van Damme,et al. Non-Cooperative Games , 2000 .
[126] Sam Devlin,et al. An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems , 2011, Adv. Complex Syst..