Karl Tuyls | Michael Kaisers | Daan Bloembergen | Richard Klíma
[1] Roderic A. Grupen,et al. Robust Reinforcement Learning in Motion Planning , 1993, NIPS.
[2] Flaviu Cristian,et al. Fault-tolerance in air traffic control systems , 1996, TOCS.
[3] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[4] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[5] H. Robbins. A Stochastic Approximation Method , 1951 .
[6] John C. Knight,et al. Safety critical systems: challenges and directions , 2002, Proceedings of the 24th International Conference on Software Engineering. ICSE 2002.
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] Zhang-Wei Hong,et al. A Deep Policy Inference Q-Network for Multi-Agent Systems , 2017, AAMAS.
[9] Richard S. Sutton,et al. Multi-step Reinforcement Learning: A Unifying Algorithm , 2017, AAAI.
[10] J. Doyle,et al. Essentials of Robust Control , 1997 .
[11] Yevgeniy Vorobeychik,et al. Multidefender Security Games , 2015, IEEE Intelligent Systems.
[12] Karl Tuyls,et al. Markov Security Games : Learning in Spatial Security Problems , 2016 .
[13] Martin L. Shooman,et al. Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design , 2002 .
[14] Shimon Whiteson,et al. A theoretical and empirical analysis of Expected Sarsa , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[15] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[16] Vincent Conitzer,et al. Stackelberg vs. Nash in Security Games: An Extended Investigation of Interchangeability, Equivalence, and Uniqueness , 2011, J. Artif. Intell. Res.
[17] Sui Ruan,et al. Patrolling in a Stochastic Environment , 2005 .
[18] E. Zeidler,et al. Fixed-point theorems , 1986 .
[19] John N. Tsitsiklis,et al. Asynchronous stochastic approximation and Q-learning , 1994, Mach. Learn.
[20] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res.
[21] Bo An,et al. PROTECT: a deployed game theoretic system to protect the ports of the United States , 2012, AAMAS.
[22] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[23] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[24] Sarit Kraus,et al. Deployed ARMOR protection: the application of a game theoretic model for security at the Los Angeles International Airport , 2008, AAMAS 2008.
[25] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[26] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[27] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[28] Csaba Szepesvári,et al. Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms , 1996 .
[29] Chris Gaskett,et al. Reinforcement learning under circumstances beyond its control , 2003 .
[30] Yang Xiao,et al. Cyber Security and Privacy Issues in Smart Grids , 2012, IEEE Communications Surveys & Tutorials.
[31] A. Dvoretzky. On Stochastic Approximation , 1956 .
[32] Hamid Sharif,et al. A Survey on Smart Grid Communication Infrastructures: Motivations, Requirements and Challenges , 2013, IEEE Communications Surveys & Tutorials.
[33] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[34] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[35] Shimon Whiteson,et al. OFFER: Off-Environment Reinforcement Learning , 2017, AAAI.