[1] E. Cheney. Analysis for Applied Mathematics, 2001.
[2] Quanyan Zhu, et al. Dynamic policy-based IDS configuration, 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[3] Ah-Hwee Tan, et al. Integrating Temporal Difference Methods and Self-Organizing Neural Networks for Reinforcement Learning With Delayed Evaluative Feedback, 2008, IEEE Transactions on Neural Networks.
[4] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[5] Arslan Munir, et al. Adversarial Reinforcement Learning Framework for Benchmarking Collision Avoidance Mechanisms in Autonomous Vehicles, 2018, IEEE Intelligent Transportation Systems Magazine.
[6] D. Ernst, et al. Power systems stability control: reinforcement learning framework, 2004, IEEE Transactions on Power Systems.
[7] Quanyan Zhu, et al. A cyber-physical game framework for secure and resilient multi-agent autonomous systems, 2015, 2015 54th IEEE Conference on Decision and Control (CDC).
[8] Ming-Yu Liu, et al. Tactics of Adversarial Attack on Deep Reinforcement Learning Agents, 2017, IJCAI.
[9] Laurent Orseau, et al. Reinforcement Learning with a Corrupted Reward Channel, 2017, IJCAI.
[10] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[11] Quanyan Zhu, et al. Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes, 2019, GameSec.
[12] Xiaojin Zhu, et al. Policy Poisoning in Batch Reinforcement Learning and Control, 2019, NeurIPS.
[13] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[14] Arslan Munir, et al. The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning, 2018, ArXiv.
[15] Sean P. Meyn, et al. The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning, 2000, SIAM J. Control. Optim.
[16] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[17] E. Kreyszig. Introductory Functional Analysis With Applications, 1978.
[18] T. Urbanik, et al. Reinforcement learning-based multi-agent system for network traffic signal control, 2010.
[19] Chris Watkins, et al. Learning from delayed rewards, 1989.
[20] Quanyan Zhu, et al. Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals, 2019, GameSec.
[21] Bo Li, et al. Reinforcement Learning with Perturbed Rewards, 2018, AAAI.
[22] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[23] Balsman, et al. The Theorems of the Alternative, 1991.
[24] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS 1996.
[25] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[26] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint, 2008.