Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) has numerous real-world applications thanks to its ability to adapt quickly to its surrounding environment. Despite these advantages, DRL is susceptible to adversarial attacks, which precludes its deployment in safety-critical systems and applications (e.g., smart grids, traffic control, and autonomous vehicles) unless its vulnerabilities are addressed and mitigated. This paper therefore provides a comprehensive survey of emerging attacks on DRL-based systems and of the potential countermeasures to defend against them. We first cover fundamental background on DRL and introduce emerging adversarial attacks on machine learning techniques. We then examine in detail the vulnerabilities that an adversary can exploit to attack DRL, along with state-of-the-art countermeasures to prevent such attacks. Finally, we highlight open issues and research challenges in developing defenses for DRL-based intelligent systems.
