CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

As reinforcement learning (RL) has achieved great success and has even been adopted in safety-critical domains such as autonomous vehicles, a range of empirical studies have been conducted to improve its robustness against adversarial attacks. However, certifying its robustness with theoretical guarantees remains challenging. In this paper, we present CROP (Certifying Robust Policies for RL), the first unified framework to provide robustness certification at both the action and the reward level. In particular, we propose two robustness certification criteria: robustness of per-state actions and a lower bound on the cumulative reward. We then develop a local smoothing algorithm for policies derived from Q-functions to guarantee the robustness of actions taken along the trajectory; we also develop a global smoothing algorithm for certifying a lower bound on the finite-horizon cumulative reward, as well as a novel local smoothing algorithm that performs adaptive search to obtain tighter reward certification. Empirically, we apply CROP to evaluate several existing empirically robust RL algorithms, including adversarial training and different robust regularization methods, in four environments (two representative Atari games, Highway, and CartPole). Furthermore, by evaluating these algorithms against adversarial attacks, we demonstrate that our certifications are often tight. All experimental results are available at https://crop-leaderboard.github.io.
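
To give a rough sense of the local smoothing idea described above, the sketch below estimates a smoothed Q-function by averaging Q-values over Gaussian-perturbed copies of the state observation and acting greedily with respect to the averaged values. This is a minimal illustration, not the paper's implementation: the network interface, the noise level `sigma`, and the sample count `m` are assumptions for illustration, and the actual CROP certification further bounds the gap between the top two smoothed Q-values to certify the chosen action.

```python
import torch

def smoothed_action(q_network, state, sigma=0.01, m=100):
    """Greedy action under a locally smoothed Q-function (illustrative sketch).

    Averages Q-values over m Gaussian perturbations (std sigma) of the state
    observation, then returns the greedy action w.r.t. the averaged values.
    """
    with torch.no_grad():
        noise = torch.randn(m, *state.shape) * sigma        # m perturbed copies of the observation
        q_values = q_network(state.unsqueeze(0) + noise)    # shape (m, num_actions)
        smoothed_q = q_values.mean(dim=0)                   # Monte Carlo estimate of the smoothed Q-values
    return int(smoothed_q.argmax())
```

Under this kind of smoothing, a margin between the largest and second-largest smoothed Q-values can be translated into a certified radius of input perturbations within which the greedy action does not change, which is the per-state action certification criterion referred to above.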
