Verifying Deep-RL-Driven Systems

Deep reinforcement learning (RL) has recently been successfully applied to networking contexts including routing, flow scheduling, congestion control, packet classification, cloud resource management, and video streaming. Deep-RL-driven systems automate decision making, and have been shown to outperform state-of-the-art handcrafted systems in important domains. However, the (typical) non-explainability of decisions induced by the deep learning machinery employed by these systems renders reasoning about crucial system properties, including correctness and security, extremely difficult. We show that despite the obscurity of decision making in these contexts, verifying that deep-RL-driven systems adhere to desired, designer-specified behavior is achievable. To this end, we initiate the study of formal verification of deep RL and present Verily, a system for verifying deep-RL-based systems that leverages recent advances in verification of deep neural networks. We employ Verily to verify recently introduced deep-RL-driven systems for adaptive video streaming, cloud resource management, and Internet congestion control. Our results expose scenarios in which deep-RL-driven decision making yields undesirable behavior. We discuss guidelines for building deep-RL-driven systems that are both safer and easier to verify.
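To make the verification idea concrete: since a deep-RL policy is ultimately a neural network mapping observations to actions, a designer-specified property ("on this set of inputs, the output never crosses this threshold") can be checked with DNN verification techniques. The sketch below is not Verily's method; it is a minimal, hypothetical illustration using interval bound propagation, a simple sound over-approximation, on a toy two-layer ReLU network with made-up weights.

```python
import numpy as np

def interval_forward(W_list, b_list, lo, hi):
    """Propagate an input box [lo, hi] through a ReLU network,
    returning sound (over-approximate) bounds on the output."""
    for i, (W, b) in enumerate(zip(W_list, b_list)):
        Wp = np.maximum(W, 0.0)  # positive part of the weight matrix
        Wn = np.minimum(W, 0.0)  # negative part of the weight matrix
        # Lower bound uses lo through positive weights, hi through negative.
        new_lo = Wp @ lo + Wn @ hi + b
        new_hi = Wp @ hi + Wn @ lo + b
        if i < len(W_list) - 1:  # ReLU on hidden layers only
            new_lo = np.maximum(new_lo, 0.0)
            new_hi = np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi

# Toy "policy" network: 2 inputs -> 3 hidden units -> 1 output.
W1 = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
b1 = np.zeros(3)
W2 = np.array([[1.0, 1.0, 1.0]])
b2 = np.array([0.0])

lo, hi = interval_forward([W1, W2], [b1, b2],
                          np.array([0.0, 0.0]), np.array([1.0, 1.0]))
# If hi is below a designer-chosen threshold, the property
# "the output never exceeds the threshold on this input region" is proved;
# if not, the over-approximation is inconclusive and a complete solver
# (e.g. an SMT-based one) would be needed to decide the query.
print(lo, hi)
```

Interval propagation is fast but incomplete: a loose upper bound does not mean a real violation exists. Complete approaches such as SMT-based solvers resolve exactly those inconclusive cases, at higher computational cost.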