Verification of Recurrent Neural Networks Through Rule Extraction

The verification problem for neural networks is to determine whether a network is susceptible to adversarial samples, or to approximate the maximal scale of adversarial perturbation it can endure. While most prior work addresses the verification of feed-forward networks, little has been explored for recurrent networks. This is due to the more rigorous constraints on the perturbation space for sequential data and the lack of a proper metric for measuring perturbations. In this work, we address these challenges by proposing a metric that measures the distance between strings, and by using deterministic finite automata (DFA) as a rigorous oracle that examines whether generated adversarial samples violate the constraints placed on a perturbation. More specifically, we show empirically that certain recurrent networks allow relatively stable DFA extraction; DFAs extracted from these networks can therefore serve as a surrogate oracle when the ground-truth DFA is unknown. We apply our verification mechanism to several widely used recurrent networks on a set of the Tomita grammars. The results demonstrate that only a few models remain robust against adversarial samples. In addition, we show that grammars of different complexity also differ in how difficult they are to learn robustly.
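
The core of this mechanism can be sketched with a small, self-contained example. The snippet below is an illustration rather than the paper's implementation: it assumes the standard Levenshtein edit distance as the string-perturbation metric and a hand-written DFA for Tomita grammar 1 (the language 1*) as the ground-truth oracle; the names edit_distance, dfa_accepts, and is_valid_perturbation, as well as the perturbation-budget convention, are hypothetical.

```python
# Illustrative sketch only (not the authors' code): edit distance as the
# perturbation metric and a DFA oracle, here Tomita grammar 1 (the language 1*).

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, used as the perturbation metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Hand-written DFA for Tomita grammar 1: accept strings over {0, 1} containing no 0.
TOMITA_1 = {
    "start": "q0",
    "accept": {"q0"},
    "delta": {("q0", "1"): "q0", ("q0", "0"): "qr",
              ("qr", "0"): "qr", ("qr", "1"): "qr"},
}

def dfa_accepts(dfa, string: str) -> bool:
    """Run the DFA on the string and report whether it ends in an accepting state."""
    state = dfa["start"]
    for symbol in string:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accept"]

def is_valid_perturbation(dfa, x: str, x_adv: str, budget: int) -> bool:
    """Oracle check (one plausible reading of the abstract): a perturbed string is
    admissible if it stays within the edit-distance budget and its ground-truth
    label, given by the DFA, is unchanged; an RNN whose prediction flips on such
    a string has been shown non-robust."""
    return (edit_distance(x, x_adv) <= budget and
            dfa_accepts(dfa, x) == dfa_accepts(dfa, x_adv))

if __name__ == "__main__":
    print(is_valid_perturbation(TOMITA_1, "1111", "111", budget=1))   # True: label preserved
    print(is_valid_perturbation(TOMITA_1, "1111", "1011", budget=1))  # False: label flips
```

Under this reading, a perturbed string only counts as a candidate adversarial sample if it stays within the distance budget and keeps its DFA label; an extracted DFA would play the role of TOMITA_1 when the ground-truth automaton is unknown.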
