Synthesizing Context-free Grammars from Recurrent Neural Networks

We present an algorithm for extracting a subclass of the context-free grammars (CFGs) from a trained recurrent neural network (RNN). We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language. We present an algorithm for recovering the PRS behind a sequence of such automata, and apply it to sequences of automata extracted from trained RNNs using the L* algorithm. We then show how the PRS may be converted into a CFG, enabling a familiar and useful presentation of the learned language. Extracting the learned language of an RNN is important for understanding the RNN and for verifying its correctness. Furthermore, the extracted CFG can augment the RNN in classifying correct sentences, as the RNN's predictive accuracy decreases when the recursion depth and the distance between matching delimiters in its input sequences increase.
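The paper's own PRS-to-CFG construction is not reproduced here, but as a minimal sketch of what an extracted CFG enables — checking well-nestedness independently of the RNN, whose accuracy degrades with recursion depth — here is a standard CYK membership test over a hypothetical Chomsky-normal-form grammar for the Dyck-1 language of balanced parentheses (the grammar and all names below are illustrative, not taken from the paper):

```python
from itertools import product

# Toy CNF grammar for Dyck-1 (balanced parentheses), standing in for a
# grammar synthesized from an RNN. Nonterminals: S (start), T (helper),
# L and R (preterminals for the two delimiters).
TERMINALS = {"L": "(", "R": ")"}
BINARY = {
    ("L", "R"): {"S"},  # S -> L R   derives "()"
    ("L", "T"): {"S"},  # S -> L T   derives "(" S ")"
    ("S", "S"): {"S"},  # S -> S S   derives concatenation
    ("S", "R"): {"T"},  # T -> S R   helper for the closing delimiter
}

def cyk_accepts(word, start="S"):
    """CYK membership test: does the grammar derive `word`?"""
    n = len(word)
    if n == 0:
        return True  # treat the empty word as balanced
    # table[i][j] holds the nonterminals deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {nt for nt, t in TERMINALS.items() if t == ch}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for pair in product(left, right):
                    table[i][span - 1] |= BINARY.get(pair, set())
    return start in table[0][n - 1]
```

For example, `cyk_accepts("(())()")` succeeds while `cyk_accepts("(()")` fails, regardless of nesting depth — the kind of guarantee a depth-limited RNN classifier cannot give on its own.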

[1]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[3]  Henrik Jacobsson  Rule Extraction from Recurrent Neural Networks: A Taxonomy and Review, 2005.

[4]  C. Lee Giles,et al.  Extraction of rules from discrete-time recurrent neural networks , 1996, Neural Networks.

[5]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[6]  C. Lee Giles,et al.  The Neural Network Pushdown Automaton: Architecture, Dynamics and Training , 1997, Summer School on Neural Networks.

[7]  Robert C. Berwick,et al.  Evaluating the Ability of LSTMs to Learn Context-Free Grammars , 2018, BlackboxNLP@EMNLP.

[8]  Eran Yahav,et al.  Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples , 2017, ICML.

[9]  Surya Ganguli,et al.  RNNs Can Generate Bounded Hierarchical Languages with Optimal Memory , 2020, EMNLP.

[10]  Tameru Hailesilassie,et al.  Rule Extraction Algorithm for Deep Neural Networks: A Review , 2016, ArXiv.

[11]  Adelmo Luis Cechin,et al.  State automata extraction from recurrent neural nets using k-means and fuzzy clustering , 2003, 23rd International Conference of the Chilean Computer Science Society, 2003. SCCC 2003. Proceedings..

[12]  Amaury Habrard,et al.  A Polynomial Algorithm for the Inference of Context Free Languages , 2008, ICGI.

[13]  Ngoc Thang Vu,et al.  Learning the Dyck Language with Attention-based Seq2Seq Models , 2019, BlackboxNLP@ACL.

[14]  Patrizia Grifoni,et al.  A survey of grammatical inference methods for natural language learning , 2011, Artificial Intelligence Review.

[15]  Colin Giles,et al.  Learning Context-free Grammars: Capabilities and Limitations of a Recurrent Neural Network with an External Stack Memory , 1992 .

[16]  Stéphane Ayache,et al.  Explaining Black Boxes on Sequential Data using Weighted Automata , 2018, ICGI.

[17]  Dana Angluin,et al.  Learning Regular Sets from Queries and Counterexamples , 1987, Inf. Comput..

[18]  Dietrich Klakow,et al.  Closing Brackets with Recurrent Neural Networks , 2018, BlackboxNLP@EMNLP.

[19]  Samuel A. Korsky,et al.  On the Computational Power of RNNs , 2019, ArXiv.

[20]  Sebastian Thrun,et al.  Extracting Rules from Artificial Neural Networks with Distributed Representations , 1994, NIPS.

[21]  Dana Angluin,et al.  Inductive Inference of Formal Languages from Positive Data , 1980, Inf. Control..

[22]  Jean-Philippe Bernardy,et al.  Can Recurrent Neural Networks Learn Nested Recursion? , 2018, LILT.

[23]  E. Mark Gold,et al.  Language Identification in the Limit , 1967, Inf. Control..

[24]  Hava T. Siegelmann,et al.  On the Computational Power of Neural Nets , 1995, J. Comput. Syst. Sci..

[25]  Dexter Kozen  The Chomsky–Schützenberger Theorem , 1977 .

[26]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[27]  C. Lee Giles,et al.  Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata , 2019, ArXiv.

[28]  James R. Cordy,et al.  A survey of grammatical inference in software engineering , 2014, Sci. Comput. Program..

[29]  Alexander Clark,et al.  Polynomial Identification in the Limit of Substitutable Context-free Languages , 2005 .