A Comparative Study of Rule Extraction for Recurrent Neural Networks

Understanding recurrent networks through rule extraction has a long history. It has recently attracted renewed interest due to the need to interpret and verify neural networks. One basic form for representing stateful rules is the deterministic finite automaton (DFA). Previous research has shown that extracting DFAs from trained second-order recurrent networks is not only possible but also relatively stable. Recently, several new types of recurrent networks with more complicated architectures have been introduced to handle challenging learning tasks, usually involving sequential data. However, it remains an open problem whether DFAs can be adequately extracted from these models. Specifically, it is not clear how DFA extraction is affected when applied to different recurrent networks trained on data sets with different levels of complexity. Here, we investigate DFA extraction on several widely adopted recurrent networks trained to learn the set of seven regular Tomita grammars. We first formally analyze the complexity of the Tomita grammars and categorize them accordingly. We then empirically evaluate the DFA extraction performance of the different recurrent networks on all Tomita grammars. Our experiments show that for most recurrent networks, extraction performance decreases as the complexity of the underlying grammar increases. On grammars of lower complexity, most recurrent networks achieve desirable extraction performance; on the grammars of highest complexity, several complicated models fail entirely, and only certain recurrent networks attain satisfactory extraction performance.
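
For reference, the seven Tomita grammars are standard benchmark regular languages over the binary alphabet {0, 1}. The following is a minimal Python sketch of their membership conditions as they are usually stated in the grammatical-inference literature; it is illustrative only and not code from the paper.

```python
import re

# Membership predicates for the seven Tomita grammars, following their
# standard definitions in the literature (an illustrative sketch).

def tomita_1(s):  # strings of the form 1*
    return "0" not in s

def tomita_2(s):  # strings of the form (10)*
    return re.fullmatch(r"(10)*", s) is not None

def tomita_3(s):
    # Forbid a maximal run of an odd number of 1s immediately followed
    # by a maximal run of an odd number of 0s.
    runs = re.findall(r"0+|1+", s)
    return not any(a[0] == "1" and len(a) % 2 == 1 and
                   b[0] == "0" and len(b) % 2 == 1
                   for a, b in zip(runs, runs[1:]))

def tomita_4(s):  # no substring of three consecutive 0s
    return "000" not in s

def tomita_5(s):  # even number of 0s and even number of 1s
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0

def tomita_6(s):  # (#0s - #1s) divisible by 3
    return (s.count("0") - s.count("1")) % 3 == 0

def tomita_7(s):  # strings of the form 0*1*0*1*
    return re.fullmatch(r"0*1*0*1*", s) is not None
```

Likewise, the quantization-style extraction referenced above (clustering a trained network's hidden states and reading transitions off the clusters) can be summarized in a short sketch. The `rnn.step` and `rnn.accepts` calls are hypothetical placeholders for a trained network's state update and output readout, and k-means is only one possible quantizer; this follows the general recipe of the second-order RNN extraction work rather than any specific implementation.

```python
from itertools import product
import numpy as np
from sklearn.cluster import KMeans

def extract_dfa(rnn, h0, alphabet="01", max_len=8, n_states=10):
    # 1. Run the trained network on all short strings, logging every
    #    hidden-state transition (h, symbol, h') that occurs.
    states, trace = [h0], []
    for length in range(1, max_len + 1):
        for word in product(alphabet, repeat=length):
            h = h0
            for a in word:
                h_next = rnn.step(h, a)   # hypothetical transition call
                trace.append((h, a, h_next))
                states.append(h_next)
                h = h_next
    # 2. Quantize the continuous hidden-state space into DFA states.
    km = KMeans(n_clusters=n_states).fit(np.array(states))
    q = lambda h: int(km.predict(np.array([h]))[0])
    # 3. Read transitions off the clusters; conflicting observations are
    #    resolved here by last write (real extractors are more careful).
    delta = {(q(h), a): q(h_next) for h, a, h_next in trace}
    # 4. A cluster is accepting if the network accepts from its members.
    accept = {q(h) for h in states if rnn.accepts(h)}  # hypothetical readout
    return q(h0), delta, accept
```

The extracted tuple (initial state, transition map, accepting set) can then be minimized with a standard DFA minimization algorithm and compared against the target grammar.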
