Symbolic Representation of Recurrent Neural Network Dynamics

Simple recurrent networks trained with error backpropagation have been widely used to learn temporal sequence data, including regular and context-free languages. However, the large and opaque weight matrices produced during learning have inspired substantial research on how to extract symbolic, human-readable interpretations from trained networks. Unlike feedforward networks, where research has focused mainly on rule extraction, most past work with recurrent networks has viewed them as dynamical systems that can be approximated symbolically by finite-state machines (FSMs). With this approach, the network's hidden-layer activation space is typically divided into a finite number of regions, and past research has mainly focused on better techniques for performing this division. In contrast, very little work has tried to influence the network training process itself to produce a better representation in hidden-layer activation space, and the work that has been done has had only limited success. Here we propose a general technique to bias error backpropagation training so that it learns an activation-space representation from which it is easier to extract FSMs. Using four publicly available data sets based on regular and context-free languages, we show via computational experiments that the modified learning method yields FSMs with substantially fewer states and lower variance than unmodified backpropagation, without decreasing the neural networks' accuracy. We conclude that modifying error backpropagation so that it more effectively separates learned pattern encodings in the hidden layer is an effective way to improve contemporary FSM extraction methods.
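
As a concrete illustration of the extraction pipeline summarized above (partition the hidden activation space into a finite number of regions, then read an FSM off the transitions between regions), the following Python sketch runs a small Elman-style recurrent network over symbol strings, clusters the visited hidden states with k-means, and builds a transition table by majority vote. This is a minimal sketch of the generic extraction approach, not the paper's specific method or its modified training procedure; the untrained random weights, the binary alphabet, the network and cluster sizes, and the use of k-means are all illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): FSM extraction from a simple
# recurrent network by clustering its hidden activation space.
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_hidden, n_symbols, n_clusters = 8, 2, 4   # assumed sizes for illustration

# Random Elman-style weights (input->hidden, hidden->hidden); these stand in
# for a network trained with error backpropagation.
W_in = rng.normal(scale=0.5, size=(n_hidden, n_symbols))
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))

def run(sequence):
    """Return the hidden states visited while reading a symbol sequence."""
    h = np.zeros(n_hidden)
    states = []
    for s in sequence:
        x = np.eye(n_symbols)[s]                 # one-hot input
        h = np.tanh(W_in @ x + W_rec @ h)        # Elman-style update
        states.append(h.copy())
    return states

# Collect (previous hidden state, symbol, next hidden state) triples from a
# sample of random binary strings.
triples, all_states = [], []
for _ in range(200):
    seq = rng.integers(0, n_symbols, size=10).tolist()
    h_prev = np.zeros(n_hidden)
    for s, h_next in zip(seq, run(seq)):
        triples.append((h_prev, s, h_next))
        all_states.append(h_next)
        h_prev = h_next

# Partition the hidden activation space into a finite number of regions.
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(np.array(all_states))

# Read off FSM transitions (region, symbol) -> region by majority vote.
votes = {}
for h_prev, s, h_next in triples:
    q = int(km.predict(h_prev.reshape(1, -1))[0])
    q_next = int(km.predict(h_next.reshape(1, -1))[0])
    votes.setdefault((q, s), Counter())[q_next] += 1

fsm = {key: counter.most_common(1)[0][0] for key, counter in votes.items()}
print("Extracted transition table (state, symbol) -> state:")
for (q, s), q_next in sorted(fsm.items()):
    print(f"  ({q}, {s}) -> {q_next}")
```

In this framing, the training modification studied in the paper would change how the hidden states in `all_states` are distributed (making them easier to separate into regions), while the clustering and transition-table steps remain essentially the same.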
