Sequence Processing with Recurrent Neural Networks

Sequence processing involves tasks such as clustering, classification, prediction, and transduction of sequential data, which may be symbolic, non-symbolic, or mixed. Symbolic data patterns occur, for example, in modelling natural (human) language, while predicting the water level of the River Thames is an example of processing non-symbolic data. If the content of a sequence varies across time steps, the sequence is called temporal, or a time series. In general, a temporal sequence consists of nominal symbols from a particular alphabet, while a time-series sequence deals with continuous, real-valued elements (Antunes & Oliveira, 2001). Processing either kind of sequence mainly consists of applying currently known patterns to produce or predict future ones; a major difficulty is that the range of data dependencies is usually unknown. An intelligent system with a memorising capability is therefore crucial for effective sequence processing and modelling.

A recurrent neural network (RNN) is an artificial neural network in which self-loops and backward connections between nodes are allowed (Lin & Lee, 1996; Schalkoff, 1997). Compared to feedforward neural networks, RNNs are well known for their ability to memorise time dependencies and to model nonlinear systems. RNNs can be trained from examples to map input sequences to output sequences, and in principle they can implement any kind of sequential behaviour. They are biologically more plausible and computationally more powerful than other modelling approaches such as Hidden Markov Models (HMMs), which have discrete rather than continuous internal states, and feedforward neural networks and Support Vector Machines (SVMs), which have no internal states at all.

In this article, we review RNN architectures and discuss the challenges involved in training RNNs for sequence processing. We provide a review of learning algorithms for RNNs and discuss future trends in this area.
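To make the recurrence concrete, the following is a minimal sketch (the article itself presents no code) of an Elman-style RNN forward pass in NumPy: the hidden state is fed back into the network at each step, which is what gives it a memory of past inputs. All variable names and dimensions here are illustrative assumptions, not taken from the article.

```python
import numpy as np

def rnn_forward(inputs, Wxh, Whh, Why, bh, by):
    """Map an input sequence to an output sequence, one step at a time.

    inputs : list of input column vectors, one per time step
    Wxh    : input-to-hidden weights
    Whh    : hidden-to-hidden (recurrent) weights -- the backward/self-loop
             connections that let the network memorise time dependencies
    Why    : hidden-to-output weights
    """
    h = np.zeros((Whh.shape[0], 1))  # hidden state: carries past context
    outputs = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        outputs.append(Why @ h + by)
    return outputs

# Toy usage: a 3-step sequence of 4-dimensional inputs, 5 hidden units, 2 outputs.
rng = np.random.default_rng(0)
Wxh = rng.standard_normal((5, 4)) * 0.1
Whh = rng.standard_normal((5, 5)) * 0.1
Why = rng.standard_normal((2, 5)) * 0.1
bh, by = np.zeros((5, 1)), np.zeros((2, 1))
seq = [rng.standard_normal((4, 1)) for _ in range(3)]
print([y.ravel() for y in rnn_forward(seq, Wxh, Whh, Why, bh, by)])
```

Removing the `Whh @ h` term reduces this to a feedforward network applied independently at each step, which is precisely the internal state that HMMs discretise and that SVMs lack altogether.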
