Evolino: Hybrid Neuroevolution/Optimal Linear Search for Sequence Learning

Current Neural Network learning algorithms are limited in their ability to model non-linear dynamical systems. Most supervised gradient-based recurrent neural networks (RNNs) suffer from a vanishing error signal that prevents learning from inputs far in the past. Those that do not, still have problems when there are numerous local minima. We introduce a general framework for sequence learning, EVOlution of recurrent systems with LINear outputs (Evolino). Evolino uses evolution to discover good RNN hidden node weights, while using methods such as linear regression or quadratic programming to compute optimal linear mappings from hidden state to output. Using the Long Short-Term Memory RNN Architecture, the method is tested in three very different problem domains: 1) context-sensitive languages, 2) multiple superimposed sine waves, and 3) the Mackey-Glass system. Evolino performs exceptionally well across all tasks, where other methods show notable deficiencies in some.

[1]  R. Penrose A Generalized inverse for matrices , 1955 .

[2]  L. Glass,et al.  Oscillation and chaos in physiological control systems. , 1977, Science.

[3]  Ronald J. Williams,et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[4]  Paul J. Werbos,et al.  Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.

[5]  Eduardo Sontag,et al.  Turing computability with neural nets , 1991 .

[6]  Risto Miikkulainen,et al.  Efficient Reinforcement Learning through Symbiotic Evolution , 1996, Machine Learning.

[7]  Didier Guériot,et al.  RBF neural network, basis functions and genetic algorithm , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).

[8]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[9]  Xin Yao,et al.  Evolving artificial neural networks , 1999, Proc. IEEE.

[10]  Risto Miikkulainen,et al.  Solving Non-Markovian Control Tasks with Neuro-Evolution , 1999, IJCAI.

[11]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[12]  John F. Kolen,et al.  Field Guide to Dynamical Recurrent Networks , 2001 .

[13]  Yoshua Bengio,et al.  Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .

[14]  Jürgen Schmidhuber,et al.  LSTM recurrent networks learn simple context-free and context-sensitive languages , 2001, IEEE Trans. Neural Networks.

[15]  Harald Haas,et al.  Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.

[16]  Risto Miikkulainen,et al.  Efficient Reinforcement Learning through Symbiotic Evolution , 2004 .