Meta-learning with backpropagation

This paper introduces gradient descent methods for meta-learning (learning how to learn) in neural networks. Meta-learning has been of interest in the machine learning field for decades because of its appealing applications to intelligent agents, non-stationary time series, autonomous robots, and improved learning algorithms. Many previous neural-network approaches to meta-learning have relied on evolutionary methods. We show how to use gradient descent for meta-learning in recurrent neural networks. Building on previous work on fixed-weight learning neural networks, we hypothesize that any recurrent network topology, together with its corresponding learning algorithm(s), is a potential meta-learning system. We tested several recurrent neural network topologies and their corresponding forms of backpropagation for their ability to meta-learn. One of our systems, based on the long short-term memory (LSTM) network, developed a learning algorithm that could learn any two-dimensional quadratic function (from a set of such functions) after only 30 training examples.
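To make the setup concrete, below is a minimal PyTorch sketch of meta-training an LSTM on a family of two-dimensional quadratics. It assumes the input convention of the companion paper "Learning to Learn Using Gradient Descent" (Hochreiter et al., ICANN 2001): at each step the network receives the current input x_t together with the previous step's target y_{t-1} and must predict y_t. The network size, coefficient range, sequence length handling, and the use of Adam with plain backpropagation through time are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch: meta-learning 2-D quadratics with a fixed-weight LSTM.
# All hyperparameters here are illustrative, not the paper's values.
import torch
import torch.nn as nn

def sample_quadratics(batch):
    # One random quadratic per sequence:
    # f(x1, x2) = a*x1^2 + b*x2^2 + c*x1*x2 + d*x1 + e*x2
    return torch.rand(batch, 1, 5) * 2.0 - 1.0           # (B, 1, 5)

def evaluate(coeffs, x):
    x1, x2 = x[..., 0], x[..., 1]                        # each (B, T)
    a, b, c, d, e = coeffs.unbind(-1)                    # each (B, 1)
    return a * x1**2 + b * x2**2 + c * x1 * x2 + d * x1 + e * x2

class MetaLearner(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # 3 inputs per step: x_t (2 dims) plus the previous target (1 dim).
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x_seq, y_prev_seq):
        h, _ = self.lstm(torch.cat([x_seq, y_prev_seq], dim=-1))
        return self.head(h).squeeze(-1)                  # (B, T)

model = MetaLearner()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
B, T = 32, 30   # 30 examples per function, as in the abstract

for step in range(5000):
    coeffs = sample_quadratics(B)                        # one function per sequence
    x = torch.rand(B, T, 2) * 2.0 - 1.0
    y = evaluate(coeffs, x)                              # (B, T)
    # Shift targets by one step; the first step sees a zero placeholder.
    y_prev = torch.cat([torch.zeros(B, 1), y[:, :-1]], dim=1).unsqueeze(-1)
    loss = ((model(x, y_prev) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design point is that no weight update happens at test time: after meta-training, adaptation to an unseen quadratic is carried out entirely by the recurrent dynamics as the network reads the 30 (x_t, y_{t-1}) pairs, which is what makes a fixed-weight network act as a learning algorithm.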

[1] Ronald J. Williams, et al. "A Learning Algorithm for Continually Running Fully Recurrent Neural Networks," Neural Computation, 1989.

[2] Peter R. Conwell, et al. "Fixed-weight networks can learn," 1990 IJCNN International Joint Conference on Neural Networks, 1990.

[3] David J. Chalmers, et al. "The Evolution of Learning: An Experiment in Genetic Connectionism," 1991.

[4] N. E. Cotter, et al. "Learning algorithms and fixed dynamics," IJCNN-91-Seattle International Joint Conference on Neural Networks, 1991.

[5] J. Schmidhuber, et al. "A neural network that embeds its own meta-levels," IEEE International Conference on Neural Networks, 1993.

[6] Jürgen Schmidhuber, et al. "Long Short-Term Memory," Neural Computation, 1997.

[7] A. Steven Younger, et al. "Fixed-weight on-line learning," IEEE Trans. Neural Networks, 1999.

[8] Magnus Thor Jonsson, et al. "Evolution and design of distributed learning rules," 2000 IEEE Symposium on Combinations of Evolutionary Computation and Neural Networks, 2000.

[9] Sepp Hochreiter, et al. "Learning to Learn Using Gradient Descent," ICANN, 2001.