Knowledge-based recurrent neural networks in Reinforcement Learning

Recurrent Neural Networks (RNNs) have shown a strong ability to solve some hard problems, but learning these problems from scratch typically takes a very long time. For supervised learning, several methods have been proposed to reuse knowledge acquired on previous, similar tasks. However, for settings without direct supervision such as Reinforcement Learning (RL), and especially for Partially Observable Markov Decision Processes (POMDPs), these algorithms are difficult to apply directly. This paper presents several methods with the potential to transfer knowledge in RL using RNNs: Directed Transfer, Cascade-Correlation, Mixture of Experts, and a Two-Level Architecture. Preliminary experiments in the E-maze domain show the potential of these methods: knowledge-based learning on a new problem takes much less time than learning from scratch, even though the new task looks very different from the previous tasks.
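
As a rough illustration of one of these methods, the sketch below shows a Mixture of Experts over recurrent networks: RNN experts whose weights would come from previous tasks are kept fixed, and a gating network weights their outputs on the new task. This is a minimal sketch under stated assumptions, not the paper's implementation; the names (SimpleRNNExpert, MixtureOfExperts), the frozen-expert setup, and the gate-only adaptation are illustrative, and the training loop for the gate is omitted.

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class SimpleRNNExpert:
    # Elman-style RNN; in the transfer setting its weights would be
    # copied from a network trained on a previous task and then frozen.
    def __init__(self, n_in, n_hidden, n_actions, rng):
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_rec = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.h = np.zeros(n_hidden)

    def reset(self):
        self.h = np.zeros_like(self.h)

    def step(self, obs):
        # The hidden state carries memory across observations, which is
        # what lets an RNN policy cope with partial observability.
        self.h = np.tanh(self.W_in @ obs + self.W_rec @ self.h)
        return self.W_out @ self.h  # action preferences

class MixtureOfExperts:
    # A gating network mixes the experts' outputs per observation.
    # Only the gate would be trained on the new task (loop omitted here).
    def __init__(self, experts, n_in, rng):
        self.experts = experts
        self.W_gate = rng.normal(0.0, 0.1, (len(experts), n_in))

    def reset(self):
        for expert in self.experts:
            expert.reset()

    def act(self, obs):
        weights = softmax(self.W_gate @ obs)  # per-expert responsibility
        prefs = sum(w * e.step(obs) for w, e in zip(weights, self.experts))
        return int(np.argmax(prefs)), weights

# Tiny usage example: two "pretrained" experts, one observation step.
rng = np.random.default_rng(0)
experts = [SimpleRNNExpert(n_in=4, n_hidden=8, n_actions=3, rng=rng)
           for _ in range(2)]
policy = MixtureOfExperts(experts, n_in=4, rng=rng)
policy.reset()
action, weights = policy.act(np.array([1.0, 0.0, 0.0, 0.0]))
print("action:", action, "gate weights:", weights)

Freezing the experts and adapting only the gate is one plausible reason a knowledge-based learner can be much faster than learning from scratch, at the cost of being limited to recombinations of previously learned behaviors.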
