Evolving Recurrent Neural Network Controllers by Incremental Fitness Shaping

Time-varying artificial neural networks are commonly used for dynamic problems such as game controllers and robotics because they give the controller a memory of previous states. This memory matters because actions taken in earlier states can influence the final success of the agent. Because of this temporal dependence, methods such as back-propagation can be difficult to use for optimising network parameters, so genetic algorithms (GAs) are often used instead. While recurrent neural networks (RNNs) are commonly used with GAs, long short-term memory (LSTM) networks have received less attention. Since LSTM networks have a wide range of temporal dynamics, in this paper we evolve an LSTM network as a controller for a lunar lander task with two evolutionary algorithms: a steady-state GA (SSGA) and an evolution strategy (ES). Due to the presence of a large local optimum in the fitness space, we implemented an incremental fitness scheme in both evolutionary algorithms. We also compare the behaviour and evolutionary progress of the LSTM with those of an RNN evolved via NEAT and ES with the same fitness function. LSTMs proved to be evolvable on such tasks, though the SSGA solution was outperformed by the RNN. However, despite using an incremental scheme, the ES developed solutions far better than both, showing that ES can be used both for incremental fitness and for LSTMs and RNNs on dynamic tasks.
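As a rough illustration of how an incremental fitness scheme can be combined with an ES, the sketch below runs a simple (mu, lambda)-style ES with a fixed mutation step on a toy objective. It is not the method used in the paper: the objective `staged_fitness`, the stage-switch rule, and all names and parameter values (`evolve`, `stage_threshold`, `sigma`, and so on) are assumptions made up for this example, and a flat parameter vector stands in for the LSTM weights and the lunar lander episode return.

```python
import random


def staged_fitness(params, stage):
    """Toy stand-in for an episode return (hypothetical, not the paper's fitness).

    Stage 0 scores only the first half of the parameters (an easier sub-goal);
    stage 1 scores all of them (the full task). Higher is better, 0 is optimal.
    """
    target = 1.0
    n = len(params) if stage >= 1 else len(params) // 2
    return -sum((p - target) ** 2 for p in params[:n])


def evolve(dim=6, pop_size=20, n_parents=5, sigma=0.1, generations=200,
           stage_threshold=-0.05, seed=0):
    """Simple (mu, lambda)-style ES with a two-stage incremental fitness."""
    rng = random.Random(seed)
    mean = [0.0] * dim  # current search point (would be the network weights)
    stage = 0
    for _ in range(generations):
        # Sample offspring by adding Gaussian noise to the current mean.
        offspring = [[m + rng.gauss(0.0, sigma) for m in mean]
                     for _ in range(pop_size)]
        # Rank offspring by the fitness of the *current* stage, best first.
        offspring.sort(key=lambda ind: staged_fitness(ind, stage), reverse=True)
        parents = offspring[:n_parents]
        # Recombine: the new mean is the average of the selected parents.
        mean = [sum(p[i] for p in parents) / n_parents for i in range(dim)]
        # Move on to the full task once the easier sub-goal is nearly solved.
        if stage == 0 and staged_fitness(mean, 0) > stage_threshold:
            stage = 1
    return mean, stage


if __name__ == "__main__":
    best, final_stage = evolve()
    print("final stage:", final_stage)
    print("full-task fitness:", staged_fitness(best, 1))
```

The point of the staging is that selection pressure is first applied only to the easier sub-goal, which is one way a search might be steered past a large local optimum before the full objective is switched on.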
