Neuroevolution results in emergence of short-term memory in multi-goal environment

Animals behave adaptively in environments with multiple competing goals. Understanding the mechanisms underlying such goal-directed behavior remains a challenge for neuroscience as well as for adaptive systems and machine learning research. To address this problem, we developed an evolutionary model of adaptive behavior in a multi-goal stochastic environment. The proposed neuroevolutionary algorithm uses neuron duplication as the basic mechanism for developing the agent's recurrent neural network. Simulation results demonstrate that, in the course of evolution, agents acquire the ability to store short-term memory and use it in behavior with alternative actions. We found that evolution discovered two mechanisms for short-term memory. The first is the integration of sensory signals with ongoing internal neural activity, resulting in the emergence of cell groups specialized for alternative actions. The second is a slow neurodynamical process that makes it possible to encode the previous behavioral choice.
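
To make the duplication operator concrete, the sketch below shows one minimal way a hidden neuron of a recurrent network genome could be duplicated and slightly perturbed during evolution. This is an illustrative sketch only, not the paper's implementation: the Genome class, the duplicate_neuron method, and the perturbation scale noise are assumptions introduced for this example.

import numpy as np

class Genome:
    """Recurrent network genome: input, recurrent, and output weight matrices.
    Hypothetical illustration of duplication-based growth; not taken from the paper."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.rng = rng
        self.W_in = rng.normal(0.0, 0.5, (n_hidden, n_in))       # input -> hidden
        self.W_rec = rng.normal(0.0, 0.5, (n_hidden, n_hidden))  # hidden -> hidden (recurrent)
        self.W_out = rng.normal(0.0, 0.5, (n_out, n_hidden))     # hidden -> output

    def duplicate_neuron(self, i=None, noise=0.05):
        """Copy hidden neuron i (all incoming and outgoing weights) and perturb the copy."""
        n_hidden = self.W_rec.shape[0]
        if i is None:
            i = int(self.rng.integers(n_hidden))
        # Incoming connections of the copy: duplicate row i of the input and recurrent matrices.
        in_row = self.W_in[i] + self.rng.normal(0.0, noise, self.W_in.shape[1])
        rec_row = self.W_rec[i] + self.rng.normal(0.0, noise, n_hidden)
        self_conn = self.W_rec[i, i]
        # Outgoing connections of the copy: duplicate column i of the recurrent and output matrices.
        rec_col = self.W_rec[:, i] + self.rng.normal(0.0, noise, n_hidden)
        out_col = self.W_out[:, i] + self.rng.normal(0.0, noise, self.W_out.shape[0])
        # Grow the matrices by one hidden unit.
        self.W_in = np.vstack([self.W_in, in_row])
        self.W_rec = np.hstack([self.W_rec, rec_col[:, None]])
        self.W_rec = np.vstack([self.W_rec, np.append(rec_row, self_conn)])
        self.W_out = np.hstack([self.W_out, out_col[:, None]])

    def step(self, x, h):
        """One recurrent update with tanh units; returns (output, new hidden state)."""
        h_new = np.tanh(self.W_in @ x + self.W_rec @ h)
        return self.W_out @ h_new, h_new

# Usage: duplicate one neuron, then run a step with a hidden state of the new size.
rng = np.random.default_rng(0)
g = Genome(n_in=4, n_hidden=3, n_out=2, rng=rng)
g.duplicate_neuron()
output, h = g.step(np.zeros(4), np.zeros(4))  # hidden layer now has 4 neurons

The design choice mirrors gene duplication and divergence: the copy starts almost identical to its parent neuron, so the offspring network's behavior changes little, and subsequent mutations let the copy specialize, which is one common motivation for duplication-based network growth.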
