Evolution of Neural Architecture Fitting Environmental Dynamics

Temporal and sequential information is essential to any agent continually interacting with its environment. In this paper, we test whether it is possible to evolve a recurrent neural network controller to match the dynamic requirements of the task. As a benchmark, we consider a sequential navigation task in which the agent must alternately visit two rewarding sites to obtain food and water after first visiting the nest. To achieve a better fitness, the agent must select relevant sensory inputs and update its working memory to realize non-Markovian sequential behavior, in which the preceding state alone does not determine the next action. We compare the performance of feed-forward and recurrent neural control architectures in different environment settings and analyze the neural mechanisms and environment features exploited by the agents to achieve their goal. Simulation and experimental results using the Cyber Rodent robot show that a modular architecture with a locally excitatory recurrent layer outperformed the general recurrent controller.
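To make the setup concrete, the sketch below shows a minimal discrete-time recurrent controller of the kind such an evolutionary approach could optimize. It is an illustrative assumption, not the authors' implementation: the flat `weights` genome, the layer sizes, and the tanh units are all hypothetical choices. The point is that the hidden state acts as working memory, so the same sensory input can produce different motor commands depending on task phase (a non-Markovian policy).

```python
# Minimal sketch (assumed, not the paper's architecture): a discrete-time
# recurrent controller whose flat weight vector could serve as a genetic
# algorithm individual. The hidden state carries working memory across steps.
import numpy as np


class RecurrentController:
    def __init__(self, weights, n_in, n_hid, n_out):
        # Unpack a flat genome into input, recurrent, and output weight matrices.
        w = np.asarray(weights, dtype=float)
        i = 0
        self.W_in = w[i:i + n_hid * n_in].reshape(n_hid, n_in)
        i += n_hid * n_in
        self.W_rec = w[i:i + n_hid * n_hid].reshape(n_hid, n_hid)
        i += n_hid * n_hid
        self.W_out = w[i:i + n_out * n_hid].reshape(n_out, n_hid)
        self.h = np.zeros(n_hid)  # internal state = working memory

    def step(self, sensors):
        # One control step: the action depends on the sensors AND the hidden
        # state, so the preceding sensory state alone does not determine it.
        self.h = np.tanh(self.W_in @ np.asarray(sensors, dtype=float)
                         + self.W_rec @ self.h)
        return np.tanh(self.W_out @ self.h)  # e.g. left/right wheel commands


# Example: a random genome driving a 4-sensor, 6-hidden-unit, 2-motor controller.
rng = np.random.default_rng(0)
genome_size = 6 * 4 + 6 * 6 + 2 * 6
ctrl = RecurrentController(rng.normal(size=genome_size), n_in=4, n_hid=6, n_out=2)
motors = ctrl.step([0.1, 0.0, 0.8, 0.2])
```

In an evolutionary run, each genome would be evaluated by the fitness obtained on the navigation task (nest, then alternating food and water sites), and the comparison in the paper amounts to constraining or modularizing the recurrent connectivity and measuring how that changes performance.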
