A neural-network reinforcement-learning model of domestic chicks that learn to localize the centre of closed arenas

Previous experiments have shown that when domestic chicks (Gallus gallus) are first trained to locate food hidden at the centre of a closed square arena and are then tested in a square arena of double the size, they search for food both at the centre of the larger arena and at a distance from the walls similar to the distance between centre and walls experienced during training. This paper presents a computational model that reproduces these behaviours. The model is based on a neural-network implementation of the reinforcement-learning actor–critic architecture: the ‘critic’ learns to evaluate perceived states in terms of predicted future rewards, while the ‘actor’ learns to increase the probability of selecting the actions that lead to higher evaluations. The analysis of the model suggests which types of information and which cognitive mechanisms might underlie the chicks' behaviours: (i) the tendency to explore the area at a specific distance from the walls might be based on the processing of the height of the walls' horizontal edges; (ii) the capacity to generalize the search at the centre of square arenas independently of their size might be based on the processing of the relative position of the walls' vertical edges on the horizontal plane (equalization of the walls' widths); and (iii) the whole behaviour exhibited in the large square arena can be reproduced by assuming an attention process that, at any given moment, focuses the chicks' internal processing on one of these two information sources. The model also produces testable predictions about the generalization capabilities that real chicks should exhibit if trained in circular arenas of varying size. Finally, the paper highlights the potential of the model to address other experiments on animal navigation and analyses its strengths and weaknesses in comparison with other models.
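The actor–critic scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: it is a tabular toy rather than a neural network with visual input, and the one-dimensional corridor, the reward of 1 at the middle cell, the function names (`softmax`, `train_actor_critic`) and all parameter values are assumptions made purely for illustration. It shows only the division of labour: the critic updates a state value from the temporal-difference error, and the actor uses the same error to adjust its action preferences.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(n_states=5, episodes=500, alpha_critic=0.1,
                       alpha_actor=0.1, gamma=0.9, seed=0):
    """Tabular actor-critic on a hypothetical 1-D corridor.

    The agent is rewarded (reward 1) when it reaches the middle cell.
    All names and parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    goal = n_states // 2
    values = [0.0] * n_states                      # critic: state evaluations
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actor: 0 = left, 1 = right
    starts = [s for s in range(n_states) if s != goal]

    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(50):  # cap on episode length
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            step = 1 if action == 1 else -1
            next_state = min(max(state + step, 0), n_states - 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Critic: temporal-difference error between the bootstrapped
            # target and the current evaluation of the state.
            target = reward + (0.0 if done else gamma * values[next_state])
            td_error = target - values[state]
            values[state] += alpha_critic * td_error

            # Actor: raise the preference of the chosen action when the
            # outcome was better than expected, lower it otherwise.
            for a in range(2):
                grad = (1.0 if a == action else 0.0) - probs[a]
                prefs[state][a] += alpha_actor * td_error * grad

            if done:
                break
            state = next_state
    return values, prefs
```

Calling `values, prefs = train_actor_critic()` and then inspecting `softmax(prefs[1])` and `softmax(prefs[3])` shows the policy in the cells next to the goal shifting towards the action that leads to the rewarded middle cell.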
