How can bottom-up information shape learning of top-down attention-control skills?

How does bottom-up information affect the development of top-down attentional control skills during the learning of visuomotor tasks? Why is the fovea of the eye so small? Strong evidence supports the idea that human foveation is mainly guided by task-specific skills, but how these skills are learned remains an important open problem. We designed and implemented a simulated neural eye-arm coordination model to study the development of attention control in a search-and-reach task involving simple coloured stimuli. The model is endowed with a hard-wired bottom-up attentional saliency map and a top-down attention component that acquires task-specific knowledge about potential gaze targets and their spatial relations. This architecture reaches high performance very quickly. To explain this result, we argue that: (a) the interaction between bottom-up and top-down mechanisms supports the development of task-specific attention-control skills by allowing an efficient exploration of potentially useful gaze targets; (b) bottom-up mechanisms boost the exploitation of the initially limited task-specific knowledge by actively selecting areas where it can be suitably applied; (c) bottom-up processes shape the representation of objects, their value, and their roles (these can change during learning, e.g. distractors can become useful attentional cues); (d) increasing the size of the fovea alleviates perceptual aliasing, but at the same time increases input-processing costs and the number of trials required to learn. Overall, the results indicate that bottom-up attention mechanisms can play a relevant role in attention control, especially during the acquisition of new task-specific skills, but also during task performance.
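The architecture is only described at a high level here, but the core interaction it relies on can be illustrated with a minimal sketch: a hard-wired bottom-up saliency map and a reward-trained top-down map are combined to select gaze targets, with inhibition of return discouraging immediate re-fixation. This is not the authors' implementation; the grid size, learning rate, additive combination, inhibition-of-return scheme, and the names select_gaze and update are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch (assumptions, not the paper's model): gaze selection from the
# sum of a bottom-up saliency map and a learned top-down value map.

GRID = 20          # retinotopic map is GRID x GRID cells (assumed size)
ALPHA = 0.1        # learning rate for the top-down component (assumed)
IOR_DECAY = 0.9    # inhibition-of-return decay per fixation (assumed)

rng = np.random.default_rng(0)
top_down = np.zeros((GRID, GRID))       # learned task-specific bias
inhibition = np.zeros((GRID, GRID))     # transient inhibition of return

def select_gaze(bottom_up):
    """Pick the most active cell of the combined attention map."""
    combined = bottom_up + top_down - inhibition
    return np.unravel_index(np.argmax(combined), combined.shape)

def update(gaze, reward):
    """Reward-modulated update of the top-down map at the foveated cell."""
    global inhibition
    top_down[gaze] += ALPHA * (reward - top_down[gaze])
    inhibition *= IOR_DECAY
    inhibition[gaze] += 1.0             # discourage immediate re-fixation

# Toy episode: a salient coloured stimulus yields reward when foveated.
target_cell = (5, 12)
for step in range(200):
    saliency = rng.random((GRID, GRID)) * 0.3   # noisy bottom-up input
    saliency[target_cell] += 0.5                # the stimulus pops out
    gaze = select_gaze(saliency)
    update(gaze, 1.0 if gaze == target_cell else 0.0)
```

In this toy setting the bottom-up map makes the rewarded location likely to be fixated early, so the top-down map receives informative training signals from the first trials, which is one way to read the claim that bottom-up mechanisms speed up the acquisition of task-specific attention skills.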
