Attentional Selection for Action in Mobile Robots

During the last few years, attention has become an important issue in machine vision. Studies of attentional mechanisms in biological vision have inspired many computational models (Tsotsos et al., 1995; Itti & Koch, 2000; Frintrop et al., 2005; Torralba et al., 2006; Navalpakkam & Itti, 2006). Most of them follow the limited-capacity assumption associated with the role of attention in psychological proposals (Broadbent, 1958; LaBerge, 1995). These theories hypothesize that the visual system has limited processing capacity and that attention acts as a filter, selecting the information that should be processed. This assumption has been criticized by many authors, who argue that the processing capacity of the human perceptual system is enormous (Neumann et al., 1986; Allport, 1987). From this point of view, a stage that selects the information to be processed is not needed; instead, these authors explain the role of attention from the perspective of selection for action (Allport, 1987). According to this conception, the function of attention is to avoid behavioural disorganization by selecting the appropriate information to drive task execution.

Such a notion of attention is very interesting in robotics, where the aim is to build autonomous robots that interact with complex environments while maintaining multiple behavioural objectives. Attentional selection for action can guide robot behaviours by focusing on relevant visual targets while avoiding distracting elements. Moreover, it can be conceived as a coordination mechanism, since stimulus selection serializes the actions of potentially multiple active behaviours. To exploit these ideas, a visual attention system based on the selection-for-action theory has been developed. The system is a central component of a control architecture from which complex behaviours emerge according to different attention-action links. It has been designed and tested on a mobile robot equipped with a stereo vision head.
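The coordination idea above can be illustrated with a minimal sketch. All names and the scoring rule here are illustrative assumptions, not the paper's model: each active behaviour rates stimuli by task relevance, a single stimulus wins the competition, and only the behaviour that requested it acts on the current control cycle, serializing otherwise-concurrent behaviours.

```python
# Hypothetical sketch of attention as "selection for action".
from dataclasses import dataclass


@dataclass
class Stimulus:
    label: str
    salience: float  # bottom-up conspicuity in [0, 1]


def select_for_action(stimuli, behaviours):
    """Return the (behaviour, stimulus) pair with the highest
    top-down relevance x bottom-up salience score."""
    best_score, winner, target = float("-inf"), None, None
    for name, relevance in behaviours.items():
        for s in stimuli:
            score = relevance(s) * s.salience
            if score > best_score:
                best_score, winner, target = score, name, s
    return winner, target


# Two concurrent behaviours compete for the visual focus.
stimuli = [Stimulus("doorway", 0.6), Stimulus("red ball", 0.9)]
behaviours = {
    "navigate": lambda s: 1.0 if s.label == "doorway" else 0.1,
    "track": lambda s: 1.0 if s.label == "red ball" else 0.0,
}
winner, target = select_for_action(stimuli, behaviours)
# "track" wins (1.0 * 0.9 beats navigate's 1.0 * 0.6), so only the
# tracking behaviour drives the motors on this cycle.
```

Because exactly one stimulus is attended per cycle, the behaviours never issue conflicting motor commands simultaneously, which is the sense in which stimulus selection serializes action.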
Figure 1 shows the proposed control model. The sensory-motor abilities of the robot are divided into two groups, leading to two subsystems: the visual attention system, which includes the mechanisms that give rise to the selection of visual information, and the set of high-level behaviours that use visual information to accomplish their goals. Both subsystems are connected to the motor control system, which is in charge of actually executing the motor responses generated by the other two subsystems. Each high-level behaviour modulates the visual attention system in a specific way in order to obtain the visual information it needs. The incoming flow of information affects high-level
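The two-subsystem loop just described can be sketched in code. The class and method names below are assumptions for illustration, not the paper's implementation: high-level behaviours modulate the attention system top-down, the attended percept feeds back into the behaviour, and both subsystems route their motor responses through a single motor control system.

```python
# Illustrative sketch of the Figure 1 control loop (names are assumed).
class MotorControl:
    """Single point that actually executes motor responses."""

    def __init__(self):
        self.log = []

    def execute(self, command):
        self.log.append(command)


class AttentionSystem:
    """Selects one percept per cycle, biased by top-down modulation."""

    def __init__(self, motor):
        self.motor = motor
        self.modulation = {}  # feature -> top-down weight

    def modulate(self, weights):
        self.modulation.update(weights)

    def attend(self, percepts):
        # percepts: feature -> bottom-up strength
        target = max(percepts,
                     key=lambda f: percepts[f] * self.modulation.get(f, 0.0))
        self.motor.execute(f"gaze:{target}")  # overt shift, e.g. a saccade
        return target


class HighLevelBehaviour:
    """Uses the attended visual information to pursue its goal."""

    def __init__(self, name, weights, motor):
        self.name = name
        self.weights = weights  # features this task cares about
        self.motor = motor

    def step(self, attention, percepts):
        attention.modulate(self.weights)     # top-down modulation
        target = attention.attend(percepts)  # selected visual information
        if target in self.weights:           # relevant to this task?
            self.motor.execute(f"{self.name}:approach:{target}")
        return target


motor = MotorControl()
attention = AttentionSystem(motor)
tracker = HighLevelBehaviour("track", {"red": 1.0}, motor)
target = tracker.step(attention, {"red": 0.8, "green": 0.9})
# The tracker's modulation makes "red" win despite the stronger
# bottom-up "green" signal; both subsystems acted through MotorControl.
```

Note that neither subsystem moves the robot directly: the attention system and the behaviour both emit commands through the shared motor control system, matching the connectivity described for Figure 1.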

References

[1] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004.

[2] V. Navalpakkam and L. Itti. An integrated model of top-down and bottom-up attention for optimizing detection speed. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.

[3] A. H. C. van der Heijden et al. Visual selective attention: introductory remarks. Psychological Research, 1986.

[4] J. K. Tsotsos et al. Modeling visual attention via selective tuning. Artificial Intelligence, 1995.

[5] C. A. Szyperski. Component Software: Beyond Object-Oriented Programming. 2002.

[6] A. D. Milner and M. A. Goodale. The Visual Brain in Action. 1995.

[7] X. Li et al. Component-based software engineering: the need to link methods and their theories. 2005.

[8] J. T. Enright. Monocularly programmed human saccades during vergence changes? The Journal of Physiology, 1998.

[9] A. Allport. Selection for action: some behavioral and neurophysiological considerations of attention and action. 1987.

[10] D. E. Broadbent. Perception and Communication. 1958.

[11] L. Itti and C. Koch. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 2000.

[12] C. Bandera et al. Foveal machine vision systems. IEEE International Conference on Systems, Man and Cybernetics, 1989.

[13] S. Frintrop et al. Goal-directed search with a top-down modulated computational attention system. DAGM-Symposium, 2005.

[14] T. Lindeberg. Feature detection with automatic scale selection. International Journal of Computer Vision, 1998.

[15] L. A. Zadeh. Fuzzy sets. Information and Control, 1965.

[16] A. Torralba et al. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychological Review, 2006.

[17] K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.

[18] S. Lazebnik, C. Schmid, and J. Ponce. A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.

[19] K. Mikolajczyk and C. Schmid. Indexing based on scale invariant interest points. IEEE International Conference on Computer Vision (ICCV), 2001.

[20] D. G. Lowe. Object recognition from local scale-invariant features. IEEE International Conference on Computer Vision (ICCV), 1999.

[21] T. Takagi and M. Sugeno. Fuzzy identification of systems and its applications to modeling and control. IEEE Transactions on Systems, Man, and Cybernetics, 1985.