Directed attention - a cognitive vision system for a mobile robot

In this paper we introduce a method that combines bottom-up saliency, texture descriptors, and top-down attention to direct the focus on arbitrary objects in a human-robot interaction scenario. In a typical bottom-up attention process, salient features are extracted from the image in order to focus on conspicuous regions. We extend this approach with a top-down process that determines, from a single image, the most discriminative features, i.e. those that best separate foreground from background in the observed context. To enable the robot to also focus on complex objects such as humans, we propose a simple model consisting of multiple regions, so that the position of an object can be estimated from a probability distribution generated by the attention process. The system is unrestricted in terms of scene constraints or object appearance and meets the requirements of a mobile robot scenario. The attention system has been evaluated both in a human-robot interaction scenario, directing attention towards interaction partners, and in an object learning scenario, focusing on objects. Our results show that the system performs robustly across these applications without any changes to the algorithm.
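The interplay of bottom-up and top-down attention described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the map representation, and the mean-difference weighting rule are assumptions. It shows the general idea of fusing normalized feature maps into a saliency map, where the top-down step reweights each feature by how well it separates a labelled foreground region from the background in the same image.

```python
import numpy as np

def saliency(feature_maps, weights=None):
    """Fuse per-feature conspicuity maps into one saliency map.

    feature_maps: dict mapping feature name -> 2-D array in [0, 1].
    weights: optional top-down weights per feature; if None, all
    features contribute equally (pure bottom-up fusion).
    """
    names = sorted(feature_maps)
    if weights is None:
        weights = {n: 1.0 for n in names}
    total = sum(weights[n] for n in names)
    return sum(weights[n] * feature_maps[n] for n in names) / total

def discriminative_weights(feature_maps, fg_mask):
    """Top-down step (illustrative): weight each feature by how much
    stronger its mean response is inside the foreground mask than in
    the background of the same image."""
    w = {}
    for name, fmap in feature_maps.items():
        fg = fmap[fg_mask].mean()
        bg = fmap[~fg_mask].mean()
        w[name] = max(fg - bg, 1e-6)  # keep weights strictly positive
    return w
```

Under this scheme a feature that is uniform across the image (e.g. constant intensity) receives a near-zero weight, while a feature that responds mainly on the object dominates the fused map, so the saliency maximum falls inside the object region.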
