Development of an Autonomous Visual Perception System for Robots Using Object-Based Visual Attention

In traditional robotic systems, perceptual behaviors are manually designed by programmers for a given task and environment. Autonomous perception of the world, by contrast, remains one of the challenging issues in cognitive robotics. It is known that the selective attention mechanism serves to link the processes of perception, action and learning (Grossberg, 2007; Tipper et al., 1998). It endows humans with the cognitive capability that allows them to learn and think about how to perceive the environment autonomously. This visual-attention-based autonomous perception mechanism involves two aspects: a conscious aspect that directs perception based on the current task and learned knowledge, and an unconscious aspect that directs perception when an unexpected or unusual situation is encountered. The top-down attention mechanism (Wolfe, 1994) is responsible for the conscious aspect, whereas the bottom-up attention mechanism (Treisman & Gelade, 1980) corresponds to the unconscious aspect. This paper therefore discusses how to build an artificial system of autonomous visual perception. Three fundamental problems are addressed in this paper. The first problem is pre-attentive segmentation for object-based attention. It is known that attentional selection is either space-based or object-based (Scholl, 2001). The space-based theory holds that attention is allocated to a spatial location (Posner et al., 1980). The object-based theory, however, posits that some pre-attentive processes serve to segment the visual field into discrete objects, after which attention deals with one object at a time (Duncan, 1984).
This paper proposes that object-based attention has the following three computational advantages: 1) object-based attention is more robust than space-based attention, since the attentional activation at the object level is estimated by accumulating the contributions of all components within that object; 2) attending to an exact object can provide more useful information (e.g., shape and size) for producing appropriate actions than attending to a spatial location; and 3) the discrete objects obtained by pre-attentive segmentation are required when a global feature (e.g., shape) is selected to guide top-down attention. Thus this paper adopts the object-based visual attention theory (Duncan, 1984; Scholl, 2001). Although a few object-based visual attention models have been proposed (e.g., Sun, 2008; Sun & Fisher, 2003), developing a pre-attentive segmentation algorithm is still a challenging issue because it is an unsupervised process. This issue includes three types of challenges: 1) The
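
The first advantage listed above (robustness from accumulating contributions within an object) can be illustrated with a minimal sketch. It is not the paper's model: the function names, the toy scene, and the choice of the mean as the accumulation rule are all assumptions made for illustration. The sketch takes a pixel-wise saliency map and a label map standing in for the output of pre-attentive segmentation, and compares space-based selection (the single most salient pixel) with object-based selection (the object with the highest accumulated activation).

```python
import numpy as np


def object_level_activation(saliency, labels):
    """Accumulate pixel saliency per object (assumed rule: mean over the object).

    Label 0 is treated as background and skipped (an assumption of this sketch).
    """
    activations = {}
    for obj_id in np.unique(labels):
        if obj_id == 0:
            continue
        activations[int(obj_id)] = float(saliency[labels == obj_id].mean())
    return activations


def attend_object(saliency, labels):
    """Object-based selection: pick the object with the highest activation."""
    activations = object_level_activation(saliency, labels)
    return max(activations, key=activations.get)


def attend_location(saliency):
    """Space-based selection: pick the single most salient pixel (row, col)."""
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    return int(row), int(col)


def demo_scene():
    """Hypothetical 6x6 scene with two objects and one noisy pixel."""
    saliency = np.zeros((6, 6))
    labels = np.zeros((6, 6), dtype=int)
    # Object 1: a uniformly salient 3x3 region.
    labels[0:3, 0:3] = 1
    saliency[0:3, 0:3] = 0.8
    # Object 2: a weakly salient 2x2 region ...
    labels[4:6, 4:6] = 2
    saliency[4:6, 4:6] = 0.2
    # ... containing a single noise spike.
    saliency[4, 4] = 1.0
    return saliency, labels


if __name__ == "__main__":
    saliency, labels = demo_scene()
    print("space-based winner (pixel):", attend_location(saliency))
    print("object-based winner (label):", attend_object(saliency, labels))
    print("object activations:", object_level_activation(saliency, labels))
```

In this toy scene the noise spike captures space-based selection, whereas averaging over each object's pixels dilutes the spike, so object-based selection still picks the uniformly salient object. Other accumulation rules (e.g., a sum or a weighted mean) would behave differently, and the mean is used here only to keep the comparison independent of object size.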

[1]  Robert B. Fisher,et al.  Object-based visual attention for computer vision , 2003, Artif. Intell..

[2]  B. Scholl Objects and attention: the state of the art , 2001, Cognition.

[3]  Alberto Finzi,et al.  Model-based control architecture for attentive robots in rescue scenarios , 2008, Auton. Robots.

[4]  Michael Isard,et al.  Active Contours: The Application of Techniques from Graphics, Vision, Control Theory and Statistics to Visual Tracking of Shapes in Motion , 2000 .

[5]  Ronen Basri,et al.  Fast multiscale image segmentation , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[6]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[7]  Robert B. Fisher,et al.  A computer vision model for visual-object-based attention and eye movements , 2008, Comput. Vis. Image Underst..

[8]  Simone Frintrop,et al.  Attentional Landmarks and Active Gaze Control for Visual SLAM , 2008, IEEE Transactions on Robotics.

[9]  John K. Tsotsos,et al.  Modeling Visual Attention via Selective Tuning , 1995, Artif. Intell..

[10]  Rajesh P. N. Rao,et al.  Object indexing using an iconic sparse distributed memory , 1995, Proceedings of IEEE International Conference on Computer Vision.

[11]  Azriel Rosenfeld,et al.  Hierarchical Image Analysis Using Irregular Tessellations , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Shumeet Baluja,et al.  Dynamic Relevance: Vision-Based Focus of Attention Using Artificial Neural Networks. (Technical Note) , 1997, Artif. Intell..

[13]  Simone Frintrop,et al.  Most salient region tracking , 2009, 2009 IEEE International Conference on Robotics and Automation.

[14]  Ronen Basri,et al.  Hierarchy and adaptivity in segmenting visual scenes , 2006, Nature.

[15]  James L. McClelland,et al.  Autonomous Mental Development by Robots and Animals , 2001, Science.

[16]  T. S. Lee,et al.  Dynamics of subjective contour formation in the early visual cortex. , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[17]  Bärbel Mertsching,et al.  Data- and Model-Driven Gaze Control for an Active-Vision System , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[18]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[19]  Jean-Michel Jolion Stochastic pyramid revisited , 2003, Pattern Recognit. Lett..

[20]  Peter Meer,et al.  Stochastic image pyramids , 1989, Comput. Vis. Graph. Image Process..

[21]  Rajesh P. N. Rao,et al.  An Active Vision Architecture Based on Iconic Representations , 1995, Artif. Intell..

[22]  R. Desimone,et al.  Neural mechanisms of selective visual attention. , 1995, Annual review of neuroscience.

[23]  S. Grossberg,et al.  Linking attention to learning, expectation, competition and consciousness , 2005 .

[24]  T. Poggio,et al.  What and where: A Bayesian inference theory of attention , 2010, Vision Research.

[25]  Fiora Pirri,et al.  Robot task-driven attention , 2006, PCAR '06.

[26]  M. Posner,et al.  Attention and the detection of signals. , 1980, Journal of experimental psychology.

[27]  Giulio Sandini,et al.  Object-based Visual Attention: a Model for a Behaving Robot , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[28]  Eckehard G. Steinbach,et al.  A Probabilistic Appearance Representation and Its Application to Surprise Detection in Cognitive Robots , 2010, IEEE Transactions on Autonomous Mental Development.

[29]  J. Duncan Converging levels of analysis in the cognitive neuroscience of visual attention. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[30]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[31]  Pierre Baldi,et al.  Bayesian surprise attracts human attention , 2005, Vision Research.

[32]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[33]  T. Hoya,et al.  Notions of intuition and attention modeled by a hierarchically arranged generalized regression neural network , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[34]  S P Tipper,et al.  Action-based mechanisms of attention. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[35]  Pietro Perona,et al.  On the usefulness of attention for object recognition , 2004 .

[36]  Donald F. Specht,et al.  Probabilistic neural networks , 1990, Neural Networks.

[37]  Stephen Grossberg,et al.  Consciousness CLEARS the mind , 2007, Neural Networks.

[38]  Rajesh P. N. Rao,et al.  Eye movements in iconic visual search , 2002, Vision Research.

[39]  Stephen Grossberg,et al.  Adaptive resonance theory , 1997, Scholarpedia.

[40]  L. Itti,et al.  Modeling the influence of task on attention , 2005, Vision Research.

[41]  Simone Frintrop,et al.  VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search , 2006, Lecture Notes in Computer Science.

[42]  J. Wolfe,et al.  Guided Search 2.0 A revised model of visual search , 1994, Psychonomic bulletin & review.

[43]  Bärbel Mertsching,et al.  Evaluation of Visual Attention Models for Robots , 2006, Fourth IEEE International Conference on Computer Vision Systems (ICVS'06).

[44]  Stephen Grossberg,et al.  Adaptive Resonance Theory , 2010, Encyclopedia of Machine Learning.

[45]  George K. I. Mann,et al.  An Object-Based Visual Attention Model for Robotic Applications , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[46]  J. Duncan Selective attention and the organization of visual information. , 1984, Journal of experimental psychology. General.

[47]  John MacCormick Stochastic algorithms for visual tracking: probabilistic modelling and stochastic algorithms for visual localisation and tracking , 2000 .

[48]  Brian Scassellati,et al.  Active vision for sociable robots , 2001, IEEE Trans. Syst. Man Cybern. Part A.

[49]  Fiora Pirri,et al.  A biologically plausible robot attention model, based on space and time , 2006, Cognitive Processing.

[50]  Pietro Perona,et al.  Overcomplete steerable pyramid filters and rotation invariance , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[51]  J. Duncan,et al.  Competitive brain activity in visual attention , 1997, Current Opinion in Neurobiology.