From Saliency to Eye Gaze: Embodied Visual Selection for a Pan-Tilt-Based Robotic Head

This paper introduces a model of gaze behavior suitable for robotic active vision. Built upon a saliency map that incorporates motion saliency, the model estimates the dynamics of different eye movements, allowing the system to switch among fixational movements, saccades, and smooth pursuit. We investigate the effect of embodying attentive visual selection in a pan-tilt camera system. The constrained physical system cannot follow the rapid fluctuations that characterize the maxima of a saliency map, so a strategy is required to dynamically select what is worth attending to and which behavior, fixation or target pursuit, to adopt. The main contributions of this work are a novel approach to real-time, motion-based saliency computation in video sequences, a dynamic model for gaze prediction from the saliency map, and the embodiment of the modeled dynamics to control active visual sensing.
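To make the pipeline concrete, the following is a minimal sketch, not the authors' implementation, of the two ingredients the abstract names: a motion-saliency map derived from dense optical-flow magnitude, and a simple rule for switching between fixation, smooth pursuit, and saccades based on the dynamics of the saliency peak. The function names, thresholds, and the use of OpenCV's Farnebäck optical flow are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): motion saliency from
# dense optical flow, plus a simple rule for choosing among fixation,
# smooth pursuit, and saccade. Thresholds and names are assumptions.
import cv2
import numpy as np

def motion_saliency(prev_gray, gray):
    """Saliency proportional to optical-flow magnitude, normalized to [0, 1]."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    mag = cv2.GaussianBlur(mag, (15, 15), 0)       # suppress flow noise
    return cv2.normalize(mag, None, 0.0, 1.0, cv2.NORM_MINMAX)

def select_behavior(gaze, peak, prev_peak, fix_radius=10.0, pursuit_speed=2.0):
    """Pick an eye-movement mode from the saliency-peak dynamics (pixels/frame)."""
    peak_speed = np.linalg.norm(np.subtract(peak, prev_peak))
    gaze_error = np.linalg.norm(np.subtract(peak, gaze))
    if gaze_error < fix_radius:
        return "fixation"          # target already foveated: hold gaze
    if peak_speed > pursuit_speed and gaze_error < 5 * fix_radius:
        return "smooth_pursuit"    # nearby target moving smoothly: follow it
    return "saccade"               # distant or newly appeared maximum: jump

# Usage idea: feed consecutive grayscale frames, track the argmax of the
# saliency map with np.unravel_index(np.argmax(sal), sal.shape), and map the
# selected mode to pan-tilt commands (small velocity steps for pursuit, a
# single large step for a saccade, no motion during fixation).
```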
