Resolution of focus of attention using gaze direction estimation and saliency computation

Modeling the user's attention is useful for responsive and interactive systems. This paper proposes a method for establishing joint visual attention between an experimenter and an intelligent agent. We describe a fast procedure for tracking the experimenter's 3D head pose, which is then used to approximate the gaze direction. The head is modeled with a sparse grid of points sampled from the surface of a cylinder. We then employ a bottom-up saliency model to single out interesting objects in the neighborhood of the estimated focus of attention. We report results from a series of experiments in which a human experimenter looks at objects placed at different locations in the visual field, and the proposed algorithm is used to locate the target objects automatically. Our results indicate that the proposed approach achieves high localization accuracy and thus constitutes a useful tool for the construction of natural human-computer interfaces.
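The geometric part of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the axis conventions (cylinder axis along y, camera-facing direction along +z), the yaw/pitch parameterization, and the assumption that target objects lie on a plane of constant depth are all hypothetical choices made for illustration. The sketch samples a sparse grid of points on the front of a cylinder, derives a gaze direction from the head pose angles, and intersects the gaze ray with the object plane to obtain a candidate focus of attention, around which a saliency model could then be evaluated.

```python
import math

def sample_cylinder_grid(radius, height, n_angles, n_heights, arc=math.pi):
    """Sparse grid of 3D points on the front-facing arc of a cylinder.

    Assumed convention: the cylinder axis is y, and the arc is centered
    on the +z direction. Requires n_angles >= 2 and n_heights >= 2.
    """
    points = []
    for i in range(n_heights):
        y = -height / 2 + height * i / (n_heights - 1)
        for j in range(n_angles):
            theta = -arc / 2 + arc * j / (n_angles - 1)
            points.append((radius * math.sin(theta), y, radius * math.cos(theta)))
    return points

def gaze_direction(yaw, pitch):
    """Unit vector of the head's forward axis for the given yaw and pitch (radians).

    Assumed convention: yaw = pitch = 0 points along +z; the head pose
    is used directly as a proxy for the gaze direction.
    """
    return (
        math.sin(yaw) * math.cos(pitch),
        -math.sin(pitch),
        math.cos(yaw) * math.cos(pitch),
    )

def intersect_plane(origin, direction, plane_z):
    """Intersect a ray with the plane z = plane_z.

    Returns the (x, y) hit point, or None if the ray is parallel to the
    plane or points away from it.
    """
    dz = direction[2]
    if abs(dz) < 1e-9:
        return None
    t = (plane_z - origin[2]) / dz
    if t <= 0:
        return None
    return (origin[0] + t * direction[0], origin[1] + t * direction[1])

# Usage: a head at the origin turned 45 degrees, objects on the plane z = 2.
grid = sample_cylinder_grid(radius=1.0, height=2.0, n_angles=5, n_heights=4)
focus = intersect_plane((0.0, 0.0, 0.0), gaze_direction(math.pi / 4, 0.0), 2.0)
```

In a full system, the grid points would be tracked across video frames to recover the pose angles, and the returned hit point would only define the center of a search neighborhood, since head pose is a coarse approximation of the true gaze.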
