When Computer Vision Gazes at Cognition

Joint attention is a core, early-developing form of social interaction. It is based on our ability to discriminate the third-party objects that other people are looking at. While it has been shown that people can accurately determine whether another person is looking directly at them versus away, little is known about the human ability to discriminate a third person's gaze directed towards objects that are further away, especially in unconstrained cases where the looker can move her head and eyes freely. In this paper we address this question by jointly exploring human psychophysics and a cognitively motivated computer vision model, which can detect the 3D direction of gaze from 2D face images. The synthesis of behavioral study and computer vision yields several interesting discoveries. (1) Human accuracy in discriminating targets 8°-10° of visual angle apart is around 40% in a free-looking gaze task; (2) the ability to interpret the gaze of different lookers varies dramatically; (3) this variance can be captured by the computational model; (4) humans significantly outperform the current model. These results collectively show that the acuity of human joint attention is indeed highly impressive, given the computational challenge of the natural looking task. Moreover, the gap between human and model performance, as well as the variability of gaze interpretation across different lookers, calls for further understanding of the underlying mechanisms utilized by humans for this challenging task.
