Measuring and modeling the perception of natural and unconstrained gaze in humans and machines

Humans are remarkably adept at interpreting the gaze direction of other individuals in their surroundings. This skill is at the core of the ability to engage in joint visual attention, which is essential for establishing social interactions. How accurate are humans in determining the gaze direction of others in lifelike scenes, when they can move their heads and eyes freely, and what are the sources of information for the underlying perceptual processes? These questions pose a challenge from both empirical and computational perspectives, due to the complexity of the visual input in real-life situations. Here we empirically measure human accuracy in perceiving the gaze direction of others in lifelike scenes, and computationally study the sources of information and the representations underlying this cognitive capacity. We show that humans perform better in face-to-face conditions than in recorded conditions, and that this advantage is not due to the availability of input dynamics. We further show that humans still perform well when only the eye region is visible, rather than the whole face. We develop a computational model that replicates the pattern of human performance, including the finding that the eye region contains, on its own, the information required for estimating both head orientation and direction of gaze. Consistent with neurophysiological findings on task-specific face regions in the brain, the learned computational representations reproduce perceptual effects such as the Wollaston illusion when trained to estimate direction of gaze, but not when trained to recognize objects or faces.
