Recognizing People in Motion

Natural movements of the face and body, together with the voice, provide converging cues to a person's identity. To date, person recognition has been studied primarily with static images of faces. Face recognition, however, is part of a larger system whose primary goal is to efficiently recognize familiar people in motion in unconstrained environments. We present a comprehensive framework for understanding person recognition as it happens in the real world. In this framework, dynamic information plays the central role in binding multimodal information from the face, body, and voice to achieve robust and highly accurate recognition. The superior temporal sulcus (STS) integrates multisensory, dynamic information from the whole person for recognition, thereby complementing its role in social cognition.
