Head Motion Modeling for Human Behavior Analysis in Dyadic Interaction

This paper presents a computational study of head motion in human interaction, focusing on its role in conveying interlocutors' behavioral characteristics. Head motion is physically complex and carries rich information; however, current modeling approaches based on visual signals are still limited in their ability to capture these properties. Guided by the methodology of kinesics, we propose a data-driven approach to identify typical head motion patterns. The approach has three steps: first segmenting motion events, then representing each motion parametrically with linear predictive features, and finally generalizing the motion types using Gaussian mixture models. The proposed approach is experimentally validated using video recordings of communication sessions from real couples involved in a couples therapy study. In particular, we use the head motion model to classify binarized expert judgments of the interactants' specific behavioral characteristics where entrainment in head motion is hypothesized to play a role: Acceptance, Blame, Positive, and Negative behavior. We achieve accuracies in the range of 60% to 70% across the various experimental settings and conditions. In addition, we describe a measure of motion similarity between the interaction partners based on the proposed model. We show that the relative change of head motion similarity during the interaction correlates significantly with the expert judgments of the interactants' behavioral characteristics. These findings demonstrate the effectiveness of the proposed head motion model and underscore the promise of analyzing human behavioral characteristics through signal processing methods.
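As an illustration of the second step only, the sketch below computes linear predictive coefficients for a single one-dimensional motion segment using the autocorrelation method with the Levinson-Durbin recursion. This is a generic textbook formulation, not necessarily the paper's implementation: the function names, the predictor order, and the synthetic test segment are all assumptions made for this example, and the abstract does not state which motion signal or model order the authors used.

```python
# Minimal sketch (assumed details): linear predictive features for one
# motion segment, via the autocorrelation method and Levinson-Durbin.
# Model: x[n] is approximated by sum_k a[k] * x[n - 1 - k], k = 0..order-1.

def autocorrelation(x, max_lag):
    """Return raw autocorrelation values r[0..max_lag] of sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Return (predictor coefficients, residual energy) for segment x."""
    r = autocorrelation(x, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # segment is silent or perfectly predicted; stop early
        # Reflection coefficient for model order i + 1.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err

# Hypothetical segment: a decaying exponential, which a first-order
# predictor with coefficient 0.9 fits almost exactly.
segment = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(segment, order=2)
```

In a pipeline like the one described, such coefficient vectors (one per segmented motion event) would be the inputs to the mixture-model step; here they are simply returned as a Python list.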
