Audio-visual emotion recognition in adult attachment interview

Automatic multimodal recognition of spontaneous affective expressions is a largely unexplored and challenging problem. In this paper, we explore audio-visual emotion recognition in a realistic human conversation setting, the Adult Attachment Interview (AAI). Assuming that facial and vocal expressions reflect the same coarse affective state, we label positive and negative emotion sequences according to Facial Action Coding System Emotion Codes. Facial texture from the visual channel and prosody from the audio channel are integrated in an Adaboost multi-stream hidden Markov model (AMHMM), in which the Adaboost learning scheme is used to fuse the component HMMs. We evaluate the approach in preliminary experiments on spontaneous emotion recognition in AAI data.
