Affective State Recognition in Married Couples' Interactions Using PCA-Based Vocal Entrainment Measures with Multiple Instance Learning

Recently there has been a growing effort in Behavioral Signal Processing (BSP), which aims to bring quantitative analysis using signal processing techniques to the domain of observational coding. Currently, observational coding in fields such as psychology relies on subjective expert coding of abstract human interaction dynamics. In this work, we use a Multiple Instance Learning (MIL) framework, a saliency-based prediction model, with a signal-driven vocal entrainment measure as the feature to predict the affective state of a spouse in problem-solving interactions. We generate 18 MIL classifiers to capture the variable-length saliency of vocal entrainment, and use a cross-validation scheme with maximum accuracy and mutual information as the metrics to select the best-performing classifier for each test couple. This method obtains a recognition accuracy of 53.93%, a 2.14% absolute (4.13% relative) improvement over a baseline model using a Support Vector Machine. Furthermore, this MIL-based framework has potential for identifying meaningful regions of interest for further detailed analysis of married couples' interactions.
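The MIL setup described above treats each interaction as a "bag" of instances (here, segments scored by a vocal entrainment measure) and labels the bag by its most salient instance, then selects the best classifier per test couple via cross-validation. The sketch below illustrates that general pattern only; the instance scores, the two toy classifiers, and the accuracy-only selection criterion are hypothetical simplifications (the paper combines accuracy with mutual information, and its features are PCA-based entrainment measures, not the raw numbers shown here).

```python
import numpy as np


def bag_predict(instance_scores, threshold=0.5):
    """Standard MIL assumption: a bag is positive if any one of its
    instances is sufficiently salient (max over instance scores)."""
    return int(max(instance_scores) >= threshold)


def select_best_classifier(classifiers, bags, labels):
    """Pick the classifier whose bag-level predictions best match the
    labels. A stand-in for the paper's cross-validation selection,
    using accuracy alone rather than accuracy plus mutual information."""
    best_idx, best_acc = 0, -1.0
    for i, clf in enumerate(classifiers):
        preds = [bag_predict(clf(bag)) for bag in bags]
        acc = float(np.mean([p == y for p, y in zip(preds, labels)]))
        if acc > best_acc:
            best_idx, best_acc = i, acc
    return best_idx, best_acc


# Toy example: two bags of hypothetical per-segment entrainment scores.
bags = [[0.2, 0.9], [0.1, 0.3]]
labels = [1, 0]

# Two hypothetical classifiers mapping a bag to instance-level scores.
classifiers = [
    lambda bag: bag,                    # scores used as-is
    lambda bag: [1.0 - x for x in bag],  # inverted scores
]

idx, acc = select_best_classifier(classifiers, bags, labels)
```

In this toy run the identity classifier matches both bag labels, so it is selected with perfect accuracy, mirroring how the framework keeps, for each test couple, whichever of its 18 classifiers performs best under the chosen metric.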
