Still together?: the role of acoustic features in predicting marital outcome

The assessment and prediction of marital outcome in couple therapy has intrigued many clinical psychologists. In this work, we analyze the significance of various acoustic features extracted from couples' spoken interactions in predicting the success or failure of their marriage. We also investigate whether speech acoustic features can provide information complementary to the behavioral descriptions or codes provided by human experts (e.g., relationship satisfaction, blame patterns, global negativity). We formulate marital outcome prediction both as a binary (improvement vs. no improvement) and as a multiclass (different levels of improvement) classification problem. Our experiments show that acoustic features predict marital outcome more accurately than features based on behavioral descriptors provided by human experts. We also find that dialog-turn-level acoustic features generally outperform frame-level signal descriptors. This observation supports the notion that the impact of one interlocutor's behavior on the other matters more than that behavior viewed in isolation. Finally, acoustic features combined with human-derived behavioral codes yield the best outcome-prediction performance, suggesting some complementarity in the information captured by these behavioral representations.
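
To make the classification setup concrete, here is a minimal sketch of how turn-level functionals might be computed from frame-level descriptors and fed to an SVM for the binary outcome task. The simulated data, the helper functions, and the use of scikit-learn are illustrative assumptions, not the authors' actual pipeline; in practice, frame-level descriptors would come from a toolkit such as openSMILE or Praat.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical setup: each couple's session is a sequence of dialog turns,
# and each turn is an (n_frames x n_lld) matrix of frame-level low-level
# descriptors (e.g., pitch, energy). We simulate these here.
rng = np.random.default_rng(0)

def turn_level_functionals(frames):
    """Collapse frame-level descriptors into one turn-level vector
    using simple statistical functionals (mean, std, range)."""
    return np.concatenate([
        frames.mean(axis=0),
        frames.std(axis=0),
        frames.max(axis=0) - frames.min(axis=0),
    ])

def session_features(turns):
    """Average the turn-level vectors over a whole interaction session."""
    return np.mean([turn_level_functionals(t) for t in turns], axis=0)

# Simulated corpus: 40 couples, each with 20 turns of 100 frames x 4 LLDs.
X = np.array([
    session_features([rng.normal(size=(100, 4)) for _ in range(20)])
    for _ in range(40)
])
y = rng.integers(0, 2, size=40)  # binary outcome: improvement vs. none

# Linear SVM with cross-validation, a common choice for small corpora.
# For the multiclass formulation (levels of improvement), SVC extends
# naturally via its built-in one-vs-one decomposition.
clf = SVC(kernel="linear", C=1.0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```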
