Multimodal Analysis and Estimation of Intimate Self-Disclosure

Self-disclosure to others has proven benefits for one's mental health, and prior work has shown that disclosure to computers can be similarly beneficial for emotional and psychological well-being. In this paper, we analyzed verbal and nonverbal behavior associated with self-disclosure in two datasets containing structured human-human and human-agent interviews from more than 200 participants. Correlation analysis revealed that linguistic features such as affective and cognitive content, as well as nonverbal behavior such as head gestures, are associated with intimate self-disclosure. We then developed a multimodal deep neural network to automatically estimate the level of intimate self-disclosure from verbal and nonverbal behavior. Among the individual modalities, verbal behavior was the strongest within-corpus predictor of self-disclosure, achieving r = 0.66; in cross-corpus evaluation, however, nonverbal behavior can outperform the language modality. Such automatic models can be deployed in interactive virtual agents or social robots to evaluate rapport and guide their conversational strategy.
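
To make the correlation analysis concrete, the sketch below computes per-feature Pearson correlations between behavioral features and annotated disclosure intimacy. The feature names and the random data are illustrative placeholders, not the paper's actual features (which would come from tools such as LIWC and OpenFace):

```python
# Minimal sketch of a per-feature correlation analysis, assuming one row
# per annotated interview segment. All feature names and values here are
# hypothetical stand-ins for LIWC-style linguistic rates and gesture counts.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_segments = 200  # illustrative number of annotated segments

features = {
    "affective_words": rng.random(n_segments),   # verbal: affective content
    "cognitive_words": rng.random(n_segments),   # verbal: cognitive content
    "head_nod_rate": rng.random(n_segments),     # nonverbal: head gestures
}
intimacy = rng.random(n_segments)  # annotated self-disclosure intimacy

# Correlate each feature with the intimacy annotation.
for name, values in features.items():
    r, p = pearsonr(values, intimacy)
    print(f"{name}: r={r:.2f}, p={p:.3f}")
```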
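
A late-fusion regressor in the spirit of the multimodal network described above might look like the following PyTorch sketch. The per-modality input dimensions (BERT-style sentence embeddings, GeMAPS-style acoustic features, facial action-unit intensities) and the concatenation-based fusion are assumptions for illustration, not the authors' exact architecture:

```python
# Minimal late-fusion sketch: one encoder per modality, concatenated and
# mapped to a scalar intimacy estimate. Dimensions are assumed, not taken
# from the paper.
import torch
import torch.nn as nn

class MultimodalDisclosureRegressor(nn.Module):
    def __init__(self, verbal_dim=768, acoustic_dim=88, visual_dim=17, hidden=64):
        super().__init__()
        self.verbal = nn.Sequential(nn.Linear(verbal_dim, hidden), nn.ReLU())
        self.acoustic = nn.Sequential(nn.Linear(acoustic_dim, hidden), nn.ReLU())
        self.visual = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        # Fused representation -> scalar self-disclosure intimacy estimate.
        self.head = nn.Linear(3 * hidden, 1)

    def forward(self, verbal, acoustic, visual):
        fused = torch.cat(
            [self.verbal(verbal), self.acoustic(acoustic), self.visual(visual)],
            dim=-1,
        )
        return self.head(fused).squeeze(-1)

model = MultimodalDisclosureRegressor()
batch = (torch.randn(4, 768), torch.randn(4, 88), torch.randn(4, 17))
estimate = model(*batch)                                 # one score per segment
loss = nn.functional.mse_loss(estimate, torch.rand(4))   # regression objective
loss.backward()
```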
