Multimodal Machine Learning for Interactive Mental Health Therapy

Mental health disorders are among the leading causes of disability worldwide, yet a large gap persists between the need for their assessment and treatment and the clinical resources available. Automatic behavior analysis for computer-aided mental health assessment can augment those resources in the diagnosis and treatment of patients. Intelligent systems such as virtual agents and social robots can deploy multimodal machine learning to perceive and interact with patients, probing behavioral cues of mental health disorders in interactive scenarios. In this paper, we outline our plans for developing multimodal machine learning methods that augment embodied interactive agents with emotional intelligence, toward probing such cues. We aim to develop a new generation of intelligent agents that create engaging interactive experiences to assist with mental health assessment.
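The multimodal perception described above is commonly realized by extracting per-session features from each modality (e.g., facial behavior, vocal prosody, language) and fusing them before prediction. The sketch below illustrates one such approach, simple late feature-level fusion, on synthetic data; the feature dimensions, the random features, and the linear scoring weights are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-session feature vectors for 4 sessions,
# standing in for real extractor outputs (e.g., facial action
# units, prosodic descriptors, text embeddings).
visual = rng.normal(size=(4, 5))   # 5 visual features per session
audio = rng.normal(size=(4, 3))    # 3 acoustic features per session
text = rng.normal(size=(4, 8))     # 8 linguistic features per session

# Feature-level (early/late hybrid) fusion by concatenation.
fused = np.concatenate([visual, audio, text], axis=1)  # shape (4, 16)

# Illustrative linear scorer with random weights; in practice the
# weights would be learned from labeled clinical interview data.
w = rng.normal(size=fused.shape[1])
scores = 1.0 / (1.0 + np.exp(-(fused @ w)))  # sigmoid -> (0, 1) per session
```

A learned classifier (logistic regression, or a neural fusion model) would replace the random weights, but the fusion step itself, concatenating modality-specific features into one vector, is the same.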
