A multimodal system to characterise melancholia: cascaded bag of words approach

Recent years have seen considerable activity in affective computing aimed at the automated analysis of depression. However, no research has so far proposed a multimodal system for classifying subtypes of depression such as melancholia. The mental state assessment of a mood disorder depends primarily on appearance, behaviour, speech, thought, perception, mood and facial affect. Mood and facial affect contribute most to distinguishing melancholia from non-melancholic depression. These features are assessed by clinicians and are therefore vulnerable to subjective judgement. As a result, clinical assessment alone may not accurately capture the presence or absence of specific disorders such as melancholia, a distressing condition whose presence has important treatment implications. Melancholia is characterised by severe anhedonia and psychomotor disturbance, which can combine motor retardation with periods of superimposed agitation. Psychomotor disturbance can be sensed in both the face and the voice. To the best of our knowledge, this study is the first attempt to propose a multimodal system that differentiates melancholia from non-melancholic depression and healthy controls. We report the sensitivity and specificity of classification across these depressive subtypes.
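The sketch below illustrates, in general terms, how a bag-of-words encoding of audio-visual low-level descriptors can feed a classifier and yield the sensitivity and specificity figures referred to above. It is a minimal, hypothetical illustration: the descriptor dimensionality, codebook size, SVM settings and the placeholder data are assumptions for demonstration only, not the configuration used in this study.

```python
# Hypothetical bag-of-words pipeline over per-frame low-level descriptors (LLDs),
# followed by a linear SVM and sensitivity/specificity computation.
# All data below are random placeholders; sizes and settings are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix


def bag_of_words(lld_frames, codebook):
    """Quantise per-frame descriptors against a learned codebook and
    return a normalised histogram (one fixed-length vector per recording)."""
    assignments = codebook.predict(lld_frames)
    hist = np.bincount(assignments, minlength=codebook.n_clusters)
    return hist / max(hist.sum(), 1)


rng = np.random.default_rng(0)

# Learn a codebook (assumed 64 "words") from pooled training-set frames.
train_frames = rng.normal(size=(5000, 20))            # placeholder LLD frames
codebook = KMeans(n_clusters=64, n_init=10, random_state=0).fit(train_frames)

# Encode each recording as a histogram, then train a binary SVM
# (1 = melancholia, 0 = non-melancholic depression); recordings are placeholders.
recordings = [rng.normal(size=(300, 20)) for _ in range(40)]
labels = np.array([0, 1] * 20)
X = np.vstack([bag_of_words(r, codebook) for r in recordings])
clf = SVC(kernel="linear").fit(X, labels)

# Sensitivity and specificity from the binary confusion matrix.
tn, fp, fn, tp = confusion_matrix(labels, clf.predict(X)).ravel()
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

In practice the per-frame descriptors would come from separate audio and video front ends, and each modality could be quantised against its own codebook before fusion; the single-codebook version shown here is only the simplest possible arrangement.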
