Weighted pairwise Gaussian likelihood regression for depression score prediction

This paper presents a technique in which feature vectors are mapped onto ordinal ranges of clinical depression scores using weighted pairwise Gaussians. The position of a test vector with respect to these partitions is then used to predict a depression score. Results on a set of spectral and formant-based speech features indicate the potential of this technique for depression score prediction. Key results on the AVEC 2013 development set indicate that including weights and Bayesian adaptation improves system performance by 16.5% to 18.5% compared with an unweighted, non-adapted system. Fusing results from Bayesian-adapted models corresponding to different feature spaces offers up to 8% further improvement. Furthermore, fusion consistently improves performance on both the AVEC 2013 development and test sets, in contrast to conventional regressor fusion.
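The abstract does not give the model equations, so the following is only a rough sketch of the general idea of locating a test vector relative to score partitions with pairs of Gaussians. Everything specific here is an assumption made for illustration and is not taken from the paper: the diagonal-covariance Gaussians, the partition boundaries (14, 20, 29), the 0 to 63 score range, the uniform weights, the logistic combination rule, the synthetic data, and all function names. Bayesian adaptation and fusion are not shown.

```python
import numpy as np

def fit_diag_gaussian(X):
    """Mean and per-dimension variance of the rows of X (diagonal covariance)."""
    mean = X.mean(axis=0)
    var = X.var(axis=0) + 1e-6  # small floor keeps the log-density finite
    return mean, var

def log_likelihood(x, mean, var):
    """Log-density of vector x under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def fit_pairwise(X, y, thresholds):
    """For each boundary, fit one Gaussian to vectors scored below it
    and one to vectors scored at or above it."""
    return [(fit_diag_gaussian(X[y < t]), fit_diag_gaussian(X[y >= t]))
            for t in thresholds]

def predict_score(x, models, thresholds, weights, score_min, score_max):
    """Combine weighted log-likelihood ratios into a score estimate.

    ASSUMPTION: the logistic squashing and the partition-centre walk below
    are an illustrative combination rule, not the paper's formulation.
    """
    edges = np.concatenate(([score_min], thresholds, [score_max]))
    centers = 0.5 * (edges[:-1] + edges[1:])  # midpoint of each partition
    llr = np.array([log_likelihood(x, *high) - log_likelihood(x, *low)
                    for low, high in models])
    z = np.clip(weights * llr, -50.0, 50.0)   # avoid overflow in exp
    p_above = 1.0 / (1.0 + np.exp(-z))        # soft "score >= boundary"
    # Start in the lowest partition and move up by each soft decision.
    return centers[0] + np.sum(p_above * np.diff(centers))

# Synthetic data (hypothetical): two features that grow with the score.
rng = np.random.default_rng(0)
y = rng.integers(0, 64, size=300)
X = 4.0 * y[:, None] / 63.0 + rng.normal(0.0, 0.5, size=(300, 2))

thresholds = np.array([14.0, 20.0, 29.0])  # illustrative boundaries
weights = np.ones(len(thresholds))         # uniform weights (assumption)
models = fit_pairwise(X, y, thresholds)

low_pred = predict_score(np.array([0.0, 0.0]), models, thresholds, weights, 0.0, 63.0)
high_pred = predict_score(np.array([4.0, 4.0]), models, thresholds, weights, 0.0, 63.0)
print(low_pred, high_pred)
```

In this sketch each boundary contributes one soft decision, and the weights scale how strongly each pairwise likelihood ratio influences the final estimate. Setting all weights to 1 corresponds loosely to the unweighted baseline mentioned in the abstract.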
