On the relative importance of vocal source, system, and prosody in human depression

In Major Depressive Disorder (MDD), neurophysiologic changes can alter motor control [1][2] and thereby alter speech production by influencing vocal fold motion (source), the vocal tract (system), and melody (prosody). In this paper, we use a database of voice recordings from 28 depressed subjects treated over a 6-week period [3] to compare correlations between features from each of the three speech-production components and clinical assessments of MDD. Toward biomarkers for audio-based continuous monitoring of depression severity, we explore how these correlations depend on context, using free-response and read speech, and show tradeoffs across categories of features in these two example contexts. We also investigate how correlations between our vocal features and assessments of individual MDD symptoms (e.g., depressed mood, agitation, energy) depend on context and speech component. Finally, motivated by our initial findings, we describe how context may be useful in “on-body” monitoring of MDD to facilitate identification of depression and evaluation of its treatment.
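As a rough illustration of the comparison described above, the sketch below correlates one feature per speech-production component with a severity score, separately for each speaking context. It is only a sketch under stated assumptions: the abstract does not say which correlation statistic or which features were used, so the Pearson coefficient, the function names `pearson` and `correlation_table`, and every number shown are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def correlation_table(features, scores):
    """Correlate each (context, component) feature series with the scores.

    `features` maps context -> component -> list of per-session values.
    Returns a dict keyed by (context, component).
    """
    return {
        (context, component): pearson(values, scores)
        for context, components in features.items()
        for component, values in components.items()
    }


if __name__ == "__main__":
    # Made-up severity scores for six sessions (illustration only).
    severity = [22, 18, 15, 12, 9, 7]
    # Made-up feature values; one hypothetical feature per component.
    features = {
        "free-response": {
            "source": [0.9, 0.8, 0.8, 0.6, 0.5, 0.4],
            "system": [1.1, 1.3, 1.2, 1.4, 1.6, 1.5],
            "prosody": [3.0, 3.4, 3.3, 3.9, 4.2, 4.4],
        },
        "read": {
            "source": [0.7, 0.8, 0.6, 0.6, 0.5, 0.5],
            "system": [1.0, 1.1, 1.3, 1.3, 1.5, 1.7],
            "prosody": [3.1, 3.0, 3.5, 3.4, 3.8, 3.7],
        },
    }
    for key, r in sorted(correlation_table(features, severity).items()):
        print(key, round(r, 2))
```

Keying the result by (context, component) makes it straightforward to compare, for a given component, how the correlation changes between read and free-response speech.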

[1]  D. DeBrota, et al. The responsiveness of the Hamilton Depression Rating Scale, 2000, Journal of psychiatric research.

[2]  D. Mitchell Wilkes, et al. Acoustical properties of speech as indicators of depression and suicidal risk, 2000, IEEE Transactions on Biomedical Engineering.

[3]  J. Peifer, et al. Analysis of prosodic variation in speech for clinical depression, 2003, Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE Cat. No.03CH37439).

[4]  P. Bech, et al. The Hamilton Depression Scale, 1981, Acta psychiatrica Scandinavica.

[5]  Thomas F. Quatieri, et al. Vocal-Source Biomarkers for Depression: A Link to Psychomotor Activity, 2012, INTERSPEECH.

[6]  Daniel Rudoy, et al. KARMA: Kalman-based autoregressive moving average modeling and inference for formant and antiformant tracking, 2011, The Journal of the Acoustical Society of America.

[7]  H. Sackeim, et al. Psychomotor symptoms of depression, 1997, The American journal of psychiatry.

[8]  J. Mundt, et al. Voice acoustic measures of depression severity and treatment response collected via interactive voice response (IVR) technology, 2007, Journal of Neurolinguistics.

[9]  D. Mitchell Wilkes, et al. Investigation of vocal jitter and glottal flow spectrum as possible cues for depression and near-term suicidal risk, 2004, IEEE Transactions on Biomedical Engineering.

[10]  M. Fava, et al. Major Depressive Disorder, 2000, Neuron.

[11]  M. Hamilton. A rating scale for depression, 1960, Journal of neurology, neurosurgery, and psychiatry.

[12]  Douglas E. Sturim, et al. Automatic Detection of Depression in Speech Using Gaussian Mixture Modeling with Factor Analysis, 2011, INTERSPEECH.

[13]  Jiming Zhou, et al. Embedded Passive Technology Application: Design and Fabrication of an Automotive Engine Controller, 2005, 2005 Conference on High Density Microsystem Design and Packaging and Component Failure Analysis.

[14]  M. Caligiuri, et al. Motor and cognitive aspects of motor retardation in depression, 2000, Journal of affective disorders.

[15]  Michael Cannizzaro, et al. Voice acoustical measurement of the severity of major depression, 2004, Brain and Cognition.

[16]  Cathy J. Price, et al. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading, 2012, NeuroImage.

[17]  Philip J. B. Jackson, et al. Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech, 2001, IEEE Trans. Speech Audio Process.

[18]  Daniel Rudoy, et al. Conditionally linear Gaussian models for estimating vocal tract resonances, 2007, INTERSPEECH.

[19]  Paul E. Croarkin, et al. Psychomotor retardation in depression: Biological underpinnings, measurement, and treatment, 2011, Progress in Neuro-Psychopharmacology and Biological Psychiatry.