Elicitation Design for Acoustic Depression Classification: An Investigation of Articulation Effort, Linguistic Complexity, and Word Affect

Assessment of neurological and psychiatric disorders like depression is unusual from a speech processing perspective, in that speakers can be prompted or instructed in what they should say (e.g. as part of a clinical assessment). Although prior speech-based depression studies have used a variety of speech elicitation methods, there has been little evaluation of which elicitation mode works best. One approach to understanding this better is to analyze an existing database from the perspective of articulation effort, word affect, and linguistic complexity measures as proxies for depression sub-symptoms (e.g. psychomotor retardation, negative stimulus suppression, cognitive impairment). Here, a novel measure for quantifying articulation effort is introduced; when applied experimentally to the DAIC corpus, it shows promise for identifying speech data that are more discriminative of depression. Interestingly, experimental results demonstrate that selecting speech with higher articulation effort, linguistic complexity, or word-based arousal/valence improves acoustic feature-based depression classification performance, serving as a guide for future elicitation design.
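The selection strategy summarized above can be pictured as a per-utterance filtering step applied before acoustic feature extraction: score each utterance by an articulation-effort proxy and by word-based affect, and pass only the higher-scoring speech to the classifier. The following minimal Python sketch illustrates that idea; the phones-per-second effort proxy, the toy valence/arousal lexicon, and the thresholds are hypothetical stand-ins for illustration, not the measure introduced in the paper.

    # Minimal sketch (not the paper's implementation) of elicitation-aware data
    # selection: keep only utterances whose proxy scores exceed a threshold
    # before acoustic feature extraction. The effort proxy, affect lexicon,
    # and thresholds below are hypothetical placeholders.
    from dataclasses import dataclass
    from typing import Dict, List, Tuple

    # Hypothetical affect lexicon: word -> (valence, arousal) on a 1-9 scale.
    AFFECT_LEXICON: Dict[str, Tuple[float, float]] = {
        "terrible": (2.0, 6.5),
        "tired": (3.0, 3.5),
        "fine": (6.5, 3.0),
    }

    @dataclass
    class Utterance:
        text: str
        duration_s: float   # utterance duration in seconds
        phone_count: int    # number of phones from a forced alignment

    def articulation_effort_proxy(utt: Utterance) -> float:
        """Crude effort proxy: phones per second (an assumption, not the paper's measure)."""
        return utt.phone_count / max(utt.duration_s, 1e-6)

    def mean_arousal(utt: Utterance) -> float:
        """Mean word arousal over words found in the (hypothetical) lexicon."""
        scores = [AFFECT_LEXICON[w][1] for w in utt.text.lower().split() if w in AFFECT_LEXICON]
        return sum(scores) / len(scores) if scores else 0.0

    def select_utterances(utts: List[Utterance],
                          min_effort: float = 10.0,
                          min_arousal: float = 4.0) -> List[Utterance]:
        """Keep utterances scored as potentially more discriminative of depression."""
        return [u for u in utts
                if articulation_effort_proxy(u) >= min_effort
                or mean_arousal(u) >= min_arousal]

    if __name__ == "__main__":
        data = [
            Utterance("i feel terrible today", 2.1, 24),
            Utterance("fine", 0.6, 4),
        ]
        for u in select_utterances(data):
            print(u.text)  # utterances passed on to acoustic feature extraction

In practice, the proxy scores would be replaced by the paper's articulation effort measure together with established affect lexica and linguistic complexity indices, with selection thresholds tuned on held-out data.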
