A Technology Prototype System for Rating Therapist Empathy from Audio Recordings in Addiction Counseling

Distributed under Creative Commons CC-BY 4.0

Scaling up psychotherapy services such as addiction counseling is a critical societal need. One challenge is ensuring the quality of therapy, because manual observational assessment is costly. This work proposes a speech technology-based system to automate the assessment of therapist empathy, a key therapy quality index, from audio recordings of psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training.
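
As an illustration of how utterance level evidence might be combined into a session level estimate, the following is a minimal sketch and not the system described above. It assumes add-one smoothed unigram language models as a simplified stand-in for the Maximum Likelihood language models, and averages per-utterance log-likelihood ratios into one session score. All function names and the toy utterances are hypothetical.

```python
import math
from collections import Counter


def train_unigram(utterances):
    """Count tokens over a list of whitespace-tokenized utterances."""
    counts = Counter()
    for utt in utterances:
        counts.update(utt.lower().split())
    return counts


def log_prob(utt, counts, vocab_size):
    """Add-one smoothed unigram log-probability of one utterance."""
    total = sum(counts.values())
    return sum(
        math.log((counts[tok] + 1) / (total + vocab_size))
        for tok in utt.lower().split()
    )


def session_score(utterances, high_counts, low_counts, vocab_size):
    """Mean per-utterance log-likelihood ratio (high vs. low empathy model)."""
    ratios = [
        log_prob(u, high_counts, vocab_size) - log_prob(u, low_counts, vocab_size)
        for u in utterances
    ]
    return sum(ratios) / len(ratios)


# Toy, made-up therapist utterances (assumption: not taken from the paper's data).
high_train = ["it sounds like you feel worried", "you feel that this is hard"]
low_train = ["you need to stop drinking", "you should stop now"]

high_counts = train_unigram(high_train)
low_counts = train_unigram(low_train)
# One extra vocabulary slot is reserved for unseen words.
vocab_size = len(set(high_counts) | set(low_counts)) + 1

empathic_session = ["it sounds like this is hard", "you feel worried"]
directive_session = ["you need to stop", "you should stop drinking now"]

score_high = session_score(empathic_session, high_counts, low_counts, vocab_size)
score_low = session_score(directive_session, high_counts, low_counts, vocab_size)

# A positive score favors the high-empathy model, a negative one the low-empathy model.
print(score_high > 0, score_low < 0)
```

In this sketch a threshold at zero gives a binary high vs. low decision, while the raw score could be compared against expert empathy codes with a correlation measure. The actual system may weight or select utterances differently.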
