A Novel Stochastic Transformer-based Approach for Post-Traumatic Stress Disorder Detection using Audio Recording of Clinical Interviews

Post-traumatic stress disorder (PTSD) is a mental disorder that can develop after witnessing or experiencing extremely traumatic events. PTSD can affect anyone, regardless of ethnicity or culture. An estimated one in every eleven people will experience PTSD during their lifetime. The Clinician-Administered PTSD Scale (CAPS) and the PTSD Check List for Civilians (PCL-C) interviews are gold standards in the diagnosis of PTSD. However, both rely on the subject's responses and can therefore be misled by them. This work proposes a deep learning-based approach to PTSD detection from audio recordings of clinical interviews. The approach extracts low-level MFCC features from the recordings and then learns high-level representations with a Stochastic Transformer. It achieves state-of-the-art performance on the eDAIC dataset, with an RMSE of 2.92, which we attribute to its stochastic depth, stochastic deep learning layers, and stochastic activation function.
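To make two of the terms above concrete, the following is a minimal sketch of a residual block with stochastic depth and of the RMSE metric used for evaluation. It is an illustration under stated assumptions, not the architecture of this work: the function names `stochastic_depth_residual` and `rmse`, the `survival_prob` parameter, and the toy `branch` function are all hypothetical, and the stochastic layers and stochastic activation function are not reproduced here.

```python
import numpy as np


def stochastic_depth_residual(x, branch, survival_prob, training, rng):
    """Apply a residual branch with stochastic depth (illustrative sketch).

    x             : input array
    branch        : callable computing the residual branch (hypothetical)
    survival_prob : probability of keeping the branch during training
    training      : True for training mode, False for inference
    rng           : numpy random Generator
    """
    if training:
        # Keep the whole branch with probability survival_prob,
        # otherwise skip it and pass the input through unchanged.
        if rng.random() < survival_prob:
            return x + branch(x)
        return x
    # At inference the branch is always used, scaled by its survival probability.
    return x + survival_prob * branch(x)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.ones(4)
    double = lambda v: 2.0 * v  # toy stand-in for a Transformer sub-layer
    print(stochastic_depth_residual(x, double, 0.8, training=True, rng=rng))
    print(stochastic_depth_residual(x, double, 0.8, training=False, rng=rng))
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Skipping a branch at random during training shortens the effective depth of the network on each pass, while scaling at inference keeps the expected output consistent with training.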
