The INESC-ID Multi-Modal System for the ADReSS 2020 Challenge

This paper describes a multi-modal approach to the automatic detection of Alzheimer's disease, proposed as part of the INESC-ID Human Language Technology Laboratory's participation in the ADReSS 2020 challenge. Our classification framework takes advantage of both acoustic and textual feature embeddings, which are extracted independently and later combined. Speech signals are encoded into acoustic features using DNN speaker embeddings extracted from pre-trained models. For textual input, contextual embedding vectors are first extracted using an English BERT model and then used either to directly compute sentence embeddings or to feed a bidirectional LSTM-RNN with attention. Finally, an SVM classifier with a linear kernel is used for the individual evaluation of the three systems. Our best system, based on the combination of linguistic and acoustic information, attained a classification accuracy of 81.25%. The results show the importance of linguistic features in the classification of Alzheimer's disease: they outperform the acoustic features in terms of accuracy. Early-stage feature fusion did not provide additional improvements, confirming that the discriminative ability conveyed by speech is, in this case, smoothed out by the linguistic data.
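The early-fusion step described above can be illustrated with a minimal sketch: one acoustic vector and one textual vector per recording are concatenated, and a linear SVM is trained on the result. Everything here is an assumption made for illustration, not the authors' implementation. The toy vectors stand in for the speaker and BERT-based embeddings, the function names (`fuse`, `train_linear_svm`, `predict`) are hypothetical, and the SVM is a small pure-Python subgradient-descent stand-in for a library classifier.

```python
# Illustrative sketch only: toy data and hypothetical helper names.

def fuse(acoustic, textual):
    """Early fusion: concatenate the two feature vectors of one recording."""
    return list(acoustic) + list(textual)

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss.

    Labels in y must be +1 or -1. Returns the weight vector and the bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin: hinge and regulariser both contribute.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Outside the margin: only the regulariser shrinks w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy stand-ins for per-recording embeddings (assumed values, not real data).
acoustic = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
textual = [[1.0, 0.2, 0.8], [0.9, 0.1, 0.7], [-1.0, 0.3, -0.6], [-0.8, 0.2, -0.9]]
y = [1, 1, -1, -1]  # +1 and -1 are the two arbitrary class labels

X = [fuse(a, t) for a, t in zip(acoustic, textual)]
w, b = train_linear_svm(X, y)
train_predictions = [predict(w, b, x) for x in X]
print(train_predictions)
```

Concatenation is only one way to combine the two modalities; in practice the feature blocks would usually be normalised before fusion so that neither dominates purely because of its scale.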
