Evaluating Progression of Alzheimer's Disease by Regression and Classification Methods in a Narrative Language Test in Portuguese

Automated discourse analysis tools aimed at diagnosing language-impairing dementias already exist for English, but no such work had been done for Portuguese. Here, we describe a unified environment, Coh-Metrix-Dementia, built on Coh-Metrix-Port, an earlier discourse analysis tool. After adding 25 new metrics for measuring syntactic complexity, idea density, and text cohesion through latent semantics, Coh-Metrix-Dementia extracts 73 features from narratives produced by healthy aging controls (CTL) and by patients with Alzheimer’s Disease (AD) and Mild Cognitive Impairment (MCI). This paper presents initial experiments in automatically distinguishing CTL, AD, and MCI participants using a narrative language test based on sequenced pictures and textual analysis of the resulting transcriptions. To train regression and classification models, the large feature set in Coh-Metrix-Dementia must first be reduced, so three feature selection methods are compared. In our classification experiments, it was possible to separate CTL, AD, and MCI with an \(F_1\) score of 0.817, and to separate CTL and MCI with an \(F_1\) score of 0.900. For regression, the best mean absolute error (MAE) values were 0.238 and 0.120 for the three-class and two-class scenarios, respectively.
