Paper information - AUTOMATIC SPEECH ASSESSMENT FOR APHASIC PATIENTS BASED ON SYLLABLE-LEVEL EMBEDDING AND SUPRASEGMENTAL DURATION FEATURES Conference

Aphasia is a type of acquired language impairment resulting from brain injury. Speech assessment is an important part of the comprehensive assessment process for aphasic patients. It is based on acoustic and linguistic analysis of patients’ speech elicited through pre-defined story-telling tasks. This type of narrative spontaneous speech exhibits multiple atypical characteristics related to the underlying language impairment. This paper presents an investigation of automatic speech assessment for Cantonese-speaking aphasic patients using an automatic speech recognition (ASR) system. A novel approach to extracting robust text features from erroneous ASR output is developed based on word embedding methods. The text features can effectively distinguish the stories told by impaired speakers from those told by unimpaired ones. In addition, a set of supra-segmental duration features is derived from syllable-level time alignments produced by the ASR system, to characterize the atypical prosody of impaired speech. The proposed text features, duration features and their combination are evaluated in a binary classification experiment as well as in automatic prediction of subjective assessment scores. The results clearly show that the text features are very effective in the intended task of aphasia assessment, while using duration features could provide additional benefit.
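As a rough illustration of how duration features might be derived from syllable-level time alignments, the sketch below computes a few common timing statistics. It is not the paper's feature set: the function name `duration_features`, the `sil` pause label, the tuple layout of the alignment, and the four feature names are all assumptions made for this example.

```python
from statistics import mean


def duration_features(alignment, pause_label="sil"):
    """Compute simple timing statistics from a time-ordered alignment.

    `alignment` is a list of (label, start_sec, end_sec) tuples, a
    hypothetical layout; real ASR toolkits use their own formats.
    """
    # Durations of syllable segments and of pause (silence) segments.
    syllables = [end - start for label, start, end in alignment if label != pause_label]
    pauses = [end - start for label, start, end in alignment if label == pause_label]
    if not syllables:
        raise ValueError("alignment contains no syllable segments")

    total_time = alignment[-1][2] - alignment[0][1]  # whole utterance span
    speech_time = sum(syllables)                     # time spent on syllables only

    return {
        "mean_syllable_dur": mean(syllables),
        "speech_rate": len(syllables) / total_time,         # syllables/sec, pauses included
        "articulation_rate": len(syllables) / speech_time,  # syllables/sec, pauses excluded
        "pause_ratio": sum(pauses) / total_time,
    }


# Toy alignment with two syllables and two pauses (labels are illustrative).
example = [("ngo5", 0.0, 0.2), ("sil", 0.2, 0.6), ("hai6", 0.6, 0.9), ("sil", 0.9, 1.0)]
features = duration_features(example)
```

Statistics of this kind could then be combined with text features as classifier input; which features the authors actually used is not stated in the abstract.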
