Adapting acoustic and lexical models to dysarthric speech

Dysarthria is a motor speech disorder resulting from neurological damage to the part of the brain that controls the physical production of speech. It is, in part, characterized by pronunciation errors that include deletions, substitutions, insertions, and distortions of phonemes. These errors follow consistent intra-speaker patterns that we exploit through acoustic and lexical model adaptation to improve automatic speech recognition (ASR) on dysarthric speech. We show that acoustic model adaptation yields an average relative word error rate (WER) reduction of 36.99% and that pronunciation lexicon adaptation (PLA) further reduces the relative WER by an average of 8.29% on a large vocabulary task of over 1500 words for six speakers with severe to moderate dysarthria. PLA also shows an average relative WER reduction of 7.11% on speaker-dependent models evaluated using 5-fold cross-validation.