Tools and Technologies for Computer-Aided Speech and Language Therapy

This paper addresses the problem of Computer-Aided Speech and Language Therapy (CASLT). The goal of the work described here is to develop and evaluate a semi-automated system that provides interactive speech therapy to a growing population of individuals with speech impairments and supports professional speech therapists. The paper discusses the development and evaluation of a set of interactive therapy tools, together with the underlying speech technologies that support them. The tools are designed to facilitate the acquisition of language skills in the areas of basic phonatory skills, phonetic articulation, and language understanding, primarily for children with neuromuscular disorders such as dysarthria. Human-machine interaction in all of these areas requires speech analysis, speech recognition, and speech verification algorithms that are robust to the sources of speech variability characteristic of this population of speakers. The paper presents an experimental study demonstrating the effectiveness of an interactive system for eliciting speech from a population of impaired children and young speakers ranging in age from 11 to 21 years. The performance of automatic speech recognition (ASR) and subword-based pronunciation verification (PV) systems on this domain is also reported. The results indicate that ASR and PV systems configured from speech utterances taken from the impaired speech domain can provide adequate performance, comparable to the agreement rate among human experts, for supporting the presented CASLT applications.
