A Voice-Input Voice-Output Communication Aid for People With Severe Speech Impairment

A new form of augmentative and alternative communication (AAC) device for people with severe speech impairment, the voice-input voice-output communication aid (VIVOCA), is described. The VIVOCA recognizes the disordered speech of the user and builds messages, which are converted into synthetic speech. System development employed user-centered design methods, which identified and refined key requirements for the device. A novel methodology for building small-vocabulary, speaker-dependent automatic speech recognizers with reduced amounts of training data was applied. Experiments showed that this method generates good recognition performance (mean accuracy 96%) on highly disordered speech, even when recognition perplexity is increased. The selected message-building technique traded off several factors, including speed of message construction and the range of available message outputs. The VIVOCA was evaluated in a field trial by individuals with moderate to severe dysarthria, which confirmed that they could use the device to produce intelligible speech output from disordered speech input. The trial also highlighted issues that limit the performance and usability of the device in real usage situations, where mean recognition accuracy fell to 67%. These limitations will be addressed in future work.
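The abstract reports recognition performance as mean accuracy (96% in the lab, 67% in the field). The exact scoring procedure is not defined here; a common convention in small-vocabulary ASR evaluation (an assumption on our part, not a claim about this paper's method) is the HTK-style word accuracy, which penalizes substitutions, deletions, and insertions against the number of reference words:

```python
def recognition_accuracy(n_ref: int, subs: int, dels: int, ins: int) -> float:
    """HTK-style word accuracy: (N - S - D - I) / N.

    n_ref -- number of words in the reference transcription
    subs  -- substitution errors
    dels  -- deletion errors
    ins   -- insertion errors
    Note that insertions can drive this measure below zero.
    """
    if n_ref <= 0:
        raise ValueError("reference word count must be positive")
    return (n_ref - subs - dels - ins) / n_ref

# Illustrative figures only: 100 reference words with 3 substitutions
# and 1 deletion give the 96% accuracy quoted in the abstract.
print(recognition_accuracy(100, 3, 1, 0))  # 0.96
```

Under this convention, the drop from 96% to 67% in field conditions corresponds to roughly one recognition error in every three words spoken.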