Combining Augmented Reality and Speech Technologies to Help Deaf and Hard of Hearing People

Augmented Reality (AR), Automatic Speech Recognition (ASR) and Text-to-Speech Synthesis (TTS) can be used to help people with disabilities. In this paper, we combine these technologies into a new system for helping deaf people. The system captures the narrator's speech, converts it into readable text, and shows it directly on an AR display. To improve accuracy, we use Audio-Visual Speech Recognition (AVSR) as a backup for the ASR engine in noisy environments. In addition, we use TTS to make the system more usable for deaf people. Testing shows that the system's accuracy averages over 85 percent across different environments. A survey also shows that, on average, more than 90 percent of deaf respondents are very interested in using our system on portable devices as a communication assistant.
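
To illustrate the pipeline the abstract describes (capture the narrator's speech, run ASR, show the recognized text as an AR caption, fall back to AVSR when the audio is too noisy, and use TTS for spoken replies), the following is a minimal sketch in Python. The library choices (speech_recognition, pyttsx3) and the helpers render_caption_on_ar_display and avsr_fallback are illustrative assumptions, not the authors' implementation.

    # Sketch of the capture -> ASR -> AR caption -> TTS pipeline described in the
    # abstract. Library choices and the AR/AVSR hooks are illustrative assumptions.
    import speech_recognition as sr
    import pyttsx3


    def render_caption_on_ar_display(text: str) -> None:
        """Placeholder for pushing recognized text to the AR overlay."""
        print(f"[AR CAPTION] {text}")


    def avsr_fallback(audio: sr.AudioData) -> str:
        """Placeholder for the audio-visual (lip-reading assisted) recognizer
        that the paper proposes as a backup in noisy environments."""
        raise NotImplementedError("AVSR backend not shown in this sketch")


    def speak_reply(reply_text: str) -> None:
        """Voice a typed reply from the deaf user via TTS."""
        engine = pyttsx3.init()
        engine.say(reply_text)
        engine.runAndWait()


    def recognize_and_caption() -> None:
        recognizer = sr.Recognizer()

        with sr.Microphone() as source:
            # Calibrate for ambient noise, then capture one utterance.
            recognizer.adjust_for_ambient_noise(source, duration=1.0)
            audio = recognizer.listen(source)

        try:
            # Primary path: audio-only ASR.
            text = recognizer.recognize_google(audio)
        except (sr.UnknownValueError, sr.RequestError):
            # Noisy environment or failed request: fall back to AVSR.
            text = avsr_fallback(audio)

        # Show the narrator's words as a caption on the AR display.
        render_caption_on_ar_display(text)


    if __name__ == "__main__":
        recognize_and_caption()
        speak_reply("Thank you, I understood you.")

In a full implementation, the AVSR branch would combine the captured audio with lip-region features extracted from the device camera, which is the backup path the paper proposes for noisy environments.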
