Deaf Talk using 3D animated sign language: A sign language interpreter using Microsoft's Kinect v2

This paper describes a neoteric approach to bridging the communication gap between deaf people and hearing people. Every community includes people who face severe difficulties in communication due to speech and hearing impairments. Such people use gestures and symbols to send and receive messages, and this mode of communication is called sign language. Yet the communication problem does not end there: natural language speakers do not understand sign language, which creates a communication gap. There is therefore a need for a system that can act as an interpreter for sign language speakers and a translator for natural language speakers. To this end, a software-based solution has been developed in this research by exploiting Microsoft's latest technology, Kinect for Windows v2. The proposed system, dubbed Deaf Talk, acts as a sign language interpreter and translator to provide a dual mode of communication between sign language speakers and natural language speakers. The dual mode of communication consists of two independent modules: (1) sign/gesture-to-speech conversion and (2) speech-to-sign-language conversion. In the sign-to-speech module, the person with a speech impairment stands within the Kinect's field of view (FOV) and performs sign language gestures. The system captures the performed gestures through the Kinect sensor and recognizes them by comparing them with trained gestures already stored in a database. Once a gesture is recognized, it is mapped to its corresponding keyword. The keywords are then passed to a text-to-speech module, which speaks the sentence for the natural language speaker. Conversely, the speech-to-sign-language module translates spoken language into sign language. In this case, the hearing person stands in the Kinect sensor's FOV and speaks in his or her native language (English in this case). The system converts the speech to text using a speech-to-text API, maps the resulting keywords to their corresponding pre-stored animated gestures, and plays the animations on the screen for the spoken sentence. In this way the deaf person can see the spoken sentence rendered as 3D animated sign language. The accuracy of Deaf Talk is 87 percent for speech-to-sign-language conversion and 84 percent for sign-language-to-speech conversion.
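As a rough illustration of the gesture-to-keyword step, the sketch below matches a flattened vector of Kinect skeletal joint coordinates against stored templates by nearest-neighbor distance. The template values, threshold, and function names are hypothetical: the abstract does not specify the matching algorithm, so this is only one plausible realization, not the paper's implementation.

    import math

    # Hypothetical trained templates: gesture keyword -> flattened, normalized
    # (x, y, z) joint coordinates captured during training (placeholder values).
    TEMPLATES = {
        "hello":  [0.00, 1.20, 0.30, 0.52, 1.41, 0.28],
        "thanks": [0.10, 0.95, 0.40, 0.48, 1.10, 0.35],
    }

    MATCH_THRESHOLD = 0.25  # assumed maximum distance for an accepted match


    def euclidean(a, b):
        """Euclidean distance between two flattened joint-coordinate vectors."""
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


    def recognize_gesture(observed):
        """Return the keyword of the closest stored template, or None.

        `observed` is a flattened joint vector from one Kinect frame,
        normalized the same way as the training data.
        """
        best_word, best_dist = None, float("inf")
        for word, template in TEMPLATES.items():
            dist = euclidean(observed, template)
            if dist < best_dist:
                best_word, best_dist = word, dist
        return best_word if best_dist <= MATCH_THRESHOLD else None


    if __name__ == "__main__":
        frame = [0.02, 1.18, 0.31, 0.50, 1.39, 0.30]  # simulated Kinect frame
        word = recognize_gesture(frame)
        if word is not None:
            print("Recognized keyword:", word)  # would be sent to text-to-speech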
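For the reverse direction, the following sketch shows how text recognized by a speech-to-text API could be mapped, keyword by keyword, to pre-stored 3D animation clips queued for playback. The keyword-to-clip table and file names are assumptions for illustration; the fingerspelling fallback mentioned in the comment is a common design choice, not something the abstract claims.

    # Hypothetical mapping from recognized keywords to pre-stored 3D animation
    # clips (file names are placeholders, not from the paper).
    ANIMATIONS = {
        "hello": "clips/hello.anim",
        "how":   "clips/how.anim",
        "you":   "clips/you.anim",
    }


    def sentence_to_clips(text):
        """Map a recognized sentence to an ordered list of animation clips.

        Words without a stored gesture are skipped here; a fuller system
        might instead fall back to fingerspelling them letter by letter.
        """
        clips = []
        for word in text.lower().split():
            clip = ANIMATIONS.get(word.strip(".,?!"))
            if clip is not None:
                clips.append(clip)
        return clips


    if __name__ == "__main__":
        # In practice the input text would come from the speech-to-text API.
        print(sentence_to_clips("Hello, how are you?"))
        # -> ['clips/hello.anim', 'clips/how.anim', 'clips/you.anim']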
