Spoken language processing techniques for sign language recognition and translation

We present an approach to automatically recognize sign language and translate it into a spoken language. The system addressing these tasks is built on state-of-the-art techniques from statistical machine translation, speech recognition, and image processing research. Such a system is needed for communication between deaf and hearing people, which is otherwise nearly impossible because sign language skills are largely missing on the hearing side and reading and writing skills are often low on the deaf side. In contrast to most current approaches, which focus on the recognition of isolated signs only, we present a system that recognizes complete sentences in sign language. As in speech recognition, we have to deal with temporal sequences; however, instead of an acoustic signal, we process a video signal as input. We therefore use a speech recognition system to obtain a textual representation of the signed sentences. This intermediate representation is then fed into a statistical machine translation system to produce a translation into a spoken language. To achieve good results, particularities of sign languages are taken into account in both systems. We evaluate the proposed system on a publicly available corpus and report very promising results.
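The two-stage pipeline described above can be summarized with the standard Bayes decision rules used in speech recognition and statistical machine translation. The following is a minimal sketch of that formulation, not a verbatim reproduction of the system; the symbols (a video feature sequence x_1^T, a recognized sign sequence w_1^N, and a spoken-language sentence e_1^I) are illustrative assumptions.

% Minimal sketch of the two decision rules (illustrative notation only).
%
% Recognition stage: choose the sign sequence w_1^N that best explains
% the observed video feature sequence x_1^T.
\hat{w}_1^N
  = \operatorname*{argmax}_{w_1^N} \Pr(w_1^N \mid x_1^T)
  = \operatorname*{argmax}_{w_1^N} \Pr(x_1^T \mid w_1^N)\,\Pr(w_1^N)

% Translation stage: choose the spoken-language sentence e_1^I that best
% explains the recognized intermediate representation w_1^N.
\hat{e}_1^I
  = \operatorname*{argmax}_{e_1^I} \Pr(e_1^I \mid w_1^N)
  = \operatorname*{argmax}_{e_1^I} \Pr(w_1^N \mid e_1^I)\,\Pr(e_1^I)

In this reading, the recognition output \hat{w}_1^N serves as the intermediate textual representation that is passed on to the translation stage.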
