Segmentation-robust representations, matching, and modeling for sign language

Distinguishing true signs from the transitional, extraneous movements (movement epenthesis) that occur as the signer moves from one sign to the next is a serious hurdle in the design of continuous sign language recognition systems. The problem is further compounded by ambiguities in low-level segmentation and by occlusions. This short paper provides an overview of our experience with representations and matching methods, particularly those that can handle errors in low-level segmentation and uncertainty about sign boundaries within sentences. We have formulated a novel framework that addresses both of these problems using a nested, level building-based dynamic programming approach, which works both for matching two instances of a sign and for matching an instance against an abstracted statistical model in the form of a Hidden Markov Model (HMM). We also present an approach to sign recognition that does not require tracking the hands across frames; instead, it abstracts and uses the global configuration of low-level features from the hands and face. These global representations serve not only for recognition, but also for extracting and automatically learning models of signs from continuous sentences in a weakly supervised manner. Publications that discuss these issues and solutions in more detail can be found at http://marathon.csee.usf.edu/ASL/
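To make the level-building idea concrete, the following is a minimal Python sketch of one such pass: a sentence-level feature sequence is segmented into up to L signs, with optional epenthesis frames skipped between them. Everything here is illustrative rather than the framework itself: the plain DTW distance stands in for the nested matching (in the actual framework, the inner match is itself a dynamic program that also resolves candidate hand segmentations, which is what makes the DP nested), the fixed per-frame penalty `me_penalty` stands in for a proper handling of movement epenthesis, and the function names and brute-force search are hypothetical.

```python
# Minimal sketch of level-building DP over a continuous sign sentence.
# Hypothetical names and features throughout; DTW stands in for the
# nested matching, and a fixed per-frame penalty stands in for a
# learned movement-epenthesis model.
import numpy as np

def dtw(template, segment):
    """Plain DTW distance between a sign template and a candidate segment."""
    n, m = len(template), len(segment)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(template[i - 1] - segment[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def level_building(sentence, templates, max_levels, me_penalty=1.0):
    """Segment `sentence` (T x d array) into up to `max_levels` signs.

    `templates` maps sign names to (n x d) feature arrays. Frames between
    levels may be skipped as movement epenthesis at a fixed per-frame cost.
    Returns (total cost, list of (start, end, sign) labels).
    """
    T = len(sentence)
    best = {0: (0.0, [])}  # end frame -> (best cost so far, labels so far)
    for _ in range(max_levels):
        new_best = dict(best)  # keep hypotheses with fewer signs
        for start, (cost0, labels) in best.items():
            for skip in range(0, T - start):  # epenthesis before next sign
                s = start + skip
                for name, tpl in templates.items():
                    for end in range(s + 1, T + 1):
                        c = cost0 + skip * me_penalty + dtw(tpl, sentence[s:end])
                        if end not in new_best or c < new_best[end][0]:
                            new_best[end] = (c, labels + [(s, end, name)])
        best = new_best
    # allow trailing epenthesis frames after the final sign
    end, (cost, labels) = min(best.items(),
                              key=lambda kv: kv[1][0] + (T - kv[0]) * me_penalty)
    return cost + (T - end) * me_penalty, labels
```

The brute-force enumeration of start and end frames is kept for clarity; a practical system would prune candidate boundaries and cache the inner matching scores rather than recomputing DTW for every segment.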
