Modelling and segmenting subunits for sign language recognition based on hand motion analysis

Modelling and segmenting subunits is an important topic in sign language study. Many scholars have proposed functional definitions of subunits from a linguistic point of view, but implementing them efficiently with computer vision techniques remains a challenge. Conversely, a number of subunit segmentation methods have been investigated for vision-based sign language recognition, yet their subunits either lack linguistic support or are poorly defined. In this paper, we attempt to define and segment subunits with computer vision techniques in a way that can also be explained by sign language linguistics. A subunit is first defined as a single continuous visual hand action in time and space, comprising a series of interrelated consecutive frames. A simple but efficient scheme is then developed to detect subunit boundaries from discontinuities in hand motion. Finally, temporal clustering by dynamic time warping is adopted to merge similar segments and refine the results. The proposed approach needs no prior knowledge of the sign types or the number of subunits and is more robust to variation in signer behaviour. Furthermore, it correlates highly with the definition of syllables in sign language while sharing characteristics of syllables in spoken languages. Comprehensive experiments on real-world signing videos demonstrate the effectiveness of the proposed model.
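
As a rough illustration of the pipeline outlined above, the sketch below segments a 2-D hand-centroid trajectory at motion discontinuities (low speed or a sharp change of direction) and then merges adjacent segments whose dynamic-time-warping distance is small. This is a minimal Python sketch, not the authors' implementation: the thresholds, the greedy adjacent-segment merging, and all function names are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's implementation): cut a 2-D
# hand trajectory at motion discontinuities, then merge similar segments
# using a small dynamic-time-warping (DTW) distance. Thresholds are guesses.
import numpy as np

def motion_discontinuities(traj, speed_thresh=1.0, angle_thresh=np.deg2rad(60)):
    """Return frame indices where hand motion is discontinuous.

    traj: (T, 2) array of hand centroid positions, one row per frame.
    A boundary is placed where the speed drops below `speed_thresh` or the
    motion direction turns by more than `angle_thresh` between frames.
    """
    v = np.diff(traj, axis=0)                    # per-frame displacement
    speed = np.linalg.norm(v, axis=1)
    ang = np.arctan2(v[:, 1], v[:, 0])           # motion direction per step
    turn = np.abs(np.diff(np.unwrap(ang)))       # direction change per step
    boundaries = [0]
    for t in range(1, len(traj) - 1):
        slow = speed[t - 1] < speed_thresh
        sharp_turn = t - 1 < len(turn) and turn[t - 1] > angle_thresh
        if slow or sharp_turn:
            boundaries.append(t)
    boundaries.append(len(traj) - 1)
    return sorted(set(boundaries))

def dtw_distance(a, b):
    """Plain O(len(a)*len(b)) DTW between two (n, 2) trajectory segments."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)                     # length-normalised distance

def merge_similar_segments(traj, boundaries, merge_thresh=0.5):
    """Greedily merge adjacent segments whose DTW distance is small."""
    segs = [traj[boundaries[i]:boundaries[i + 1] + 1]
            for i in range(len(boundaries) - 1)]
    merged = [segs[0]]
    for seg in segs[1:]:
        if dtw_distance(merged[-1], seg) < merge_thresh:
            merged[-1] = np.vstack([merged[-1], seg[1:]])
        else:
            merged.append(seg)
    return merged

if __name__ == "__main__":
    # Toy trajectory: two straight strokes joined by a sharp 90-degree turn.
    t = np.linspace(0, 1, 30)[:, None]
    traj = np.vstack([t * [10.0, 0.0], [10.0, 0.0] + t * [0.0, 10.0]])
    cuts = motion_discontinuities(traj, speed_thresh=0.2)
    subunits = merge_similar_segments(traj, cuts)
    print(f"{len(cuts) - 1} raw segments -> {len(subunits)} subunits")
```

In a real system the speed and turn thresholds would be tuned on velocity profiles from the hand tracker, and the refinement step would compare all segments (not only adjacent ones) when clustering, as the abstract's temporal clustering suggests.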
