Fast sign language recognition benefited from low rank approximation

This paper proposes a framework based on Hidden Markov Models (HMMs) that benefits from a low rank approximation of the original sign videos in two respects. First, based on the observation that most of the visual information in a sign sequence is concentrated in a limited number of key frames, we apply, for the first time, an online low rank approximation of sign videos to select the key frames. Second, rather than fixing the number of hidden states across a large vocabulary of varied signs, we further use the low rank approximation to determine that number independently for each sign, so as to optimise predictions. Combining key frame selection with a per-sign number of hidden states, we propose an advanced HMM-based framework for Sign Language Recognition (SLR), denoted Light-HMMs because it uses fewer frames and properly estimated hidden states. Using the Kinect sensor, we fully exploit RGB-D data for the feature representation. In each frame, we adopt the Skeleton Pair feature to characterise the motion and extract Histograms of Oriented Gradients as the feature of the hand posture appearance. The proposed framework is computationally efficient and achieves an even higher classification accuracy. Extensive experiments are conducted on large vocabulary sign datasets with up to 1000 classes of signs, and encouraging results are obtained.
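To make the key frame idea concrete, the following is a minimal sketch rather than the algorithm used in the paper. It assumes each frame is already a feature vector, and it swaps in a simple greedy rule: a frame becomes a key frame when the subspace spanned by the key frames seen so far cannot approximate it within a relative tolerance. The function name `select_key_frames`, the parameter `tol`, and the choice of taking the number of key frames as the number of hidden states are all illustrative assumptions.

```python
import numpy as np


def select_key_frames(frames, tol=0.1):
    """Greedy online key frame selection (illustrative, not the paper's method).

    frames: array-like of shape (T, D), one feature vector per frame.
    tol: hypothetical relative residual threshold.
    Returns the indices of the frames kept as key frames.
    """
    frames = np.asarray(frames, dtype=float)
    # Rows of `basis` are orthonormal vectors spanning the key frames so far.
    basis = np.zeros((0, frames.shape[1]))
    keys = []
    for t, x in enumerate(frames):
        norm = np.linalg.norm(x)
        if norm == 0.0:
            continue  # an all-zero frame carries no information
        # Part of the frame that the current low rank basis cannot explain.
        residual = x - basis.T @ (basis @ x)
        res_norm = np.linalg.norm(residual)
        if res_norm > tol * norm:
            keys.append(t)
            basis = np.vstack([basis, residual / res_norm])
    return keys


# Toy sequence: six frames that lie in a two-dimensional subspace.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
frames = np.stack([a, a, 2 * a, b, b, a + b])

keys = select_key_frames(frames)
n_states = len(keys)  # assumption: one hidden state per key frame
print(keys, n_states)  # prints: [0, 3] 2
```

In this toy run only two of the six frames are kept, and the same count is reused as a per-sign number of hidden states. A real system would operate on the Skeleton Pair and HOG features and would need a tolerance tuned on data.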
