Space-Time Graphs Based on Interest Point Tracking for Sign Language

This work presents a hand tracking method that achieves the best results reported in the literature for the public RWTH-BOSTON-50 dataset, with a tracking error rate of 8.5%. Its main contribution is the extraction of intrinsic features from RGB sign language movies. To avoid some common limitations of model-based and appearance-based tracking methods, movement pattern analysis is used as the feature basis. The resulting feature can be succinctly described as a space-time graph of interest point movement similarity, arranged as trees of dense trajectory connections, which is used to track the hands. Because the process relies only on basic geometric operations and simple graph methods, it is well suited to parallel processing of large movie datasets.
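To make the idea of a graph of interest point movement similarity arranged as trees more concrete, the following is a minimal sketch under stated assumptions, not the method of this work. It assumes interest point trajectories are already available as an array of shape (N, T, 2). It links two trajectories when they are spatially close and move alike, then keeps a minimum spanning forest via Kruskal's algorithm. The function name `movement_similarity_forest`, the thresholds `max_dist` and `max_vel_diff`, and the similarity measure itself are all hypothetical choices made for illustration.

```python
import numpy as np


def movement_similarity_forest(tracks, max_dist=30.0, max_vel_diff=2.0):
    """Group trajectories by movement similarity (illustrative sketch only).

    tracks: array of shape (N, T, 2), N point trajectories over T frames.
    Returns (forest, groups): forest is a list of (i, j, weight) tree edges,
    groups is a list of index lists, one per tree.
    The thresholds are hypothetical and not taken from the paper.
    """
    tracks = np.asarray(tracks, dtype=float)
    n = tracks.shape[0]
    vel = np.diff(tracks, axis=1)  # per-frame displacement, shape (N, T-1, 2)

    # Candidate edges: pairs that stay close and move similarly on average.
    candidates = []
    for i in range(n):
        for j in range(i + 1, n):
            gap = np.linalg.norm(tracks[i] - tracks[j], axis=1).mean()
            vdiff = np.linalg.norm(vel[i] - vel[j], axis=1).mean()
            if gap <= max_dist and vdiff <= max_vel_diff:
                candidates.append((float(vdiff), i, j))
    candidates.sort()  # most similar movement first

    # Kruskal's algorithm with union-find yields one tree per component.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for weight, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            forest.append((i, j, weight))

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return forest, list(groups.values())


# Toy data: two points moving right together, one far away moving left.
t = np.arange(5, dtype=float)
toy_tracks = np.stack([
    np.stack([t, np.zeros(5)], axis=1),
    np.stack([t, np.full(5, 5.0)], axis=1),
    np.stack([100.0 - t, np.full(5, 100.0)], axis=1),
])
toy_forest, toy_groups = movement_similarity_forest(toy_tracks)
```

In this toy case the first two trajectories end up in one tree and the third stays alone. Each pairwise comparison is independent of the others, so the candidate edge computation could be distributed across workers, which is in the spirit of the parallel processing mentioned above.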
