Attention-Based Sign Language Recognition Network Utilizing Keyframe Sampling and Skeletal Features

Sign language recognition (SLR) is a multidisciplinary research topic in pattern recognition and computer vision. Because the continuous frames of sign language videos contain a large amount of data, selecting representative frames and discarding irrelevant information has long been a challenging problem in the preprocessing of sign language samples. In recent years, skeletal data has emerged as a new data modality but has received insufficient attention. Meanwhile, as sign language features grow more diverse, making full use of them has become an important research topic. In this paper, we improve keyframe-centered clips (KCC) sampling to obtain a new sampling method, optimized keyframe-centered clips (OptimKCC) sampling, which selects key actions from sign language videos. In addition, we design a new skeletal feature, Multi-Plane Vector Relation (MPVR), to describe the video samples. Finally, we use attention-based networks to assign weights to the temporal and spatial features extracted from the skeletal data. We conduct comparative experiments on our own dataset and a public sign language dataset under signer-independent and signer-dependent settings to demonstrate the advantages of our methods.
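
To make the idea of assigning attention weights to temporal features concrete, the following minimal sketch shows a generic scaled dot-product attention pooling over per-frame feature vectors. It is not the network proposed in this paper: the function names, the `query` vector (standing in for a learned parameter), and the toy feature values are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(frames, query):
    """Weight T per-frame feature vectors and sum them into one vector.

    frames: list of T feature vectors, each of length D (hypothetical
            skeletal features, one vector per sampled frame).
    query:  length-D vector; in a trained network this would be learned,
            here it is a fixed assumption.
    Returns (weights, pooled), where weights sum to 1 over the T frames.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # One scalar relevance score per frame (scaled dot product).
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) / scale
              for f in frames]
    weights = softmax(scores)
    # Attention-weighted sum of the frame features.
    pooled = [sum(w * f[d] for w, f in zip(weights, frames))
              for d in range(dim)]
    return weights, pooled


# Toy input: three frames with two-dimensional features (made-up values).
frames = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = temporal_attention_pool(frames, query)
```

In this toy input the second frame has the largest dot product with `query`, so it receives the largest weight and dominates the pooled vector. The same pattern could in principle be applied along the spatial (joint) axis instead of the temporal one.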
