ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks

Communication between a person from the impaired community and a person who does not understand sign language can be a tedious task. Sign language is the art of conveying messages using hand gestures. Recognition of dynamic hand gestures in American Sign Language (ASL) remains an important and still unresolved challenge. To address the challenges of dynamic ASL recognition, a more advanced successor of Convolutional Neural Networks (CNNs), the 3-D CNN, is employed; it can recognize patterns in volumetric data such as videos. The 3-D CNN is trained to classify 100 words from the Boston ASL Lexicon Video Dataset (LVD), which contains more than 3,300 English words signed by 6 different signers. 70% of the dataset is used for training, while the remaining 30% is used for testing the model. The proposed work outperforms existing state-of-the-art models in terms of precision (3.7%), recall (4.3%), and F-measure (3.9%). The computing time of the proposed work (0.19 seconds per frame) suggests that it may be usable in real-time applications.
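To illustrate why a 3-D convolution suits volumetric data like video, the following is a minimal sketch of a single-channel 3-D convolution (cross-correlation, stride 1, no padding) in plain Python. It is not the architecture described in the paper: the function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all assumptions chosen only for illustration. The point is that the kernel slides along the time axis as well as the two spatial axes, so the output can respond to motion between frames, which a per-frame 2-D convolution cannot do.

```python
def conv3d_valid(volume, kernel):
    """Naive single-channel 3-D cross-correlation, stride 1, no padding.

    volume: nested lists indexed [frame][row][col]
    kernel: nested lists indexed [dt][dy][dx]
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):              # slide along time
        plane = []
        for y in range(H - kh + 1):          # slide along height
            row = []
            for x in range(W - kw + 1):      # slide along width
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (assumed data): 4 frames of 5x5 pixels, with one bright pixel
# in row 2 that moves one column to the right in each frame.
clip = [[[1.0 if (r == 2 and c == t) else 0.0 for c in range(5)]
         for r in range(5)]
        for t in range(4)]

# Temporal-difference kernel spanning 2 frames and 1x1 pixels:
# it outputs (next frame - current frame) at every pixel.
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)

# A clip with no motion, for comparison.
static_clip = [clip[0] for _ in range(4)]
static_response = conv3d_valid(static_clip, kernel)
```

Here `response` has 3 frames of 5x5 values and is nonzero only where the bright pixel leaves or arrives, while `static_response` is zero everywhere. A practical 3-D CNN would stack many such learned kernels with multiple channels, nonlinearities, and pooling, typically through a deep learning framework rather than explicit loops.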
