Towards Accessible Sign Language Assessment and Learning

Recently, a phonology-based sign language assessment approach was proposed that uses sign language production captured in 3D space with a Kinect sensor. To scale the sign language assessment system to realistic applications, there is a need to reduce the dependency on Kinect, which is not accessible to the wider community, and to develop solutions that can potentially work with web cameras. This paper takes a step in that direction by investigating sign language recognition and sign language assessment in 2D space, either by dropping the depth coordinate of the Kinect data or by using methods that estimate skeletons from videos. Experimental studies on the Swiss German Sign Language corpus SMILE show that, while the loss of depth information leads to a considerable drop in sign language recognition performance, a high level of sign language assessment performance can still be obtained.
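
As a rough illustration of the first 2D option mentioned above (dropping the depth coordinate), the following minimal sketch discards the third coordinate of a skeleton sequence. It assumes, purely for illustration, that the sequence is stored as a NumPy array of shape (frames, joints, 3) with coordinates ordered (x, y, depth); the function name `drop_depth` and the toy data are hypothetical and not taken from the paper or from any Kinect SDK.

```python
import numpy as np


def drop_depth(skeleton_3d: np.ndarray) -> np.ndarray:
    """Return a 2D copy of a skeleton sequence by discarding the depth axis.

    Assumes shape (frames, joints, 3) with coordinates ordered (x, y, depth).
    This layout is an assumption made for this sketch only.
    """
    if skeleton_3d.ndim != 3 or skeleton_3d.shape[-1] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")
    # Keep only the first two coordinates (x, y) of every joint in every frame.
    return skeleton_3d[..., :2].copy()


# Hypothetical toy sequence: 2 frames, 3 joints, values 0..17.
seq_3d = np.arange(18, dtype=float).reshape(2, 3, 3)
seq_2d = drop_depth(seq_3d)
```

The result has shape (frames, joints, 2) and could, under the same assumptions, be passed to a feature extraction stage in place of the 3D input.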