Recognition of Affective and Grammatical Facial Expressions: A Study for Brazilian Sign Language

Individuals with hearing impairment typically face difficulties in communicating with hearing people and in acquiring reading and writing skills. Sign Language (SL), widely adopted by deaf communities, has a grammatical structure in which facial expressions assume grammatical and affective functions: they differentiate lexical items, participate in syntactic construction, and contribute to intensification processes. Automatic Sign Language Recognition (ASLR) technology supports communication between deaf and hearing individuals by translating sign language gestures into written or spoken sentences of a target language. Recognizing facial expressions can improve ASLR accuracy; in some cases, the absence of a facial expression leads to an incorrect translation, making facial expression recognition necessary for understanding sign language. This paper presents an approach to facial expression recognition for sign language, using Brazilian Sign Language (Libras) as a case study. In our approach, we code Libras facial expressions using the Facial Action Coding System (FACS), and we evaluate two convolutional architectures, a standard CNN and a hybrid CNN+LSTM, for Action Unit (AU) recognition. The models are evaluated on a challenging real-world video database of facial expressions in Libras. They achieved an average F1-score of 0.87, indicating the system's potential to recognize Libras facial expressions.
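
To make the hybrid architecture concrete: a CNN+LSTM AU detector applies a per-frame CNN feature extractor, aggregates the resulting feature sequence with an LSTM, and ends in one sigmoid output per AU, since AU recognition is a multi-label problem (several AUs can be active in the same frame). The following is a minimal Keras sketch of that pattern; the clip length, input resolution, number of AUs, and layer widths are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal sketch of a hybrid CNN+LSTM for multi-label Action Unit (AU)
# recognition. Assumptions (not from the paper): 16-frame clips of
# 64x64 grayscale face crops, and 12 target AUs.
from tensorflow.keras import layers, models

NUM_FRAMES, HEIGHT, WIDTH, CHANNELS = 16, 64, 64, 1
NUM_AUS = 12  # number of FACS Action Units to detect (assumed)

# Per-frame CNN feature extractor, applied to every frame of the clip.
frame_cnn = models.Sequential([
    layers.Conv2D(32, 3, activation="relu",
                  input_shape=(HEIGHT, WIDTH, CHANNELS)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
])

model = models.Sequential([
    # Apply the CNN to each frame independently, producing a sequence
    # of 128-dimensional per-frame feature vectors.
    layers.TimeDistributed(
        frame_cnn, input_shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS)),
    # The LSTM aggregates the temporal dynamics of the expression.
    layers.LSTM(64),
    # One sigmoid per AU: each AU is an independent binary label.
    layers.Dense(NUM_AUS, activation="sigmoid"),
])

# Binary cross-entropy treats each AU as a separate detection problem,
# which matches the multi-label formulation of AU recognition.
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["binary_accuracy"])
model.summary()
```

The standard-CNN baseline would simply drop the `TimeDistributed`/`LSTM` stages and classify single frames with `frame_cnn` plus the same multi-sigmoid head.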
