Facial Expression Recognition with Skip-Connection to Leverage Low-Level Features

Deep convolutional neural networks (CNNs) are now well established in computer vision and machine learning and are used in a wide range of applications. In this work, we train a CNN for the task of facial expression recognition (FER). Our network has convolutional layers that are linked, through fully connected (FC) layers and skip-connections, to the classification layer. The motivation behind this design is that the lower layers of a CNN respond to lower-level features, and facial expressions are encoded mainly in low-to-mid-level features. Hence, to leverage the responses from the lower layers, all convolutional layers are integrated via FC layers. Moreover, a network with shared parameters is used to extract landmark motion trajectory features. These visual and landmark features are fused to improve performance. Our method is evaluated on the CK+ and Oulu-CASIA facial expression datasets.
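
The skip-connection idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy 1-D input instead of face images, random untrained weights, global average pooling before each FC branch, and seven output classes. The layer sizes and the names `conv1d_relu_pool`, `build`, and `forward` are hypothetical. The point it shows is only that every convolutional stage feeds the classifier through its own FC layer, so low-level responses reach the final decision directly.

```python
import math
import random


def conv1d_relu_pool(x, w):
    """Toy conv stage: valid 1-D convolution, ReLU, then width-2 max-pooling.
    x has shape [c_in][length]; w has shape [c_out][c_in][k]."""
    k = len(w[0][0])
    out_len = len(x[0]) - k + 1
    out = []
    for filt in w:
        row = [max(0.0, sum(filt[c][j] * x[c][t + j]
                            for c in range(len(x)) for j in range(k)))
               for t in range(out_len)]
        out.append([max(row[i], row[i + 1]) for i in range(0, out_len - 1, 2)])
    return out


def global_avg_pool(fmap):
    """Collapse each channel to its mean, giving one value per channel."""
    return [sum(ch) / len(ch) for ch in fmap]


def linear(x, w, b):
    """Fully connected layer: y = W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def build(rng, channels=(1, 4, 8, 8), k=3, skip_dim=5, n_classes=7):
    """Random (untrained) parameters; all sizes are illustrative assumptions."""
    convs = [[[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(c_in)]
              for _ in range(c_out)]
             for c_in, c_out in zip(channels, channels[1:])]
    # One FC layer per conv stage: this is the skip branch to the classifier.
    skips = [(rand_matrix(rng, skip_dim, c_out), [0.0] * skip_dim)
             for c_out in channels[1:]]
    clf = (rand_matrix(rng, n_classes, skip_dim * len(skips)), [0.0] * n_classes)
    return convs, skips, clf


def forward(x, params):
    convs, skips, clf = params
    fused = []
    for w, (sw, sb) in zip(convs, skips):
        x = conv1d_relu_pool(x, w)                    # main path
        fused += linear(global_avg_pool(x), sw, sb)   # skip branch via FC
    # The classifier sees the concatenated FC outputs of all conv stages.
    return softmax(linear(fused, *clf))


rng = random.Random(0)
params = build(rng)
signal = [[rng.random() for _ in range(32)]]  # one channel, length 32
probs = forward(signal, params)
print(len(probs), round(sum(probs), 6))
```

Concatenating the per-stage FC outputs is one simple way to integrate the layers; the abstract does not state how the FC outputs are combined, so this choice is an assumption.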
