In-the-wild facial expression recognition in extreme poses

Facial expression recognition is an active problem in computer vision research. In recent years, the work has moved from lab environments to in-the-wild conditions, which is challenging, especially under extreme head poses. Current expression detection systems try to avoid pose effects in order to remain generally applicable. In this work, we take the opposite approach: we account for head pose explicitly and detect expressions within specific head poses. Our method has two parts: first, detect the head pose and assign it to one of several pre-defined pose classes; second, recognize the facial expression within each pose class. Our experiments show that recognition with pose class grouping gives much better results than direct recognition that does not consider pose. To represent expressions, we combine hand-crafted features (SIFT, LBP, and geometric features) with deep learning features. The hand-crafted features are fed into the deep learning framework alongside the high-level deep learning features. For comparison, we also implement SVM and random forest as prediction models. To train and test our methodology, we labeled the face dataset with the six basic expressions.
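The two-part design described above (assign a pose class, then recognize the expression within that class) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the yaw boundaries in `YAW_EDGES`, the concatenation-based fusion in `concat_features`, and the `NearestCentroid` stand-in classifier are all hypothetical choices made for the example. The abstract does not specify the pose classes or how the features are fused, and the actual prediction models are a deep network, SVM, and random forest.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical yaw boundaries in degrees (assumption: the source does not
# state its pose classes). Four edges give five pose classes, indexed 0..4.
YAW_EDGES = [-45.0, -15.0, 15.0, 45.0]


def pose_class(yaw_deg):
    """Map a yaw angle to the index of the pose bin it falls into."""
    return bisect_right(YAW_EDGES, yaw_deg)


def concat_features(handcrafted, deep):
    """Fuse hand-crafted and deep features by plain concatenation (assumed)."""
    return list(handcrafted) + list(deep)


class NearestCentroid:
    """Stand-in classifier; an SVM, random forest, or network would go here."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] += 1
        # One mean feature vector per expression label.
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, x):
        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))


class PoseGroupedRecognizer:
    """Trains one expression classifier per pose class and routes by pose."""

    def fit(self, yaws, X, y):
        groups = defaultdict(lambda: ([], []))
        for yaw, x, label in zip(yaws, X, y):
            gx, gy = groups[pose_class(yaw)]
            gx.append(x)
            gy.append(label)
        self.models = {
            p: NearestCentroid().fit(gx, gy) for p, (gx, gy) in groups.items()
        }
        return self

    def predict_one(self, yaw, x):
        # Raises KeyError if no training sample fell into this pose class.
        return self.models[pose_class(yaw)].predict_one(x)


# Toy data: the same feature vector means different things in different poses.
yaws = [0.0, 5.0, 60.0, 70.0]
X = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
y = ["happy", "sad", "sad", "happy"]
recognizer = PoseGroupedRecognizer().fit(yaws, X, y)
```

On this toy data, a single classifier trained on all four samples could not separate the labels, because identical feature vectors carry opposite labels across poses; routing by pose class first resolves the ambiguity, which is the intuition behind pose grouping.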
