Facial Expression Analysis Based on High Dimensional Binary Features

High dimensional engineered features have yielded high performance results on a variety of visual recognition tasks and attracted significant recent attention. Here, we examine the problem of expression recognition in static facial images. We first present a technique to build high dimensional, \(\sim 60\mathrm{k}\) features composed of dense Census transformed vectors based on locations defined by facial keypoint predictions. The approach yields state of the art performance at 96.8% accuracy for detecting facial expressions on the well known Cohn-Kanade plus (CK+) evaluation and 93.2% for smile detection on the GENKI dataset. We also find that the subsequent application of a linear discriminative dimensionality reduction technique can make the approach more robust when keypoint locations are less precise. We go on to explore the recognition of expressions captured under more challenging pose and illumination conditions. Specifically, we test this representation on the GENKI smile detection dataset. Our high dimensional feature technique yields state of the art performance on both of these well known evaluations.

[1]  Yichuan Tang,et al.  Deep Learning using Support Vector Machines , 2013, ArXiv.

[2]  Takeo Kanade,et al.  Comprehensive database for facial expression analysis , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[3]  Xiaoou Tang,et al.  Surpassing Human-Level Face Verification Performance on LFW with GaussianFace , 2014, AAAI.

[4]  James L. Crowley,et al.  Smile Detection Using Multi-scale Gaussian Derivatives , 2013 .

[5]  Marian Stewart Bartlett,et al.  Exploring Bag of Words Architectures in the Facial Expression Domain , 2012, ECCV Workshops.

[6]  Shaogang Gong,et al.  Facial expression recognition based on Local Binary Patterns: A comprehensive study , 2009, Image Vis. Comput..

[7]  Jian Sun,et al.  Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[9]  Xiaogang Wang,et al.  Deep Convolutional Network Cascade for Facial Point Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[11]  Razvan Pascanu,et al.  Combining modality specific deep neural networks for emotion recognition in video , 2013, ICMI '13.

[12]  Matti Pietikäinen,et al.  A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[13]  András Lörincz,et al.  High quality facial expression recognition in video streams using shape related information only , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[14]  Gwen Littlewort,et al.  Multiple kernel learning for emotion recognition in the wild , 2013, ICMI '13.

[15]  Simon Baker,et al.  Active Appearance Models Revisited , 2004, International Journal of Computer Vision.

[16]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[17]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[18]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Vitomir Struc,et al.  Gabor-Based Kernel Partial-Least-Squares Discrimination Features for Face Recognition , 2009, Informatica.

[20]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[21]  Jean Meunier,et al.  Emotion recognition using dynamic grid-based HoG features , 2011, Face and Gesture 2011.

[22]  Takeo Kanade,et al.  The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[23]  Vitomir Struc,et al.  Photometric Normalization Techniques for Illumination Invariance , 2011 .

[24]  Gwen Littlewort,et al.  Toward Practical Smile Detection , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Shiguang Shan,et al.  Enhancing Expression Recognition in the Wild with Unlabeled Reference Data , 2012, ACCV.

[26]  James L. Crowley,et al.  Head Pose Estimation Using Multi-scale Gaussian Derivatives , 2013, SCIA.

[27]  Matti Pietikäinen,et al.  Face Description with Local Binary Patterns: Application to Face Recognition , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[29]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[30]  Ramin Zabih,et al.  Non-parametric Local Transforms for Computing Visual Correspondence , 1994, ECCV.

[31]  Pascal Vincent,et al.  Disentangling Factors of Variation for Facial Expression Recognition , 2012, ECCV.

[32]  Peter Eisert,et al.  Analyzing Facial Expressions for Virtual Conferencing , 1998, IEEE Computer Graphics and Applications.

[33]  Masashi Sugiyama,et al.  Dimensionality Reduction of Multimodal Labeled Data by Local Fisher Discriminant Analysis , 2007, J. Mach. Learn. Res..

[34]  Caifeng Shan,et al.  Smile detection by boosting pixel differences , 2012, IEEE Transactions on Image Processing.