Fast multi-view face alignment via multi-task auto-encoders

Face alignment is an important problem in computer vision, and it remains open because of variations in facial attributes (e.g., head pose, facial expression, and illumination). Many studies have shown that face alignment and facial attribute analysis are often correlated. This paper develops a two-stage multi-task auto-encoder framework for fast face alignment that incorporates head pose information to handle large view variations. In the first stage, multi-task auto-encoders roughly locate the facial landmarks; in the second stage, they refine these locations, in both cases using the related pose information. The shape constraint is naturally encoded into the two-stage framework to preserve facial structure, and a coarse-to-fine strategy refines the landmark results under this constraint. Furthermore, the computational cost of the method is much lower than that of its deep learning competitors. Experimental results on various challenging datasets show the effectiveness of the proposed method.
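The two-stage, multi-task idea can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the class `MultiTaskAE`, the layer sizes, the loss weight, and the choice to feed the coarse landmarks back in alongside the image are all hypothetical, the weights are random rather than trained, and both the auto-encoder reconstruction pre-training and the shape constraint are omitted. It only shows a shared encoder with a landmark head and a pose head, applied coarse-to-fine.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Small random weights and zero biases (untrained, for illustration only).
    return rng.normal(0.0, 0.01, (n_in, n_out)), np.zeros(n_out)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MultiTaskAE:
    """Hypothetical shared encoder with two task heads:
    landmark regression and head-pose classification."""

    def __init__(self, n_in, n_hidden, n_landmarks, n_poses):
        self.W_enc, self.b_enc = init_layer(n_in, n_hidden)
        self.W_lm, self.b_lm = init_layer(n_hidden, 2 * n_landmarks)
        self.W_pose, self.b_pose = init_layer(n_hidden, n_poses)

    def forward(self, x):
        h = sigmoid(x @ self.W_enc + self.b_enc)       # shared representation
        landmarks = h @ self.W_lm + self.b_lm          # (x, y) per landmark
        logits = h @ self.W_pose + self.b_pose
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        pose = e / e.sum(axis=-1, keepdims=True)       # softmax over pose bins
        return landmarks, pose

def multitask_loss(lm_pred, lm_true, pose_prob, pose_true, weight=0.5):
    # Landmark squared error plus weighted pose cross-entropy (weight is assumed).
    lm_loss = np.mean((lm_pred - lm_true) ** 2)
    picked = pose_prob[np.arange(len(pose_true)), pose_true]
    pose_loss = -np.mean(np.log(picked + 1e-12))
    return lm_loss + weight * pose_loss

n_landmarks, n_poses, n_pixels = 5, 3, 32 * 32
images = rng.random((4, n_pixels))  # batch of flattened 32x32 face crops

# Stage 1: rough landmark locations and pose from the image alone.
stage1 = MultiTaskAE(n_pixels, 64, n_landmarks, n_poses)
coarse, pose1 = stage1.forward(images)

# Stage 2: predict an offset from the image plus the coarse estimate.
stage2 = MultiTaskAE(n_pixels + 2 * n_landmarks, 64, n_landmarks, n_poses)
offset, pose2 = stage2.forward(np.concatenate([images, coarse], axis=1))
refined = coarse + offset
```

In this sketch the second stage regresses a correction to the first-stage output rather than absolute coordinates, which is one common way to realize coarse-to-fine refinement; training would minimize `multitask_loss` at each stage.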
