Deep Appearance Models: A Deep Boltzmann Machine Approach for Face Modeling

The “interpretation through synthesis” approach to analyzing face images, particularly the Active Appearance Models (AAMs) method, has become one of the most successful face modeling approaches over the last two decades. AAMs represent face images through synthesis using a controllable, parameterized Principal Component Analysis (PCA) model. However, the accuracy and robustness of the faces synthesized by AAMs depend heavily on the training sets and, inherently, on the generalizability of the PCA subspaces. This paper presents a novel Deep Appearance Models (DAMs) approach, an efficient replacement for AAMs, to accurately capture both the shape and texture of face images under large variations. In this approach, three crucial components represented in hierarchical layers are modeled using Deep Boltzmann Machines (DBMs) to robustly capture the variations of facial shapes and appearances. DAMs are therefore superior to AAMs in inferring a representation for new face images under various challenging conditions. The proposed approach is evaluated in several applications to demonstrate its robustness and capabilities, namely facial super-resolution reconstruction, facial off-angle reconstruction (face frontalization), facial occlusion removal, and age estimation, using challenging face databases: Labeled Face Parts in the Wild, Helen, and FG-NET. Compared with AAMs and other deep learning based approaches, the proposed DAMs achieve competitive results in these applications, demonstrating their advantages in handling occlusions, facial representation, and reconstruction.
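As a rough illustration of the linear PCA model that AAMs build on, the sketch below synthesizes a vector as the training mean plus a weighted sum of principal directions. It is not the paper's method: toy random vectors stand in for shape or texture data, and the function names (`fit_pca`, `encode`, `synthesize`) are hypothetical. It also hints at why such a model depends on its training set: a sample outside the learned subspace cannot be reconstructed exactly.

```python
import numpy as np

def fit_pca(X, k):
    """Return the mean and the top-k principal directions of X (n_samples, n_features)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def encode(x, mean, P):
    """Project a sample onto the PCA subspace to obtain its parameters b."""
    return P @ (x - mean)

def synthesize(b, mean, P):
    """Synthesize a sample from parameters b: x = mean + P^T b."""
    return mean + P.T @ b

rng = np.random.default_rng(0)

# Toy "training set" (an assumption): 50 samples in a 3-D subspace of a 20-D space.
basis = rng.normal(size=(3, 20))
X = rng.normal(size=(50, 3)) @ basis

mean, P = fit_pca(X, k=3)

# A training sample is reconstructed (almost) exactly from its 3 parameters.
x_hat = synthesize(encode(X[0], mean, P), mean, P)
train_error = np.linalg.norm(x_hat - X[0])

# A sample outside the training subspace is only approximated.
x_new = rng.normal(size=20)
x_new_hat = synthesize(encode(x_new, mean, P), mean, P)
new_error = np.linalg.norm(x_new_hat - x_new)

print(f"train reconstruction error: {train_error:.2e}")
print(f"unseen reconstruction error: {new_error:.2e}")
```

The gap between the two errors is the limitation the abstract attributes to PCA subspaces; the DBM-based model is proposed as a nonlinear alternative.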
