A Deep Sum-Product Architecture for Robust Facial Attributes Analysis

Recent work has shown that facial attributes are useful in applications such as face recognition and retrieval. However, estimating attributes in images with large variations remains challenging, and this paper addresses that challenge. Unlike existing methods that assume attributes are independent during estimation, our approach captures the interdependencies of local regions for each attribute, as well as the high-order correlations between different attributes, which makes it more robust to occlusions and misdetection of face regions. First, we model region interdependencies with a discriminative decision tree, where each node consists of a detector and a classifier trained on a local region. The detector locates the region, while the classifier determines the presence or absence of an attribute. Second, we model correlations among attributes and attribute predictors by organizing all of the decision trees into a large sum-product network (SPN), which is learned with the EM algorithm and yields the most probable explanation (MPE) of the facial attributes in terms of region localization and classification. Experimental results on a large data set of 22,400 images demonstrate the effectiveness of the proposed approach.
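
To make the SPN and MPE terminology concrete, the following is a minimal sketch of a toy sum-product network over two binary attributes. It is not the architecture described above: the attribute names (smiling, mouth_open), the weights, and the classes Leaf, Product, Sum and the helper mpe are all hypothetical, and the decision trees of detectors and classifiers are omitted. The sketch only illustrates how a sum node mixes product nodes, how an unobserved (for example, occluded) attribute is marginalized, and how the common max-product pass picks an explanation. That pass is exact for this small mixture once it is known which component was chosen, but is in general only an approximation to MPE.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Evidence maps a variable name to 0/1; a missing key or None means unobserved.
Evidence = Dict[str, Optional[int]]


@dataclass
class Leaf:
    var: str
    p_one: float  # assumed Bernoulli parameter P(var = 1)

    def value(self, ev: Evidence, use_max: bool = False) -> float:
        x = ev.get(self.var)
        if x is None:
            # Unobserved: sum over both states (= 1) or take the larger one.
            return max(self.p_one, 1.0 - self.p_one) if use_max else 1.0
        return self.p_one if x == 1 else 1.0 - self.p_one

    def decode(self, ev: Evidence, out: Dict[str, int]) -> None:
        x = ev.get(self.var)
        out[self.var] = x if x is not None else int(self.p_one >= 0.5)


@dataclass
class Product:
    children: list

    def value(self, ev: Evidence, use_max: bool = False) -> float:
        result = 1.0
        for child in self.children:
            result *= child.value(ev, use_max)
        return result

    def decode(self, ev: Evidence, out: Dict[str, int]) -> None:
        for child in self.children:
            child.decode(ev, out)


@dataclass
class Sum:
    weights: List[float]
    children: list

    def value(self, ev: Evidence, use_max: bool = False) -> float:
        terms = [w * c.value(ev, use_max)
                 for w, c in zip(self.weights, self.children)]
        return max(terms) if use_max else sum(terms)

    def decode(self, ev: Evidence, out: Dict[str, int]) -> None:
        # Max-product backtracking: follow only the best-scoring child.
        terms = [w * c.value(ev, True)
                 for w, c in zip(self.weights, self.children)]
        best = max(range(len(terms)), key=terms.__getitem__)
        self.children[best].decode(ev, out)


def mpe(root, ev: Evidence) -> Dict[str, int]:
    """Return one assignment chosen by the max-product pass."""
    out: Dict[str, int] = {}
    root.decode(ev, out)
    return out


# Toy network: a mixture of two fully factorized components.
root = Sum(
    weights=[0.6, 0.4],
    children=[
        Product([Leaf("smiling", 0.9), Leaf("mouth_open", 0.8)]),
        Product([Leaf("smiling", 0.1), Leaf("mouth_open", 0.2)]),
    ],
)

print(round(root.value({"smiling": 1}), 4))  # marginal P(smiling = 1)
print(mpe(root, {"mouth_open": 1}))  # {'smiling': 1, 'mouth_open': 1}
print(mpe(root, {"mouth_open": 0}))  # {'smiling': 0, 'mouth_open': 0}
```

In the two mpe calls, smiling is left unobserved, as it might be if the mouth region were only partly detected, and its value is inferred from its correlation with the observed attribute.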
