A Deep Cascade Network for Unaligned Face Attribute Classification

Humans focus attention on different face regions when recognizing face attributes. Most existing face attribute classification methods use the whole image as input. Moreover, some of these methods rely on fiducial landmarks to provide defined face parts. In this paper, we propose a cascade network that simultaneously learns to localize face regions specific to attributes and performs attribute classification without alignment. First, a weakly-supervised face region localization network is designed to automatically detect regions (or parts) specific to attributes. Then multiple part-based networks and a whole-image-based network are separately constructed and combined together by the region switch layer and attribute relation layer for final attribute classification. A multi-net learning method and hint-based model compression is further proposed to get an effective localization model and a compact classification model, respectively. Our approach achieves significantly better performance than state-of-the-art methods on unaligned CelebA dataset, reducing the classification error by 30.9%.

[1]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.

[2]  Ivan Laptev,et al.  Is object localization for free? - Weakly-supervised learning with convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Chen Huang,et al.  Learning Deep Representation for Imbalanced Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  P ? ? ? ? ? ? ? % ? ? ? ? , 1991 .

[5]  Tsuyoshi Murata,et al.  {m , 1934, ACML.

[6]  M. Posner,et al.  The attention system of the human brain. , 1990, Annual review of neuroscience.

[7]  Yaser S. Abu-Mostafa,et al.  A Method for Learning From Hints , 1992, NIPS.

[8]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[9]  Joan Bruna,et al.  Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.

[10]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[11]  Adriana Kovashka,et al.  Learning Attributes from Human Gaze , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[12]  Yu Cheng,et al.  Fully-Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[14]  Larry S. Davis,et al.  Image ranking and retrieval based on multi-attribute queries , 2011, CVPR 2011.

[15]  Rich Caruana,et al.  Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[16]  Bernard Ghanem,et al.  On the relationship between visual attributes and convolutional networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[18]  Yann LeCun,et al.  Optimal Brain Damage , 1989, NIPS.

[20]  Peter N. Belhumeur,et al.  POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Shiguang Shan,et al.  Two Birds, One Stone: Jointly Learning Binary Code for Large-Scale Face Image Retrieval and Attributes Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Cordelia Schmid,et al.  Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Shree K. Nayar,et al.  Attribute and simile classifiers for face verification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[25]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[26]  Rama Chellappa,et al.  FaceNet2ExpNet: Regularizing a Deep Face Recognition Net for Expression Recognition , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[27]  Rich Caruana,et al.  Model compression , 2006, KDD '06.

[28]  Jing Wang,et al.  Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Eirikur Agustsson,et al.  From Face Images and Attributes to Attributes , 2016, ACCV.

[30]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[31]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[32]  K. S. Venkatesh,et al.  Deep Attributes for One-Shot Face Recognition , 2016, ECCV Workshops.

[33]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Honglak Lee,et al.  Attribute2Image: Conditional Image Generation from Visual Attributes , 2015, ECCV.

[35]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[36]  R. Desimone,et al.  Selective attention gates visual processing in the extrastriate cortex. , 1985, Science.

[37]  Shree K. Nayar,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence Describable Visual Attributes for Face Verification and Image Search , 2022 .

[38]  Rama Chellappa,et al.  Attributes for Improved Attributes: A Multi-Task Network for Attribute Classification , 2016, ArXiv.

[39]  Terrance E. Boult,et al.  MOON: A Mixed Objective Optimization Network for the Recognition of Facial Attributes , 2016, ECCV.

[40]  Mohamed R. Amer,et al.  Facial Attributes Classification Using Multi-task Representation Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).