Sparse fully convolutional network for face labeling

Abstract: This paper proposes a sparse fully convolutional network (FCN) for face labeling. FCNs have demonstrated strong capabilities in learning representations for semantic segmentation, but they often suffer from heavy redundancy in parameters and connections. To ease this problem, group Lasso regularization and intra-group Lasso regularization are used to sparsify the convolutional layers of the FCN. In this framework, the parameters that correspond to the same output channel form one group, and all parameters in a group are zeroed out together during training. For the parameters in groups that are not zeroed out, intra-group Lasso provides further regularization. The strength of this regularization framework lies in its ability to offer better feature selection and higher sparsity. Moreover, a fully connected conditional random field (CRF) model is used to refine the output of the sparse FCN. Evaluated on the LFW face dataset, the proposed approach achieves state-of-the-art performance. Compared with a non-regularized FCN, the sparse FCN reduces the number of parameters by 91.55% while improving segmentation performance, with an 11% relative reduction in error.
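The grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy weights, the regularization strengths, and the threshold are all hypothetical. The intra-group term is assumed here to be a plain L1 penalty inside each group, and the proximal shrinkage step is one common way to obtain exact zeros, not necessarily the training procedure used in the paper.

```python
import math

def group_lasso(groups):
    """Sum of L2 norms, one per group (a group = all weights of one output channel)."""
    return sum(math.sqrt(sum(w * w for w in g)) for g in groups)

def intra_group_l1(groups):
    """Assumed intra-group term: L1 norm over the weights inside every group."""
    return sum(abs(w) for g in groups for w in g)

def sparse_penalty(groups, lam_group=1.0, lam_intra=1.0):
    """Combined regularizer added to the task loss (strengths are hypothetical)."""
    return lam_group * group_lasso(groups) + lam_intra * intra_group_l1(groups)

def group_soft_threshold(group, t):
    """Proximal step for the group term: shrink the group's norm by t,
    zeroing the whole group when its norm is at most t."""
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= t:
        return [0.0] * len(group)
    scale = 1.0 - t / norm
    return [scale * w for w in group]

# Toy layer: 3 output channels, each with 4 flattened weights (made-up values).
weights = [
    [0.05, -0.02, 0.01, 0.03],  # weak channel, expected to be removed
    [3.0, 0.0, -4.0, 0.0],      # strong channel, survives shrinkage
    [0.6, -0.8, 0.0, 0.0],      # medium channel, survives shrinkage
]

penalty = sparse_penalty(weights, lam_group=1.0, lam_intra=0.5)
pruned = [group_soft_threshold(g, 0.5) for g in weights]
zeroed = [all(w == 0.0 for w in g) for g in pruned]
print("penalty:", penalty)
print("channels zeroed:", zeroed)
```

In this sketch, the group term removes whole output channels at once, which is what makes the resulting sparsity structured, while the assumed L1 term encourages additional zeros among the weights of the channels that remain.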
