Attention-Aware Face Hallucination via Deep Reinforcement Learning

Face hallucination is a domain-specific super-resolution problem whose goal is to generate high-resolution (HR) faces from low-resolution (LR) input images. Existing methods often learn a single patch-to-patch mapping from LR to HR images and disregard the contextual interdependency between patches. In contrast, we propose a novel Attention-aware Face Hallucination (Attention-FH) framework that uses deep reinforcement learning to sequentially discover attended patches and then enhance the corresponding facial parts by fully exploiting the global interdependency of the image. Specifically, at each time step, a recurrent policy network dynamically specifies a new attended region by incorporating what happened in the past. The state (i.e., the face hallucination result for the whole image) can thus be exploited and updated by the local enhancement network on the selected region. The Attention-FH approach jointly learns the recurrent policy network and the local enhancement network by maximizing a long-term reward that reflects the hallucination performance over the whole image. Therefore, Attention-FH can adaptively personalize an optimal search path for each face image according to its own characteristics. Extensive experiments show that our approach significantly surpasses the state of the art on in-the-wild faces with large pose and illumination variations.
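
The attend-then-enhance loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `run_episode`, `toy_policy`, and `toy_enhance` are hypothetical, as are the non-overlapping patch grid and the negative mean-squared-error reward. The two toy functions are simple stand-ins for the recurrent policy network and the local enhancement network, and training (for example, a policy-gradient update on the episode reward) is omitted.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def sample_index(logits, rng):
    """Sample an index from softmax(logits)."""
    m = max(logits)
    weights = [math.exp(v - m) for v in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def run_episode(lr_up, hr, policy, enhance, patch, steps, seed=0):
    """Sequentially attend to patches and enhance them.

    lr_up   : upsampled LR face (2D list), used as the initial state
    hr      : ground-truth HR face, used only to compute the reward
    policy  : policy(state, history, grid) -> one logit per candidate patch
    enhance : enhance(state, top, left, patch) -> enhanced patch (2D list)
    """
    rng = random.Random(seed)
    h, w = len(lr_up), len(lr_up[0])
    # Candidate top-left corners on a non-overlapping grid (an assumption).
    grid = [(t, l)
            for t in range(0, h - patch + 1, patch)
            for l in range(0, w - patch + 1, patch)]
    state = [row[:] for row in lr_up]  # copy so the input is not modified
    history = []
    for _ in range(steps):
        # The policy sees the current whole-image state and the past choices.
        idx = sample_index(policy(state, history, grid), rng)
        top, left = grid[idx]
        new_patch = enhance(state, top, left, patch)
        # Write the enhanced patch back into the whole-image state.
        for i in range(patch):
            state[top + i][left:left + patch] = new_patch[i]
        history.append(idx)
    # Assumed reward: whole-image quality, measured once per episode.
    reward = -mse(state, hr)
    return state, history, reward


# Toy data: an 8x8 checkerboard as "HR" and a flat gray image as upsampled "LR".
hr = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
lr_up = [[0.5] * 8 for _ in range(8)]


def toy_policy(state, history, grid):
    # Stand-in for the recurrent policy network: discourage revisiting patches.
    return [-5.0 * history.count(i) for i in range(len(grid))]


def toy_enhance(state, top, left, patch):
    # Stand-in for the local enhancement network. It peeks at `hr`, which a
    # real network cannot do; this only gives the loop a measurable effect.
    return [[0.5 * (state[top + i][left + j] + hr[top + i][left + j])
             for j in range(patch)]
            for i in range(patch)]


state, history, reward = run_episode(lr_up, hr, toy_policy, toy_enhance,
                                     patch=4, steps=4)
print(history, reward)
```

In this sketch the reward is computed once at the end of the episode over the whole image, so the policy is credited for the entire sequence of attended regions rather than for any single patch.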
