A Novel GAN-Based Network for Unmasking of Masked Face

Recent deep learning based image editing methods have achieved promising results for removing object in an image but fail to generate plausible results for removing large objects of complex nature, especially in facial images. The objective of this work is to remove mask objects in facial images. This problem is challenging because (1) most of the time facial masks cover quite a large region of face that even extends beyond the actual face boundary below chin, and (2) facial image pairs with and without mask object do not exist for training. We break the problem into two stages: mask object detection and image completion of the removed mask region. The first stage of our model automatically produces binary segmentation for the mask region. Then, the second stage removes the mask and synthesizes the affected region with fine details while retaining the global coherency of face structure. For this, we have employed a GAN-based network using two discriminators where one discriminator helps learn the global structure of the face and then another discriminator comes in to focus learning on the deep missing region. To train our model in a supervised manner, we create a paired synthetic dataset using publicly available CelebA dataset and evaluated on real world images collected from the Internet. Our model outperforms others representative state-of-the-art approaches both qualitatively and quantitatively.

[1]  Mehran Ebrahimi,et al.  EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning , 2019, ArXiv.

[2]  Qin Huang,et al.  SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting , 2018, BMVC.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Xinlei Chen,et al.  Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.

[5]  Kamran Javed,et al.  Image unmosaicing without location information using stacked GAN , 2019, IET Comput. Vis..

[6]  Alan C. Bovik,et al.  Blind/Referenceless Image Spatial Quality Evaluator , 2011, 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR).

[7]  Alexei A. Efros,et al.  What makes Paris look like Paris? , 2015, Commun. ACM.

[8]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[9]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Bernt Schiele,et al.  Adversarial Scene Editing: Automatic Object Removal from Weak Supervision , 2018, NeurIPS.

[11]  Bolei Zhou,et al.  Places: A 10 Million Image Database for Scene Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Sang Chul Ahn,et al.  Glasses removal from facial image using recursive error compensation , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[15]  William A. P. Smith,et al.  A shadow constrained conditional generative adversarial net for SRTM data restoration , 2020, Remote Sensing of Environment.

[16]  Juneho Yi,et al.  Interactive Removal of Microphone Object in Facial Images , 2019, Electronics.

[17]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[18]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[19]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[20]  Alan C. Bovik,et al.  No-Reference Image Quality Assessment in the Spatial Domain , 2012, IEEE Transactions on Image Processing.

[21]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[23]  Jing Wang,et al.  Robust object removal with an exemplar-based image inpainting approach , 2014, Neurocomputing.

[24]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Hao Li,et al.  High-Resolution Image Inpainting Using Multi-scale Neural Patch Synthesis , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[27]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[28]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[29]  Mahadev Satyanarayanan,et al.  OpenFace: A general-purpose face recognition library with mobile applications , 2016 .

[30]  Ming-Hsuan Yang,et al.  Generative Face Completion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[32]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[34]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[35]  Patrick Pérez,et al.  Region filling and object removal by exemplar-based image inpainting , 2004, IEEE Transactions on Image Processing.

[36]  Seho Bae,et al.  UMGAN: Generative adversarial network for image unmosaicing using perceptual loss , 2019, 2019 16th International Conference on Machine Vision Applications (MVA).

[37]  Eli Shechtman,et al.  Image melding , 2012, ACM Trans. Graph..

[38]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2007, SIGGRAPH 2007.