Toward Realistic Image Compositing With Adversarial Learning

Compositing a realistic image is a challenging task and usually requires considerable human supervision using professional image editing software. In this work we propose a generative adversarial network (GAN) architecture for automatic image compositing. The proposed model consists of four sub-networks: a transformation network that improves the geometric and color consistency of the composite image, a refinement network that polishes the boundary of the composite image, and a pair of discriminator network and a segmentation network for adversarial learning. Experimental results on both synthesized images and real images show that our model, Geometrically and Color Consistent GANs (GCC-GANs), can automatically generate realistic composite images compared to several state-of-the-art methods, and does not require any manual effort.

[1]  Vladlen Koltun,et al.  Semi-Parametric Image Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Larry S. Davis,et al.  Learning Rich Features for Image Manipulation Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Patrick Pérez,et al.  Poisson image editing , 2003, ACM Trans. Graph..

[4]  Alexei A. Efros,et al.  Using Color Compatibility for Assessing Image Realism , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[5]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Vladlen Koltun,et al.  Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[7]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[8]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9]  Yannick Hold-Geoffroy,et al.  A Perceptual Measure for Deep Single Image Camera Calibration , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[11]  Ming-Hsuan Yang,et al.  Deep Image Harmonization , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Julie Dorsey,et al.  Understanding and improving the realism of image composites , 2012, ACM Trans. Graph..

[13]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[14]  Alexei A. Efros,et al.  Photo clip art , 2007, ACM Trans. Graph..

[15]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[16]  Yannick Hold-Geoffroy,et al.  Deep Outdoor Illumination Estimation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Ersin Yumer,et al.  ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[19]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[21]  Edward H. Adelson,et al.  A multiresolution spline with application to image mosaics , 1983, TOGS.

[22]  Paul E. Debevec,et al.  Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography , 1998, SIGGRAPH '08.

[23]  Wojciech Matusik,et al.  CG2Real: Improving the Realism of Computer Generated Images Using a Large Collection of Photographs , 2011, IEEE Transactions on Visualization and Computer Graphics.

[24]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Alexei A. Efros,et al.  Learning a Discriminative Model for the Perception of Realism in Composite Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[27]  David Salesin,et al.  Interactive digital photomontage , 2004, ACM Trans. Graph..

[28]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Benjamin Cohen,et al.  Where and Who? Automatic Semantic-Aware Person Composition , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[30]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[32]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[33]  David A. Forsyth,et al.  Rendering synthetic objects into legacy photographs , 2011, ACM Trans. Graph..

[34]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[36]  Kalyan Sunkavalli,et al.  Automatic Scene Inference for 3D Object Compositing , 2014, ACM Trans. Graph..