Structural Consistency and Controllability for Diverse Colorization

Colorizing a given gray-level image is an important task in the media and advertising industry. Due to the ambiguity inherent in colorization (many shades are often plausible), recent approaches have started to model diversity explicitly. However, one of the most obvious artifacts, structural inconsistency, is rarely considered by existing methods, which predict chrominance independently for every pixel. To address this issue, we develop a conditional random field-based variational auto-encoder formulation that achieves diversity while taking structural consistency into account. Moreover, we introduce a controllability mechanism that can incorporate external constraints from diverse sources, including a user interface. We demonstrate that our method obtains more diverse and more globally consistent colorizations than existing baselines on the LFW, LSUN-Church and ILSVRC-2015 datasets.
