Learning Diverse Image Colorization

Colorization is an ambiguous problem: a single grey-level image admits multiple viable colorizations. However, previous methods produce only the single most probable colorization. Our goal is to model the diversity intrinsic to the problem of colorization and to produce multiple colorizations that display long-range spatial coordination. We learn a low-dimensional embedding of color fields using a variational autoencoder (VAE). We construct loss terms for the VAE decoder that avoid blurry outputs and take into account the uneven distribution of pixel colors. Finally, we build a conditional model for the multi-modal distribution of color field embeddings given the grey-level image. Samples from this conditional model result in diverse colorizations. We demonstrate that our method obtains better diverse colorizations than a standard conditional variational autoencoder (CVAE) model, as well as a recently proposed conditional generative adversarial network (cGAN).
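
To illustrate how sampling from a multi-modal conditional model can yield several distinct embeddings for one input, here is a minimal NumPy sketch. It assumes the conditional distribution is a Gaussian mixture with a shared isotropic standard deviation; the abstract does not state the form of the model, so this choice is an assumption. The function name `sample_embeddings`, the 2-D embedding size, and the hard-coded weights and means are all hypothetical. In a real system, a network would predict them from the grey-level image, and the VAE decoder would map each sampled embedding to a color field.

```python
import numpy as np

def sample_embeddings(weights, means, sigma, n_samples, rng):
    """Draw embeddings from a Gaussian mixture with shared isotropic std `sigma`.

    Assumption: the mixture stands in for the conditional model
    P(embedding | grey-level image); its parameters are given, not learned here.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights form a distribution
    # Choose one mixture component (mode) per sample.
    comps = rng.choice(len(weights), size=n_samples, p=weights)
    # Perturb the chosen mode's mean with isotropic Gaussian noise.
    noise = rng.standard_normal((n_samples, means.shape[1]))
    return means[comps] + sigma * noise, comps

rng = np.random.default_rng(0)
weights = [0.5, 0.3, 0.2]                                # hypothetical mixture weights
means = np.array([[0.0, 0.0], [3.0, 3.0], [-3.0, 3.0]])  # hypothetical mode centers
z, comps = sample_embeddings(weights, means, sigma=0.1, n_samples=5, rng=rng)
```

Each row of `z` is one candidate embedding. Samples drawn from different mixture components lie far apart in the embedding space, which is what would let the decoded colorizations differ from one another rather than collapse to a single average.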
