Deep Unsupervised Intrinsic Image Decomposition by Siamese Training

We adapt modern intrinsic decomposition tools based on deep learning to increase their applicability to real-world use cases. Traditional techniques are derived from the Retinex theory: handcrafted prior assumptions constrain an optimization to yield a unique solution that is qualitatively satisfying on a limited set of examples. Modern techniques based on supervised deep learning leverage large-scale databases that are usually synthetic or sparsely annotated. Decomposition quality on images in the wild is therefore questionable. We propose an end-to-end deep learning solution that can be trained without any ground truth supervision, which is hard to obtain. Time-lapses form a ubiquitous source of data that, under a static-scene assumption, captures a constant albedo under varying shading conditions. We exploit this natural relationship to train in an unsupervised siamese manner on image pairs. The trained network nevertheless applies to single images at inference time. We present a new dataset on which to demonstrate our siamese training, and reach results that compete with the state of the art despite the unsupervised nature of our training scheme. As evaluation is difficult, we rely on extensive experiments to analyze the strengths and weaknesses of our method and of related methods.
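The constraint that a time-lapse pair shares one albedo can be illustrated with a small numerical sketch. It is not the training objective of the paper, which the abstract does not specify. It assumes the common multiplicative model (image = albedo × shading, element-wise), and the function names, the three loss terms, and the weight `w_albedo` are all hypothetical choices made for illustration.

```python
import numpy as np

def reconstruct(albedo, shading):
    """Recombine layers under the assumed model: image = albedo * shading."""
    return albedo * shading

def siamese_pair_loss(img1, img2, a1, s1, a2, s2, w_albedo=1.0):
    """Illustrative unsupervised loss for two frames of one static scene.

    (a1, s1) and (a2, s2) stand for the albedo/shading predictions that a
    shared-weight network would produce for img1 and img2 respectively.
    No ground-truth albedo or shading is used anywhere.
    """
    # Each prediction should reproduce its own input image.
    rec = (np.mean((reconstruct(a1, s1) - img1) ** 2)
           + np.mean((reconstruct(a2, s2) - img2) ** 2))
    # Swapping albedos between frames should still reproduce each image,
    # because the scene albedo is assumed constant over time.
    cross = (np.mean((reconstruct(a2, s1) - img1) ** 2)
             + np.mean((reconstruct(a1, s2) - img2) ** 2))
    # The two predicted albedos should agree.
    consistency = np.mean((a1 - a2) ** 2)
    return rec + cross + w_albedo * consistency

# Toy pair: one albedo, two different grayscale shadings.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.1, 1.0, size=(8, 8, 3))
shade1 = rng.uniform(0.2, 1.0, size=(8, 8, 1))
shade2 = rng.uniform(0.2, 1.0, size=(8, 8, 1))
img1, img2 = reconstruct(albedo, shade1), reconstruct(albedo, shade2)

# A correct decomposition satisfies every term.
loss_true = siamese_pair_loss(img1, img2, albedo, shade1, albedo, shade2)

# A degenerate decomposition (albedo = image, shading = 1) reconstructs each
# frame perfectly but violates the shared-albedo terms.
ones = np.ones((8, 8, 1))
loss_trivial = siamese_pair_loss(img1, img2, img1, ones, img2, ones)
```

In this toy setting the correct decomposition drives the loss to zero, while the degenerate one is penalized only by the pairwise terms. That is the kind of signal a single image cannot provide on its own. The network would still see one image per branch, which is consistent with single-image inference.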
