论文信息 - CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering

CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering

Intrinsic image decomposition is a challenging, long-standing computer vision problem for which ground truth data is very difficult to acquire. We explore the use of synthetic data for training CNN-based intrinsic image decomposition models, then applying these learned models to real-world images. To that end, we present CGIntrinsics, a new, large-scale dataset of physically-based rendered images of scenes with full ground truth decompositions. The rendering process we use is carefully designed to yield high-quality, realistic images, which we find to be crucial for this problem domain. We also propose a new end-to-end training method that learns better decompositions by leveraging CGIntrinsics, and optionally IIW and SAW, two recent datasets of sparse annotations on real-world images. Surprisingly, we find that a decomposition network trained solely on our synthetic data outperforms the state-of-the-art on both IIW and SAW, and performance improves even further when IIW and SAW data is added during training. Our work demonstrates the suprising effectiveness of carefully-rendered synthetic data for the intrinsic images task.

Zhengqi Li | Noah Snavely | Noah Snavely | Zhengqi Li

[1] Adolfo Muñoz,et al. Intrinsic Images by Clustering , 2012, Comput. Graph. Forum.

[2] Jitendra Malik,et al. Shape, Illumination, and Reflectance from Shading , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Noah Snavely,et al. Intrinsic images in the wild , 2014, ACM Trans. Graph..

[4] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[5] Balazs Kovacs,et al. Shading Annotations in the Wild , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Jonathan T. Barron,et al. Fast bilateral-space stereo for synthetic defocus , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Balazs Kovacs,et al. Intrinsic Decompositions for Image Editing , 2017, Comput. Graph. Forum.

[8] Derek Hoiem,et al. Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[9] Stella X. Yu,et al. Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10] Qiao Wang,et al. VirtualWorlds as Proxy for Multi-object Tracking Analysis , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Jitendra Malik,et al. Intrinsic Scene Properties from a Single RGB-D Image , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[12] Yizhou Yu,et al. An L1 image transform for edge-preserving smoothing and scene-level intrinsic decomposition , 2015, ACM Trans. Graph..

[13] Leonidas J. Guibas,et al. ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[14] Peter V. Gehler,et al. Reflectance Adaptive Filtering Improves Intrinsic Image Estimation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Michael J. Black,et al. A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[16] Zhengqi Li,et al. Learning Intrinsic Image Decomposition from Watching the World , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17] E. Land,et al. Lightness and retinex theory. , 1971, Journal of the Optical Society of America.

[18] Rob Fergus,et al. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.

[19] Antonio M. López,et al. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[21] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Edward H. Adelson,et al. Ground truth dataset and baseline evaluations for intrinsic image algorithms , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[23] Stephen Lin,et al. A Closed-Form Solution to Retinex with Nonlocal Texture Constraints , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24] Vladlen Koltun,et al. A Simple Model for Intrinsic Image Decomposition with Depth Cues , 2013, 2013 IEEE International Conference on Computer Vision.

[25] Jian Shi,et al. Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Alexei A. Efros,et al. Learning Data-Driven Reflectance Priors for Intrinsic Image Decomposition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27] Joost van de Weijer,et al. Intrinsic image evaluation on synthetic complex scenes , 2013, 2013 IEEE International Conference on Image Processing.

[28] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29] Vladlen Koltun,et al. Playing for Benchmarks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30] Stephen Lin,et al. Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields , 2016, ECCV.

[31] Peter V. Gehler,et al. Recovering Intrinsic Images with a Global Sparsity Prior on Reflectance , 2011, NIPS.

[32] Seungyong Lee,et al. Intrinsic Image Decomposition Using Structure-Texture Separation and Surface Normals , 2014, ECCV.

[33] Ersin Yumer,et al. Neural Face Editing with Intrinsic Image Disentangling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Chuohao Yeo,et al. Intrinsic images decomposition using a local and global sparse representation of reflectance , 2011, CVPR 2011.

[35] Vladlen Koltun,et al. Playing for Data: Ground Truth from Computer Games , 2016, ECCV.

[36] Erik Reinhard,et al. Photographic tone reproduction for digital images , 2002, ACM Trans. Graph..

[37] Ersin Yumer,et al. Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Pascal Fua,et al. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39] William T. Freeman,et al. Learning Ordinal Relationships for Mid-Level Vision , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[40] Thomas A. Funkhouser,et al. Semantic Scene Completion from a Single Depth Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Stella X. Yu,et al. Learning lightness from human judgement on relative reflectance , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42] Jiajun Wu,et al. Self-Supervised Intrinsic Image Decomposition , 2017, NIPS.