Scene Intrinsics and Depth from a Single Image

Intrinsic image decomposition factorizes an observed image into its physical causes. This is most commonly framed as a decomposition into reflectance and shading, although recent progress has made full decompositions into shape, illumination, reflectance, and shading possible. However, existing factorization approaches require depth sensing to initialize the optimization of scene intrinsics. Rather than relying on depth sensors, we show that depth estimated purely from monocular appearance can provide sufficient cues for intrinsic image analysis. Our full intrinsic pipeline regresses depth by a fully convolutional network then jointly optimizes the intrinsic factorization to recover the input image. This combination yields full decompositions by uniting feature learning through deep network regression with physical modeling through statistical priors and random field regularization. This work demonstrates the first pipeline for full intrinsic decomposition of scenes from a single color image input alone.

[1]  J. Koenderink,et al.  Perturbation Study of Shading in Pictures , 1996, Perception.

[2]  Ce Liu,et al.  Depth Transfer: Depth Extraction from Video Using Non-Parametric Sampling , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  H. Barrow,et al.  RECOVERING INTRINSIC SCENE CHARACTERISTICS FROM IMAGES , 1978 .

[4]  David J. Kriegman,et al.  The Bas-Relief Ambiguity , 2004, International Journal of Computer Vision.

[5]  Ping-Sing Tsai,et al.  Shape from Shading: A Survey , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Jitendra Malik,et al.  Shape, Illumination, and Reflectance from Shading , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  William T. Freeman,et al.  Learning Local Evidence for Shading and Reflectance , 2001, ICCV.

[9]  ShenChunhua,et al.  Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields , 2016 .

[10]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Jitendra Malik,et al.  Aligning 3D models to RGB-D images of cluttered scenes , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Vladlen Koltun,et al.  A Simple Model for Intrinsic Image Decomposition with Depth Cues , 2013, 2013 IEEE International Conference on Computer Vision.

[13]  Geoffrey E. Hinton,et al.  Deep Lambertian Networks , 2012, ICML.

[14]  Katsushi Ikeuchi,et al.  Numerical Shape from Shading and Occluding Boundaries , 1981, Artif. Intell..

[15]  Berthold K. P. Horn SHAPE FROM SHADING: A METHOD FOR OBTAINING THE SHAPE OF A SMOOTH OPAQUE OBJECT FROM ONE VIEW , 1970 .

[16]  Jitendra Malik,et al.  Intrinsic Scene Properties from a Single RGB-D Image , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Edward H. Adelson,et al.  Recovering intrinsic images from a single image , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  K. Hohn,et al.  Determining Lightness from an Image , 2004 .

[19]  Noah Snavely,et al.  Intrinsic images in the wild , 2014, ACM Trans. Graph..

[20]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[21]  J J Koenderink,et al.  What Does the Occluding Contour Tell Us about Solid Shape? , 1984, Perception.

[22]  Paul Debevec,et al.  Inverse global illumination: Recovering re?ectance models of real scenes from photographs , 1998 .

[23]  Carsten Rother,et al.  Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation , 2013, NIPS.

[24]  Andrew P. Witkin,et al.  Shape from Contour , 1980 .

[25]  E. Land,et al.  Lightness and retinex theory. , 1971, Journal of the Optical Society of America.

[26]  Alexei A. Efros,et al.  Geometric context from a single image , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Joshua B. Tenenbaum,et al.  Deep Convolutional Inverse Graphics Network , 2015, NIPS.

[28]  Edward H. Adelson,et al.  Ground truth dataset and baseline evaluations for intrinsic image algorithms , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[29]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.