Automatic Scene Inference for 3D Object Compositing

We present a user-friendly image editing system that supports drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo, and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results and achieves the same level of realism as techniques that require significant user interaction.
