Paper information - Automatic Scene Inference for 3D Object Compositing

Automatic Scene Inference for 3D Object Compositing

We present a user-friendly image editing system that supports drag-and-drop object insertion (the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo, and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results and achieves the same level of realism as techniques that require significant user interaction.
