Virtual rephotography: novel view prediction error for 3D reconstruction

The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is a poor predictor of visual accuracy. Furthermore, geometric accuracy alone cannot be used to evaluate systems that either lack a geometric scene representation or rely on coarse proxy geometry, such as light fields and most image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that can analyze the visual quality of any method that renders novel views from input images. One key advantage of this approach is that it does not require ground truth geometry, which dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the reconstruction quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods, including standard geometry-plus-texture pipelines and image-based rendering techniques; compare it to existing geometry-based benchmarks; demonstrate its utility for a range of use cases; and present a new virtual rephotography-based benchmark for image-based modeling and rendering systems.
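The idea of novel view prediction error can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: views are evaluated leave-one-out (each photograph is held out in turn and re-rendered from the remaining ones), images are small grayscale arrays, and mean absolute pixel difference stands in for whatever image difference metric an actual evaluation would use. The names `mean_abs_error`, `novel_view_prediction_error`, and `render_view` are hypothetical and introduced here only for illustration.

```python
from typing import Callable, List, Sequence

# Grayscale image as rows of pixel intensities (an assumption for this sketch).
Image = Sequence[Sequence[float]]


def mean_abs_error(rendered: Image, photo: Image) -> float:
    """Mean absolute per-pixel difference; a stand-in image metric."""
    total = 0.0
    count = 0
    for row_r, row_p in zip(rendered, photo):
        for r, p in zip(row_r, row_p):
            total += abs(r - p)
            count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


def novel_view_prediction_error(
    photos: Sequence[Image],
    poses: Sequence[object],
    render_view: Callable[[List[Image], List[object], object], Image],
) -> List[float]:
    """Hold out each photo, re-render it from the rest, and score the result.

    `render_view` is a hypothetical callable wrapping any system that can
    synthesize a view at a target pose from the remaining input images.
    No ground truth geometry is needed, only the photographs themselves.
    """
    photos = list(photos)
    poses = list(poses)
    errors = []
    for i in range(len(photos)):
        train_photos = photos[:i] + photos[i + 1:]
        train_poses = poses[:i] + poses[i + 1:]
        rendered = render_view(train_photos, train_poses, poses[i])
        errors.append(mean_abs_error(rendered, photos[i]))
    return errors


def average_renderer(train_photos, train_poses, target_pose):
    """Toy renderer that ignores poses and averages the training photos."""
    n = len(train_photos)
    height = len(train_photos[0])
    width = len(train_photos[0][0])
    return [
        [sum(p[y][x] for p in train_photos) / n for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    toy_photos = [[[0.0, 0.0]], [[1.0, 1.0]], [[2.0, 2.0]]]
    toy_poses = [0, 1, 2]
    print(novel_view_prediction_error(toy_photos, toy_poses, average_renderer))
```

Because the renderer is passed in as a callable, the same loop applies unchanged to a geometry-plus-texture pipeline or to an image-based rendering method, which is the sense in which the evaluation is unified.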
