Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real-world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm. In practice, we apply this bound to capture and render views of real-world scenes that achieve the perceptual quality of Nyquist rate view sampling while using up to 4000× fewer views. We demonstrate our approach's practicality with an augmented reality smartphone app that guides users to capture input images of a scene and viewers that enable real-time virtual exploration on desktop and mobile platforms.
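The abstract names two rendering steps: expanding each input view into a local light field via an MPI, and blending renderings from adjacent local light fields at the novel viewpoint. The minimal NumPy sketch below illustrates only the generic mechanics of those steps, i.e. standard back-to-front MPI compositing and a simple weighted blend of per-MPI renderings. The plane warping into the novel view, the network that predicts each MPI, and the exact blending weights used in the paper are omitted; the function names and the distance-based weighting are illustrative assumptions, not the paper's implementation.

import numpy as np

def composite_mpi(rgba_planes):
    # Back-to-front "over" compositing of multiplane-image (MPI) layers.
    # rgba_planes: iterable of (H, W, 4) float arrays, ordered from the
    # farthest plane to the nearest; alpha values lie in [0, 1].
    out = np.zeros(rgba_planes[0].shape[:2] + (3,), dtype=np.float64)
    for plane in rgba_planes:
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = alpha * rgb + (1.0 - alpha) * out
    return out

def blend_adjacent_renderings(renderings, weights):
    # Weighted average of novel-view renderings, one per neighboring local
    # light field (MPI). The weights are placeholders, e.g. inversely
    # proportional to the distance from the novel viewpoint to each
    # source view; the paper's actual blending weights may differ.
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    return sum(w * r for w, r in zip(weights, renderings))

For example, a novel view between two captured views would be formed by warping and compositing each neighboring MPI with composite_mpi, then combining the two results with blend_adjacent_renderings using weights that favor the nearer source view.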
