Local Light Field Fusion

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real-world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm. In practice, we apply this bound to capture and render views of real-world scenes that achieve the perceptual quality of Nyquist-rate view sampling while using up to 4000× fewer views. We demonstrate our approach's practicality with an augmented reality smartphone app that guides users to capture input images of a scene, and with viewers that enable real-time virtual exploration on desktop and mobile platforms.
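To make the rendering pipeline concrete, the sketch below illustrates the two core operations in NumPy: alpha-compositing an MPI's RGBA planes into an image, and blending renderings of the same novel view produced from two adjacent local light fields. This is a simplified illustration, not the paper's implementation; the function names, the omission of per-plane homography warping into the novel camera, and the distance-based blending weight are all assumptions made for the example.

```python
import numpy as np

def composite_mpi(mpi_rgba):
    """Alpha-composite an MPI's fronto-parallel RGBA planes back to front
    (standard 'over' operator) to produce an RGB image.

    mpi_rgba: float array of shape (D, H, W, 4), planes ordered from
    farthest (index 0) to nearest (index D - 1), values in [0, 1].
    Rendering a true novel view would first warp each plane into the
    target camera via its plane homography; that step is omitted here.
    """
    out = np.zeros(mpi_rgba.shape[1:3] + (3,), dtype=np.float32)
    for plane in mpi_rgba:  # back to front
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out

def blend_local_light_fields(render_a, render_b, weight_a):
    """Blend two renderings of the same novel view, each produced from a
    neighboring local light field (MPI). weight_a would typically fall
    off with the novel camera's distance from MPI A's reference camera,
    so the nearer light field dominates."""
    return weight_a * render_a + (1.0 - weight_a) * render_b

if __name__ == "__main__":
    # Toy usage: two random 8-plane MPIs at 64x64 resolution.
    rng = np.random.default_rng(0)
    mpi_a = rng.random((8, 64, 64, 4)).astype(np.float32)
    mpi_b = rng.random((8, 64, 64, 4)).astype(np.float32)
    novel_view = blend_local_light_fields(composite_mpi(mpi_a),
                                          composite_mpi(mpi_b),
                                          weight_a=0.5)
    print(novel_view.shape)  # (64, 64, 3)
```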
