Foveated Neural Radiance Fields for Real-Time and Egocentric Virtual Reality

Fig. 1. Illustration of our gaze-contingent neural synthesis method. (a) Visualization of our active-viewing-tailored egocentric neural scene representation. (b) Our method enables very compact storage of high-quality 3D scene assets while achieving high perceptual image quality and low latency for interactive virtual reality. Here, we compare our method with the state-of-the-art neural synthesis method [Mildenhall et al. 2020] and with foveated rendering solutions [Patney et al. 2016; Perry and Geisler 2002], and demonstrate that it reduces rendering time by more than 99% (from 9 s to 20 ms) and shrinks data storage from a 100 MB mesh to a 0.5 MB neural model for first-person immersive viewing. The orange circle indicates the viewer's gaze position. (c) Zoomed-in views of the foveal images generated by the various methods, showing that our method achieves higher perceptual image fidelity than existing methods.
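To make the gaze-contingent compositing idea concrete, below is a minimal sketch of eccentricity-based blending of a high-resolution foveal render with a low-resolution peripheral render, in the spirit of Perry and Geisler [2002]. This is an illustrative assumption, not the paper's actual pipeline: the function names (`eccentricity_map`, `foveated_composite`), the constant pixels-per-degree model, and the 5°/15° blend bounds are all placeholders introduced here.

```python
# Illustrative sketch (not the paper's implementation): gaze-contingent
# blending of a high-resolution foveal render with a low-resolution
# peripheral render, in the spirit of Perry and Geisler [2002].
# All function names and parameter values are assumptions.
import numpy as np

def eccentricity_map(h, w, gaze_xy, ppd):
    """Angular eccentricity (degrees) of each pixel from the gaze point.

    ppd: pixels per degree of the display, assumed constant across the
    field of view; a real HMD would use its lens/display model instead.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    dist_px = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1])
    return dist_px / ppd

def foveated_composite(foveal, peripheral, gaze_xy, ppd,
                       inner_deg=5.0, outer_deg=15.0):
    """Blend two full-resolution images by eccentricity.

    `peripheral` is assumed to be a low-detail render already upsampled
    to the same (h, w, 3) shape as `foveal`. Inside `inner_deg` the
    foveal image is shown unchanged; beyond `outer_deg` only the
    peripheral image is used; in between we interpolate linearly.
    The 5/15 degree bounds are placeholders, not values from the paper.
    """
    h, w = foveal.shape[:2]
    ecc = eccentricity_map(h, w, gaze_xy, ppd)
    t = np.clip((ecc - inner_deg) / (outer_deg - inner_deg), 0.0, 1.0)
    return foveal * (1.0 - t[..., None]) + peripheral * t[..., None]

# Example: 1080p frame, gaze near image center, ~15 pixels per degree.
if __name__ == "__main__":
    foveal = np.random.rand(1080, 1920, 3)
    peripheral = np.random.rand(1080, 1920, 3)
    out = foveated_composite(foveal, peripheral, gaze_xy=(960, 540), ppd=15.0)
    print(out.shape)  # (1080, 1920, 3)
```

In a real HMD pipeline the pixels-per-degree term would come from the lens and display calibration, and the blend would run per frame on the GPU to stay within the roughly 20 ms budget quoted in the caption.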

[1] Stephen DiVerdi, et al. Deep Multi Depth Panoramas for View Synthesis, 2020, ECCV.

[2] Gordon Wetzstein, et al. Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations, 2019, NeurIPS.

[3] K. Hormann, et al. Multi-Scale Geometry Interpolation, 2010, Comput. Graph. Forum.

[4] Harry Shum, et al. Review of image-based rendering techniques, 2000, Visual Communications and Image Processing.

[5] Desney S. Tan, et al. Foveated 3D graphics, 2012, ACM Trans. Graph.

[6] Sebastian Nowozin, et al. Occupancy Networks: Learning 3D Reconstruction in Function Space, 2019, CVPR.

[7] Qinbo Li, et al. Synthesizing light field from a single image with variable MPI and two network fusion, 2020, ACM Trans. Graph.

[8] Eddy Ilg, et al. Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction, 2020, ECCV.

[9] Ersin Yumer, et al. Transformation-Grounded Image Generation Network for Novel 3D View Synthesis, 2017, CVPR.

[10] Joohwan Kim, et al. Towards foveated rendering for gaze-tracked virtual reality, 2016, ACM Trans. Graph.

[11] Chongyang Ma, et al. Deep Generative Modeling for Scene Synthesis via Hybrid Representations, 2018, ACM Trans. Graph.

[12] Joohwan Kim, et al. Latency Requirements for Foveated Rendering in Virtual Reality, 2017, ACM Trans. Appl. Percept.

[13] Richard A. Newcombe, et al. DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, 2019, CVPR.

[14] M. Zollhöfer, et al. PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations, 2020, ECCV.

[15] Marc Levoy, et al. Light field rendering, 1996, SIGGRAPH.

[16] Gordon Wetzstein, et al. Implicit Neural Representations with Periodic Activation Functions, 2020, NeurIPS.

[17] Wilson S. Geisler, et al. Gaze-contingent real-time simulation of arbitrary visual fields, 2002, IS&T/SPIE Electronic Imaging.

[18] James Tompkin, et al. MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images, 2020, ECCV.

[19] Deva Ramanan, et al. Towards Streaming Perception, 2020, ECCV.

[20] Diego Gutierrez, et al. Motion parallax for 360° RGBD video, 2019, IEEE Trans. Vis. Comput. Graph.

[21] Albert Parra Pozo, et al. An integrated 6DoF video camera and system design, 2019, ACM Trans. Graph.

[22] Joohwan Kim, et al. Perceptually-guided foveation for light field displays, 2017, ACM Trans. Graph.

[23] Richard Szeliski, et al. The lumigraph, 1996, SIGGRAPH.

[24] The magnitude of stereopsis in peripheral visual fields, 2012.

[25] Alexei A. Efros, et al. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric, 2018, CVPR.

[26] Jan Kautz, et al. Extreme View Synthesis, 2019, ICCV.

[27] Hans-Peter Seidel, et al. Luminance-contrast-aware foveated rendering, 2019, ACM Trans. Graph.

[28] Yaron Lipman, et al. SAL: Sign Agnostic Learning of Shapes From Raw Data, 2020, CVPR.

[29] Li-Yi Wei, et al. Eccentricity effects on blur and depth perception, 2017, Optics Express.

[30] Hao Li, et al. PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization, 2019, ICCV.

[31] Andreas Geiger, et al. Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics, 2019, ICCV.

[32] Gordon Wetzstein, et al. DeepVoxels: Learning Persistent 3D Feature Embeddings, 2019, CVPR.

[33] Ting-Chun Wang, et al. Learning-based view synthesis for light field cameras, 2016, ACM Trans. Graph.

[34] Yaron Lipman, et al. Implicit Geometric Regularization for Learning Shapes, 2020, ICML.

[35] Andreas Geiger, et al. Texture Fields: Learning Texture Representations in Function Space, 2019, ICCV.

[36] Pratul P. Srinivasan, et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, 2020, ECCV.

[37] Ronen Basri, et al. Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance, 2020, NeurIPS.

[38] Paul Debevec, et al. Immersive light field video with a layered mesh representation, 2020, ACM Trans. Graph.

[39] Gordon Wetzstein, et al. MetaSDF: Meta-learning Signed Distance Functions, 2020, NeurIPS.

[40] Ravi Ramamoorthi, et al. Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines, 2019, ACM Trans. Graph.