Paper Information - PlenOctrees for Real-time Rendering of Neural Radiance Fields

We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation that supports view-dependent effects. Our method can render 800×800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality, while preserving the ability of NeRFs to perform free-viewpoint rendering of scenes with arbitrary geometry and view-dependent effects. Real-time performance is achieved by pre-tabulating the NeRF into a PlenOctree. To preserve view-dependent effects such as specularities, we factorize the appearance via closed-form spherical basis functions. Specifically, we show that it is possible to train NeRFs to predict a spherical harmonic representation of radiance, removing the viewing direction as an input to the neural network. Furthermore, we show that PlenOctrees can be directly optimized to further minimize the reconstruction loss, which leads to equal or better quality compared to competing methods. Moreover, this octree optimization step can be used to reduce the training time, as we no longer need to wait for the NeRF training to converge fully. Our real-time neural rendering approach may enable new applications such as 6-DOF industrial and product visualizations, as well as next-generation AR/VR systems. PlenOctrees are amenable to in-browser rendering as well; please visit the project page for the interactive online demo, as well as video and code: https://alexyu.net/plenoctrees.
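
The spherical harmonic factorization mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each stored sample holds one set of spherical harmonic coefficients per color channel, and that the color for a viewing direction is the weighted sum of the basis functions passed through a sigmoid. The degree-1 truncation, the sign convention of the basis, the sigmoid, and the names `sh_basis_deg1` and `eval_color` are illustrative assumptions, not details taken from the abstract.

```python
import math


def sh_basis_deg1(direction):
    """Real spherical harmonic basis up to degree 1 (4 functions).

    Sign conventions differ between libraries; this sketch omits the
    Condon-Shortley phase (an assumption made for illustration).
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c0 = 0.5 * math.sqrt(1.0 / math.pi)      # constant (degree 0) term
    c1 = math.sqrt(3.0 / (4.0 * math.pi))    # scale of the degree 1 terms
    return [c0, c1 * y, c1 * z, c1 * x]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def eval_color(sh_coeffs, direction):
    """Evaluate an RGB color for one viewing direction.

    sh_coeffs: three lists (R, G, B) of 4 coefficients each.
    The viewing direction only enters through the closed-form basis,
    so no network evaluation is needed at render time.
    """
    basis = sh_basis_deg1(direction)
    return [
        sigmoid(sum(k * b for k, b in zip(channel, basis)))
        for channel in sh_coeffs
    ]


if __name__ == "__main__":
    # Hypothetical coefficients: a base color plus a z-dependent highlight.
    coeffs = [[1.0, 0.0, 2.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [-1.0, 0.0, 0.0, 0.0]]
    print(eval_color(coeffs, (0.0, 0.0, 1.0)))   # seen from +z
    print(eval_color(coeffs, (0.0, 0.0, -1.0)))  # seen from -z
```

In this sketch only the red channel changes between the two views, because it is the only channel with a non-zero degree-1 coefficient; the constant term alone yields a view-independent color.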
