Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields

The rendering procedure used by neural radiance fields (NeRF) samples a scene with a single ray per pixel and may therefore produce renderings that are excessively blurred or aliased when training or testing images observe scene content at different resolutions. The straightforward solution of supersampling by rendering with multiple rays per pixel is impractical for NeRF, because rendering each ray requires querying a multilayer perceptron hundreds of times. Our solution, which we call “mip-NeRF” (à la “mipmap”), extends NeRF to represent the scene at a continuously-valued scale. By efficiently rendering anti-aliased conical frustums instead of rays, mip-NeRF reduces objectionable aliasing artifacts and significantly improves NeRF’s ability to represent fine details, while also being 7% faster than NeRF and half the size. Compared to NeRF, mip-NeRF reduces average error rates by 17% on the dataset presented with NeRF and by 60% on a challenging multiscale variant of that dataset that we present. Mip-NeRF is also able to match the accuracy of a brute-force supersampled NeRF on our multiscale dataset while being 22× faster.
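The core idea of representing scale continuously can be sketched with the paper's integrated positional encoding: each conical frustum is approximated by a Gaussian (mean and per-axis variance), and the usual sin/cos features are attenuated at frequencies whose period is small relative to that variance. The sketch below is a minimal NumPy illustration of that featurization, not the full mip-NeRF renderer; the function name and `num_freqs` default are illustrative choices.

```python
import numpy as np

def integrated_pos_enc(mu, var, num_freqs=4):
    """Encode a Gaussian (mean `mu`, marginal variance `var`, both shape (D,))
    that approximates a conical frustum.

    Unlike NeRF's point-based sin/cos encoding, each frequency band is damped
    by exp(-0.5 * scale^2 * var), so bands whose wavelength is small compared
    to the frustum's extent decay toward zero -- the anti-aliasing behavior.
    """
    scales = 2.0 ** np.arange(num_freqs)                  # 1, 2, 4, ...
    scaled_mu = mu[..., None, :] * scales[:, None]        # (..., F, D)
    scaled_var = var[..., None, :] * scales[:, None] ** 2
    damping = np.exp(-0.5 * scaled_var)                   # large variance -> ~0
    feats = np.concatenate(
        [damping * np.sin(scaled_mu), damping * np.cos(scaled_mu)], axis=-1
    )
    return feats.reshape(*mu.shape[:-1], -1)              # (..., 2 * F * D)
```

A pixel's frustum far from the camera has large variance, so its high-frequency features vanish and the MLP sees a smooth, prefiltered query; a nearby frustum keeps its detail. This is what lets one network serve all scales without supersampling.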
