MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs

Neural Radiance Field (NeRF) has broken new ground in novel view synthesis thanks to its simple formulation and state-of-the-art rendering quality. However, its performance degrades severely unless it is trained with a dense set of images captured from diverse camera poses, which hinders practical applications. Although previous methods addressing this problem achieved promising results, they rely heavily on additional training resources, which runs counter to the goal of sparse-input novel view synthesis: training efficiency. In this work, we propose MixNeRF, an effective training strategy for novel view synthesis from sparse inputs that models a ray with a mixture density model. MixNeRF estimates the joint distribution of RGB colors along the samples of a ray by modeling it as a mixture of distributions. We also introduce ray depth estimation as an auxiliary training objective, which is highly correlated with 3D scene geometry. Moreover, we remodel the colors with blending weights regenerated from the estimated ray depth, which further improves robustness to shifts in color and viewpoint. MixNeRF outperforms other state-of-the-art methods on standard benchmarks while being more efficient at both training and inference.
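To make the mixture-density idea concrete, below is a minimal sketch of a per-ray mixture negative log-likelihood in PyTorch. It assumes (as an illustration, not the paper's exact formulation) that each ray sample contributes a Laplacian component whose mean is the sample's predicted RGB and whose mixing coefficient is the normalized volume-rendering blending weight; the function name mixture_nll and all tensor shapes are hypothetical.

import torch

def mixture_nll(gt_rgb, sample_rgbs, sample_scales, blend_weights, eps=1e-8):
    # gt_rgb:        (B, 3)    ground-truth pixel colors
    # sample_rgbs:   (B, N, 3) per-sample RGB means along each ray
    # sample_scales: (B, N, 3) per-sample Laplacian scale parameters (> 0)
    # blend_weights: (B, N)    volume-rendering blending weights, reused as
    #                          mixing coefficients (normalized to sum to 1)
    pi = blend_weights / (blend_weights.sum(dim=-1, keepdim=True) + eps)          # (B, N)
    # Laplacian log-density of the GT color under every mixture component
    diff = (gt_rgb.unsqueeze(1) - sample_rgbs).abs()                              # (B, N, 3)
    log_comp = (-diff / sample_scales - torch.log(2.0 * sample_scales)).sum(-1)   # (B, N)
    # Log-sum-exp over components gives the per-ray mixture log-likelihood
    log_mix = torch.logsumexp(torch.log(pi + eps) + log_comp, dim=-1)             # (B,)
    return -log_mix.mean()

The same construction can be repeated over sample depths instead of colors to obtain a ray depth likelihood, which is how an auxiliary depth objective of the kind described above could be realized.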
