Paper information - NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs. We build on Neural Radiance Fields (NeRF), which uses the weights of a multi-layer perceptron to model the density and color of a scene as a function of 3D coordinates. While NeRF works well on images of static subjects captured under controlled settings, it is incapable of modeling many ubiquitous, real-world phenomena in uncontrolled images, such as variable illumination or transient occluders. We introduce a series of extensions to NeRF to address these issues, thereby enabling accurate reconstructions from unstructured image collections taken from the internet. We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.
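As a rough illustration of the idea that a multi-layer perceptron can model density and color as a function of 3D coordinates, here is a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random (untrained) weights, the softplus and sigmoid output activations, and the names `init_mlp` and `radiance_field` are all assumptions made for this example, and it leaves out the NeRF-W extensions as well as any training or rendering step.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (sizes are illustrative)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def radiance_field(params, xyz):
    """Map 3D points of shape (N, 3) to density (N,) and RGB color (N, 3)."""
    h = xyz
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)       # ReLU hidden layers
    w, b = params[-1]
    out = h @ w + b                          # 4 raw outputs per point
    # Assumed output activations: softplus keeps density non-negative,
    # sigmoid keeps each color channel between 0 and 1.
    density = np.logaddexp(0.0, out[:, 0])
    rgb = 1.0 / (1.0 + np.exp(-out[:, 1:]))
    return density, rgb

rng = np.random.default_rng(0)
params = init_mlp(rng, [3, 64, 64, 4])           # hypothetical architecture
points = rng.uniform(-1.0, 1.0, size=(5, 3))     # five sample 3D coordinates
density, rgb = radiance_field(params, points)
print(density.shape, rgb.shape)
```

In this sketch the scene would be stored entirely in `params`: querying any 3D point returns one density value and one RGB triple, which is the property the abstract attributes to NeRF.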
