Paper information - SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

Despite the great success of 2D editing with user-friendly tools such as Photoshop, semantic strokes, or even text prompts, comparable capabilities in 3D remain limited: existing approaches either rely on 3D modeling skills or support editing within only a few categories. In this paper, we present a novel semantic-driven NeRF editing approach that enables users to edit a neural radiance field with a single image and faithfully delivers edited novel views with high fidelity and multi-view consistency. To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space, and develop a series of techniques to aid the editing process: cyclic constraints with a proxy mesh to facilitate geometric supervision, a color compositing mechanism to stabilize semantic-driven texture editing, and a feature-cluster-based regularization to keep irrelevant content unchanged. Extensive experiments and editing examples on both real-world and synthetic data demonstrate that our method achieves photo-realistic 3D editing using only a single edited image, pushing the boundary of semantic-driven editing in real-world 3D scenes. Project webpage: https://zju3dv.github.io/sine/
