Scene-Generalizable Interactive Segmentation of Radiance Fields

Existing methods for interactive segmentation in radiance fields entail scene-specific optimization and thus cannot generalize across different scenes, which greatly limits their applicability. In this work, we make the first attempt at Scene-Generalizable Interactive Segmentation in Radiance Fields (SGISRF) and propose a novel SGISRF method that performs 3D object segmentation for novel (unseen) scenes represented by radiance fields, guided by only a few interactive user clicks in a given set of multi-view 2D images. In particular, the proposed SGISRF addresses three crucial challenges with three specially designed techniques. First, we devise Cross-Dimension Guidance Propagation to encode the scarce 2D user clicks into informative 3D guidance representations. Second, the Uncertainty-Eliminated 3D Segmentation module is designed to achieve efficient yet effective 3D segmentation. Third, the Concealment-Revealed Supervised Learning scheme is proposed to reveal and correct the concealed 3D segmentation errors that result from supervising in 2D space with only 2D mask annotations. Extensive experiments on two challenging real-world benchmarks covering diverse scenes demonstrate 1) the effectiveness and scene-generalizability of the proposed method, and 2) favorable performance compared to a classical method requiring scene-specific optimization.
