FreePoint: Unsupervised Point Cloud Instance Segmentation

Instance segmentation of point clouds is a crucial task in the 3D field, with numerous applications that involve localizing and segmenting objects in a scene. However, achieving satisfactory results requires a large number of manual annotations, which are time-consuming and expensive to produce. To reduce this dependency on annotations, we propose FreePoint, a method for the underexplored task of unsupervised class-agnostic instance segmentation on point clouds. Specifically, we represent each point by a feature that combines its coordinates, color, normal, and self-supervised deep features. Based on these point features, we apply a multicut algorithm to segment point clouds into coarse instance masks, which serve as pseudo labels for training a point cloud instance segmentation model. To mitigate the inaccuracy of the coarse masks during training, we propose a weakly-supervised training strategy and a corresponding loss. Our work can also serve as an unsupervised pre-training pretext task for supervised semantic instance segmentation with limited annotations. For class-agnostic instance segmentation on point clouds, FreePoint substantially narrows the gap to its fully-supervised counterpart based on the state-of-the-art instance segmentation model Mask3D, and even surpasses some previous fully-supervised methods. When used as a pretext task followed by fine-tuning on S3DIS, FreePoint outperforms training from scratch by 5.8% AP with only 10% of the mask annotations.
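The pseudo-label step described above (combine per-point cues, then partition the points into coarse instances) can be illustrated with a minimal sketch. This is not the FreePoint implementation: the function names, the per-cue weights, and the distance threshold are hypothetical, and a simple thresholded connected-components grouping stands in for a real multicut solver.

```python
import math


def build_point_features(coords, colors, normals, deep, weights=(1.0, 1.0, 1.0, 1.0)):
    """Concatenate coordinates, colors, normals and deep features per point.

    `weights` (hypothetical) scales each cue before concatenation.
    """
    w_xyz, w_rgb, w_n, w_d = weights
    feats = []
    for xyz, rgb, n, d in zip(coords, colors, normals, deep):
        feats.append(
            [w_xyz * v for v in xyz]
            + [w_rgb * v for v in rgb]
            + [w_n * v for v in n]
            + [w_d * v for v in d]
        )
    return feats


def coarse_instances(feats, threshold):
    """Group points whose feature distance is below `threshold`.

    Stand-in for multicut: union-find over thresholded pairwise edges.
    Brute-force O(n^2) pairs, so only suitable for tiny examples.
    """
    parent = list(range(len(feats)))

    def find(i):
        # Find the set representative, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            if math.dist(feats[i], feats[j]) < threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel set representatives as consecutive instance ids.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(feats))]


# Toy scene: two well-separated pairs of points.
coords = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 0.0), (5.1, 5.0, 0.0)]
colors = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
normals = [(0.0, 0.0, 1.0)] * 4
deep = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]

feats = build_point_features(coords, colors, normals, deep)
labels = coarse_instances(feats, threshold=1.0)
print(labels)
```

In this toy scene the two nearby, similarly colored points end up in one pseudo-instance and the other two in a second one. A real pipeline would restrict edges to spatial neighbors and use an actual multicut solver, which decides cuts jointly rather than edge by edge.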
