Paper Information - Zero-shot Point Cloud Segmentation by Transferring Geometric Primitives

Zero-shot Point Cloud Segmentation by Transferring Geometric Primitives

We investigate transductive zero-shot point cloud semantic segmentation, where labels for unseen classes are unavailable during training. 3D geometric elements are essential cues for inferring the type of a 3D object: if two categories share similar geometric primitives, they also have similar semantic representations. Based on this observation, we propose a novel framework that learns the geometric primitives shared by objects of seen and unseen categories and uses them to transfer knowledge from seen to unseen categories. Specifically, a group of learnable prototypes automatically encodes geometric primitives via back-propagation. The visual representation of a point is then formulated as the similarity vector of its feature to the prototypes, which carries semantic cues for both seen and unseen categories. In addition, because a 3D object is composed of multiple geometric primitives, we formulate the semantic representation as a mixture-distributed embedding to enable fine-grained matching with the visual representation. Finally, to learn the geometric primitives effectively and alleviate misclassification, we propose a novel Unknown-aware InfoNCE Loss that aligns the visual and semantic representations. As a result, guided by the semantic representations, the network recognizes novel objects represented with geometric primitives. Extensive experiments show that our method significantly outperforms other state-of-the-art methods in harmonic mean intersection-over-union (hIoU), with improvements of 17.8%, 30.4% and 9.2% on the S3DIS, ScanNet and SemanticKITTI datasets, respectively. Code will be released.
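To make two ideas from the abstract concrete, the following is a minimal sketch and not the authors' implementation. It shows (a) a point's visual representation as a vector of similarities between its feature and a set of prototypes, and (b) the hIoU metric as the harmonic mean of seen-class and unseen-class mIoU. The function names, the array shapes, the choice of cosine similarity, and the random stand-in data are all assumptions made for illustration; in the paper the prototypes are learned via back-propagation, and the abstract does not specify the similarity function.

```python
import numpy as np

def prototype_similarity(features, prototypes):
    """Similarity vector of each point feature to each prototype.

    features:   (N, D) per-point features (hypothetical backbone output)
    prototypes: (K, D) prototypes (learned in the paper; random stand-ins here)
    returns:    (N, K) array, one similarity vector per point

    Cosine similarity is an assumption; the abstract only says "similarity".
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return f @ p.T

def harmonic_miou(miou_seen, miou_unseen):
    """hIoU: harmonic mean of the mIoU over seen and unseen classes."""
    total = miou_seen + miou_unseen
    if total == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2.0 * miou_seen * miou_unseen / total

# Illustrative random data, not results from the paper.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 16))    # 5 points, 16-dim features
prototypes = rng.normal(size=(8, 16))  # 8 prototypes

sim = prototype_similarity(features, prototypes)
print(sim.shape)                 # one 8-dim similarity vector per point
print(harmonic_miou(0.5, 0.5))   # equal seen/unseen scores give the same value back
```

Because hIoU is a harmonic mean, a method that scores well on seen classes but poorly on unseen ones is penalized heavily, which is why it is the headline metric in this setting.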
