VNI-Net: Vector Neurons-based Rotation-Invariant Descriptor for LiDAR Place Recognition

LiDAR-based place recognition plays a crucial role in Simultaneous Localization and Mapping (SLAM) and LiDAR localization. Despite the emergence of various deep learning-based and handcrafted methods, place recognition failure caused by rotation remains a critical challenge. Existing studies address this limitation through either specific training strategies or specific network structures. However, the former does not produce satisfactory results, while the latter focuses mainly on the reduced problem of SO(2) rotation invariance, and the methods that do target SO(3) rotation invariance have limited discriminative capability. In this paper, we propose a new method that employs a Vector Neurons Network (VNN) to achieve SO(3) rotation invariance. We first extract rotation-equivariant features from neighboring points and map the low-dimensional features to a high-dimensional space through the VNN. We then compute Euclidean and cosine distances in the rotation-equivariant feature space and use them as rotation-invariant feature descriptors. Finally, we aggregate these features using GeM pooling to obtain a global descriptor. Because formulating rotation-invariant descriptors causes significant information loss, we propose computing distances between features at different layers within the Euclidean space neighborhood. This greatly improves the discriminability of the point cloud descriptors while keeping computation efficient. Experimental results on public datasets show that our approach significantly outperforms other baseline methods that implement rotation invariance, while achieving results comparable to current state-of-the-art place recognition methods that do not consider rotation.
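The pipeline described above (equivariant features, distance-based invariants, GeM pooling) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor layout `(N, C, 3)`, the pairing of two feature tensors `u` and `v` as stand-ins for two layers, and the GeM exponent `p = 3` are all assumptions made for illustration, and the features are random rather than produced by a VNN. The sketch only shows why Euclidean and cosine distances between rotation-equivariant vector features stay unchanged when the input is rotated, and how GeM pooling turns per-point invariants into one global descriptor.

```python
import numpy as np


def random_rotation(rng):
    """Return a random 3x3 rotation matrix (orthogonal, det = +1)."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def invariant_features(u, v, eps=1e-8):
    """Per-point rotation-invariant features from two equivariant tensors.

    u, v: arrays of shape (N, C, 3), i.e. N points with C vector-valued
    channels each (hypothetical stand-ins for features from two layers).
    Returns an array of shape (N, 2C): Euclidean distances followed by
    cosine distances, computed channel by channel.
    """
    euclid = np.linalg.norm(u - v, axis=-1)
    dot = np.sum(u * v, axis=-1)
    norms = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    cosine_dist = 1.0 - dot / (norms + eps)
    return np.concatenate([euclid, cosine_dist], axis=-1)


def gem_pool(x, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the point axis: (mean(x^p))^(1/p)."""
    return np.mean(np.clip(x, eps, None) ** p, axis=0) ** (1.0 / p)


def global_descriptor(u, v):
    """Aggregate per-point invariants into a single global descriptor."""
    return gem_pool(invariant_features(u, v))


# Demo: rotating every equivariant vector by the same R leaves the
# descriptor unchanged, because norms and inner products are preserved.
rng = np.random.default_rng(0)
u = rng.standard_normal((128, 16, 3))
v = rng.standard_normal((128, 16, 3))
R = random_rotation(rng)

d_original = global_descriptor(u, v)
d_rotated = global_descriptor(u @ R.T, v @ R.T)
print("descriptor shape:", d_original.shape)
print("rotation-invariant:", np.allclose(d_original, d_rotated))
```

Both distances depend only on norms and inner products of the vector features, which an orthogonal transform preserves, so the invariance holds for any rotation in SO(3) and not only for yaw rotations.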
