City-scale continual neural semantic mapping with three-layer sampling and panoptic representation

Neural implicit representations have recently attracted considerable attention in the robotics community because they are expressive, continuous, and compact. However, city-scale continual implicit dense mapping from sparse LiDAR input remains under-explored. To this end, we build a city-scale continual neural mapping system with a panoptic representation that consists of environment-level and instance-level modelling. Given a stream of sparse LiDAR point clouds, it maintains a dynamic generative model that maps 3D coordinates to signed distance field (SDF) values. To address the difficulty of representing geometric information at different levels in city-scale space, we propose a tailored three-layer sampling strategy that dynamically samples the global, local, and near-surface domains. Meanwhile, to achieve high-fidelity mapping of instances under incomplete observation, a category-specific prior is introduced to better model geometric details. We evaluate on the public SemanticKITTI dataset and demonstrate the significance of the proposed three-layer sampling strategy and panoptic representation with both quantitative and qualitative results. Code and models will be made publicly available.
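
As a rough illustration of what sampling the global, local, and near-surface domains could look like for a single LiDAR scan, the following is a minimal sketch and not the authors' implementation. The function name `three_layer_samples`, all parameters, the uniform distributions, the cube-shaped local region around the sensor, and the Gaussian offset along each ray are assumptions chosen for illustration; the abstract does not specify any of them. The near-surface labels are projective signed distances along the ray, which only approximate a true SDF.

```python
import numpy as np

def three_layer_samples(points, origin, scene_min, scene_max,
                        n_global=256, n_local=256, n_surface=512,
                        local_radius=20.0, surface_sigma=0.1, rng=None):
    """Hypothetical sampler: draw query points from three nested domains.

    points    : (N, 3) LiDAR returns of the current scan (world frame)
    origin    : (3,) sensor position for this scan
    scene_min : (3,) lower corner of the whole map (assumed known)
    scene_max : (3,) upper corner of the whole map (assumed known)
    """
    rng = np.random.default_rng() if rng is None else rng
    points = np.asarray(points, dtype=float)
    origin = np.asarray(origin, dtype=float)
    scene_min = np.asarray(scene_min, dtype=float)
    scene_max = np.asarray(scene_max, dtype=float)

    # Global layer: uniform samples over the full map extent.
    global_xyz = rng.uniform(scene_min, scene_max, size=(n_global, 3))

    # Local layer: uniform samples in a cube centred on the sensor.
    local_xyz = rng.uniform(origin - local_radius, origin + local_radius,
                            size=(n_local, 3))

    # Near-surface layer: pick random returns and shift them along their rays.
    # Assumes no return coincides with the sensor origin.
    idx = rng.integers(0, len(points), size=n_surface)
    hits = points[idx]
    rays = hits - origin
    dirs = rays / np.linalg.norm(rays, axis=1, keepdims=True)
    offsets = rng.normal(0.0, surface_sigma, size=(n_surface, 1))
    surface_xyz = hits + offsets * dirs

    # Projective signed distance: positive in front of the hit (sensor side),
    # negative behind it.
    surface_sdf = -offsets[:, 0]
    return global_xyz, local_xyz, surface_xyz, surface_sdf
```

In such a scheme the near-surface samples would carry direct supervision, while how the global and local samples are labelled or regularised is left open here, since the abstract does not describe it.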
