Online Learning of a Probabilistic and Adaptive Scene Representation

Constructing and maintaining a consistent scene model on the fly is the core task of online spatial perception, interpretation, and action. In this paper, we represent the scene with a Bayesian nonparametric mixture model that seamlessly describes per-point occupancy status with a continuous probability density function. Instead of following the conventional data fusion paradigm, we address the problem of learning online the process by which sequential point cloud data are generated from the scene geometry. Incremental and parallel inference is performed to update the parameter space in real time. We show experimentally that the proposed representation achieves state-of-the-art accuracy with promising efficiency. The consistent probabilistic formulation yields a generative model that adapts to different sensor characteristics, and the model complexity can be adjusted on the fly according to the scale of the data.
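
To make the idea of a mixture whose complexity grows with the data concrete, the following is a minimal sketch and not the inference procedure of this paper. It assumes a DP-means-style simplification (hard assignments and a fixed distance threshold in place of a full Bayesian nonparametric posterior), and the names `OnlineMixture`, `new_component_dist`, and `update` are hypothetical, introduced only for illustration.

```python
import math


class OnlineMixture:
    """Toy online mixture with hard assignments (DP-means style).

    Assumption: a point farther than `new_component_dist` from every
    existing component mean spawns a new component; otherwise it is
    absorbed into the nearest component via a running-mean update.
    """

    def __init__(self, new_component_dist):
        self.threshold = new_component_dist
        self.counts = []  # number of points absorbed by each component
        self.means = []   # running mean of each component

    def update(self, point):
        """Process one point and return the index of its component."""
        best, best_d = None, float("inf")
        for k, mean in enumerate(self.means):
            d = math.dist(point, mean)
            if d < best_d:
                best, best_d = k, d

        # No component is close enough: grow the model by one component.
        if best is None or best_d > self.threshold:
            self.means.append([float(x) for x in point])
            self.counts.append(1)
            return len(self.means) - 1

        # Incremental mean update: m <- m + (x - m) / n.
        self.counts[best] += 1
        n = self.counts[best]
        self.means[best] = [m + (x - m) / n
                            for m, x in zip(self.means[best], point)]
        return best

    def weights(self):
        """Mixture weights as normalized per-component counts."""
        total = sum(self.counts)
        return [c / total for c in self.counts]


# Usage: points arrive one at a time, as in a sequential point cloud stream.
model = OnlineMixture(new_component_dist=1.0)
for p in [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (5.0, 5.0, 5.0), (5.2, 5.0, 5.0)]:
    model.update(p)
```

In this toy version the number of components is driven entirely by the threshold. A Bayesian nonparametric model would instead place a prior over partitions and infer assignments probabilistically, which this sketch does not attempt.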
