OctreeNet: A Novel Sparse 3-D Convolutional Neural Network for Real-Time 3-D Outdoor Scene Analysis

Convolutional neural networks (CNNs) for 3-D data analysis require a large amount of memory and high computational power, which makes real-time applications difficult. This article proposes OctreeNet, a novel sparse 3-D CNN for analyzing the sparse 3-D laser scanning data gathered in outdoor environments. It represents a 3-D scene as a collection of shallow octrees to reduce the memory footprint of 3-D CNNs and performs point cloud classification on each individual octree. Furthermore, the smallest non-trivial and non-overlapped kernel (SNNK) implements convolution directly on the octree structure, reducing dense 3-D convolutions to matrix operations at sparse locations. The network uses a depth-first search algorithm for real-time prediction, and a conditional random field model learns global semantic relationships and refines the point cloud classification results. Two public data sets (Semantic3D.net and Oakland) with different spatial sparsity are used to evaluate classification performance in outdoor scenes. The experiments and benchmark results show that the proposed approach can be effectively used for real-time 3-D laser data analysis.

Note to Practitioners—This article was motivated by the limitations of existing deep learning technologies for analyzing 3-D laser scanning data. Such technology enables robots to infer what their surroundings are, which is closely linked to semantic mapping and navigation tasks. Previous deep neural networks have seldom been used in robotic systems because they require a large amount of memory and high computational power to apply dense 3-D operations. This article presents a sparse 3-D CNN for real-time point cloud classification that exploits the sparsity of 3-D data and requires no GPUs. The practicality of the proposed method is verified on data sets gathered from different platforms and sensors, and the proposed network can be adopted for other classification tasks with laser sensors.
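
To make the sparse-convolution idea concrete, the following Python sketch shows how a dense 3-D convolution can collapse to small matrix operations evaluated only at occupied voxel locations when a 2x2x2 non-overlapping kernel with stride 2 is used. The function name, the data layout (a dictionary keyed by integer voxel coordinates), and the kernel extent are illustrative assumptions for this sketch and do not reproduce the authors' SNNK implementation.

```python
import numpy as np

def sparse_conv3d(features, weights, bias):
    """Convolve only where voxels are occupied.

    features: dict mapping (x, y, z) integer voxel coords -> feature vector (c_in,)
    weights:  array of shape (2, 2, 2, c_in, c_out); a 2x2x2 kernel applied with
              stride 2, so each output cell reads a disjoint block of inputs
    bias:     array of shape (c_out,)
    """
    output = {}
    # Output locations are the parent cells of occupied voxels (stride 2),
    # so the kernel blocks do not overlap and empty regions are skipped.
    parents = {(x // 2, y // 2, z // 2) for (x, y, z) in features}
    for (px, py, pz) in parents:
        acc = bias.copy()
        for dx in range(2):
            for dy in range(2):
                for dz in range(2):
                    child = (2 * px + dx, 2 * py + dy, 2 * pz + dz)
                    f = features.get(child)
                    if f is not None:
                        # At each sparse location the dense 3-D convolution
                        # reduces to a small matrix-vector product.
                        acc = acc + weights[dx, dy, dz].T @ f
        output[(px, py, pz)] = acc
    return output


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c_in, c_out = 4, 8
    # A handful of occupied voxels standing in for laser returns.
    occupied = [(0, 0, 0), (1, 0, 0), (5, 3, 2)]
    feats = {v: rng.standard_normal(c_in) for v in occupied}
    w = rng.standard_normal((2, 2, 2, c_in, c_out))
    b = np.zeros(c_out)
    out = sparse_conv3d(feats, w, b)
    print(len(out), "occupied output cells")  # only non-empty parent cells are computed
```

Because computation and memory scale with the number of occupied cells rather than the full voxel grid, this style of kernel suits the highly sparse laser scans described in the abstract; the actual OctreeNet applies it directly on a shallow octree rather than on a coordinate dictionary.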
