3D object detection based on sparse convolution neural network and feature fusion for autonomous driving in smart cities

Abstract: People in cities suffer daily from traffic congestion and air pollution, partly because of the large number of private cars, and constantly face the danger of accidents. Many institutes and companies have therefore been developing autonomous driving, especially in recent years. Autonomous driving will play an important role in future smart cities, reduce the time and economic costs borne by society as a whole, and contribute to the sustainability of cities and society. A significant task for autonomous driving is to detect surrounding objects, including cars, pedestrians, and cyclists, accurately and in real time. In this paper, we propose an end-to-end three-dimensional (3D) object detection method based on voxelization, sparse convolution, and feature fusion. The proposed method takes only a point cloud as input and has two key components: small voxels and efficient feature fusion. Instead of utilizing extra networks to transform voxels, we directly average the points within each voxel and use the result as the voxel's feature representation. To enrich the features used for prediction, we design a two-step feature fusion method, called the fusion of fusion network, that combines information from multiple scales and from 3D space. We submitted our results to the official test server of the KITTI 3D detection benchmark and achieved state-of-the-art performance, especially in the Cyclist class. In addition, owing to its simple and compact architecture, our method runs at 0.05 s/frame, a 2–4 fold runtime improvement over state-of-the-art methods.
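
The voxel feature described in the abstract (averaging the points that fall inside each voxel, with no extra encoding network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `voxelize_mean`, the parameters `voxel_size` and `range_min`, and the example values are assumptions made for illustration only.

```python
import numpy as np


def voxelize_mean(points, voxel_size, range_min):
    """Average the points inside each occupied voxel (illustrative sketch).

    points:     (N, C) array; the first three columns are x, y, z.
    voxel_size: voxel edge lengths along x, y, z (hypothetical parameter).
    range_min:  lower corner of the voxel grid (hypothetical parameter).
    Returns (coords, features): integer voxel indices of shape (M, 3) and
    the per-voxel mean of all C point channels, shape (M, C).
    """
    points = np.asarray(points, dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    range_min = np.asarray(range_min, dtype=np.float64)

    # Integer voxel index of every point along x, y, z.
    idx = np.floor((points[:, :3] - range_min) / voxel_size).astype(np.int64)

    # Group points by voxel: `inverse` maps each point to its voxel row.
    coords, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Sum the point channels per voxel, then divide by the point count.
    sums = np.zeros((coords.shape[0], points.shape[1]))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=coords.shape[0])
    return coords, sums / counts[:, None]


# Three points (x, y, z, intensity); the first two share a voxel.
pts = [[0.1, 0.1, 0.1, 1.0],
       [0.3, 0.3, 0.3, 3.0],
       [1.5, 0.2, 0.2, 5.0]]
coords, feats = voxelize_mean(pts, voxel_size=(1.0, 1.0, 1.0),
                              range_min=(0.0, 0.0, 0.0))
```

Only occupied voxels are returned, which is the form a sparse convolution backbone would typically consume. With small voxels, most voxels hold very few points, so the mean loses little information compared with a learned per-voxel encoder.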
