SEG-VoxelNet for 3D Vehicle Detection from RGB and LiDAR Data

This paper proposes a SEG-VoxelNet that takes RGB images and LiDAR point clouds as inputs for accurately detecting 3D vehicles in autonomous driving scenarios, which for the first time introduces semantic segmentation technique to assist the 3D LiDAR point cloud based detection. Specifically, SEG-VoxelNet is composed of two sub-networks: an image semantic segmentation network (SEG-Net) and an improved-VoxelNet. The SEG-Net generates the semantic segmentation map which represents the probability of the category for each pixel. The improved-VoxelNet is capable of effectively fusing point cloud data with image semantic feature and generating accurate 3D bounding boxes of vehicles. Experiments on the KITTI 3D vehicle detection benchmark show that our approach outperforms the methods of state-of-the-art.

[1]  Bin Yang,et al.  SBNet: Sparse Blocks Network for Fast Inference , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Sascha Wirges,et al.  Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks , 2018, 2018 21st International Conference on Intelligent Transportation Systems (ITSC).

[3]  Yin Zhou,et al.  VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Jianxiong Xiao,et al.  Sliding Shapes for 3D Object Detection in Depth Images , 2014, ECCV.

[5]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Nassir Navab,et al.  Concurrent Spatial and Channel Squeeze & Excitation in Fully Convolutional Networks , 2018, MICCAI.

[7]  Yunhong Wang,et al.  Receptive Field Block Net for Accurate and Fast Object Detection , 2017, ECCV.

[8]  Bo Li,et al.  3D fully convolutional network for vehicle detection in point cloud , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[9]  Leonidas J. Guibas,et al.  Frustum PointNets for 3D Object Detection from RGB-D Data , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Dushyant Rao,et al.  Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[11]  Kazunori Ohno,et al.  Vehicle detection and localization on bird's eye view elevation images using convolutional neural network , 2017, 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR).

[12]  Marcelo H. Ang,et al.  A General Pipeline for 3D Detection of Vehicles , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[13]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Sanja Fidler,et al.  Monocular 3D Object Detection for Autonomous Driving , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Jana Kosecka,et al.  3D Bounding Box Estimation Using Deep Learning and Geometry , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Silvio Savarese,et al.  Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[17]  Danfei Xu,et al.  PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Bin Yang,et al.  PIXOR: Real-time 3D Object Detection from Point Clouds , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Ji Wan,et al.  Multi-view 3D Object Detection Network for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Silvio Savarese,et al.  Data-driven 3D Voxel Patterns for object category recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[23]  Steven Lake Waslander,et al.  Joint 3D Proposal Generation and Object Detection from View Aggregation , 2017, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[24]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[25]  Sanja Fidler,et al.  3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Tian Xia,et al.  Vehicle Detection from 3D Lidar Using Fully Convolutional Network , 2016, Robotics: Science and Systems.

[27]  Ingmar Posner,et al.  Voting for Voting in Online Point Cloud Object Detection , 2015, Robotics: Science and Systems.