Vehicle Detection from 3D Lidar Using Fully Convolutional Network

Convolutional network techniques have recently achieved great success in vision-based detection tasks. This paper presents our recent work on transplanting the fully convolutional network technique to detection tasks on 3D range scan data. Specifically, the scenario is vehicle detection from the range data of a Velodyne 64E lidar. We propose to present the data in a 2D point map and to use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously. With a carefully designed bounding box encoding, the network is able to predict full 3D bounding boxes even though it is a 2D convolutional network. Experiments on the KITTI dataset show the state-of-the-art performance of the proposed method.
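To make the idea of a 2D point map concrete, the following is a minimal sketch of one way to project lidar points onto a 2D grid that a 2D convolutional network could consume. The abstract does not specify the projection, so everything here is an assumption made for illustration: the function name `project_to_point_map`, the azimuth/elevation binning, the angular resolutions, the vertical field-of-view limits, and the choice of storing (range, height) per cell.

```python
import math


def project_to_point_map(points, h_res_deg=0.5, v_res_deg=0.5,
                         v_min_deg=-25.0, v_max_deg=3.0):
    """Project (x, y, z) points onto a grid indexed by elevation and azimuth.

    All parameters are illustrative placeholders, not values from the paper.
    Each cell holds (range, height) of the nearest point that falls into it,
    or None if no point does.
    """
    n_rows = int(round((v_max_deg - v_min_deg) / v_res_deg))
    n_cols = int(round(360.0 / h_res_deg))
    grid = [[None] * n_cols for _ in range(n_rows)]

    for x, y, z in points:
        dist = math.sqrt(x * x + y * y + z * z)
        if dist == 0.0:
            continue  # a point at the sensor origin has no defined angles

        azimuth = math.degrees(math.atan2(y, x))       # in [-180, 180]
        elevation = math.degrees(math.asin(z / dist))  # in [-90, 90]

        # Row 0 is the top of the assumed vertical field of view.
        row = math.floor((v_max_deg - elevation) / v_res_deg)
        col = math.floor((azimuth + 180.0) / h_res_deg) % n_cols
        if not 0 <= row < n_rows:
            continue  # outside the assumed vertical field of view

        cell = grid[row][col]
        if cell is None or dist < cell[0]:
            grid[row][col] = (dist, z)  # keep the closest return

    return grid
```

Under these assumptions, a full scan becomes a fixed-size two-channel image in which neighbouring cells correspond to neighbouring beam directions, which is what allows ordinary 2D convolutions to be applied to range data.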
