Connecting the dots for real-time LiDAR-based object detection with YOLO

In this paper we introduce a generic method for people and vehicle detection using only LiDAR data, leveraging a pre-trained Convolutional Neural Network (CNN) from the RGB domain. Machine learning algorithms typically face an inherent trade-off between the amount of training data available and the need for engineered features. Current state-of-the-art object detection and classification methods rely heavily on deep CNNs trained on enormous RGB image datasets. To take advantage of this learned knowledge, we propose to fine-tune the You Only Look Once (YOLO) network, transferring its understanding of object shapes to upsampled LiDAR images. Our method converts the 3D point cloud of a LiDAR scan into a dense depth/intensity map that highlights object contours. The proposed method is hardware agnostic and can therefore be used with any LiDAR data, regardless of the number of channels or beams. Overall, the pipeline exploits the notable similarity between upsampled LiDAR images and RGB images, obviating the need to train a deep CNN from scratch. This transfer learning makes our method data efficient while avoiding heavily engineered features. Evaluation results show that our LiDAR-only detection model performs on par with its RGB-only counterpart.
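The point-cloud-to-image step described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the field-of-view limits, angular resolutions, and the naive neighbour-propagation used for densification are all assumptions chosen for brevity (the function name `lidar_to_dense_map` is likewise hypothetical).

```python
import numpy as np

def lidar_to_dense_map(points, h_res=0.2, v_res=0.4,
                       h_fov=(-45.0, 45.0), v_fov=(-15.0, 5.0)):
    """Project an N x 4 point cloud (x, y, z, intensity) onto a
    front-view spherical depth image, then densify it.

    All resolutions and FOV limits are illustrative assumptions."""
    x, y, z, _intensity = points.T
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.degrees(np.arctan2(y, x))
    elevation = np.degrees(np.arcsin(z / np.maximum(r, 1e-6)))

    height = int((v_fov[1] - v_fov[0]) / v_res)
    width = int((h_fov[1] - h_fov[0]) / h_res)

    # Keep only points inside the chosen field of view.
    mask = ((azimuth >= h_fov[0]) & (azimuth < h_fov[1]) &
            (elevation >= v_fov[0]) & (elevation < v_fov[1]))
    cols = np.clip(((azimuth[mask] - h_fov[0]) / h_res).astype(int),
                   0, width - 1)
    rows = np.clip(((v_fov[1] - elevation[mask]) / v_res).astype(int),
                   0, height - 1)

    depth_map = np.zeros((height, width), dtype=np.float32)
    depth_map[rows, cols] = r[mask]  # sparse depth image

    # Naive densification: propagate each valid pixel to empty
    # 4-neighbours a few times (a crude stand-in for the paper's
    # upsampling; np.roll wraps at image borders, which a real
    # implementation would handle explicitly).
    for _ in range(3):
        filled = depth_map > 0
        shifted = [np.roll(depth_map, s, axis=a)
                   for a in (0, 1) for s in (1, -1)]
        for cand in shifted:
            fillable = (~filled) & (cand > 0)
            depth_map[fillable] = cand[fillable]
            filled |= fillable
    return depth_map
```

The same projection can carry the intensity channel instead of range, and the resulting 2D map is what gets fed to the fine-tuned YOLO network in place of an RGB image.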
