A Novel Multi-Sensor Fusion Based Object Detection and Recognition Algorithm for Intelligent Assisted Driving

Object detection and recognition algorithms that fuse millimeter-wave radar and high-definition video data can effectively improve the safety of intelligent-driving vehicles. However, because radar and video are different data modalities, the key problem is how to fuse the two effectively. Existing data fusion methods face two main difficulties: data alignment and coordinate transformation adapt poorly to image distortion, and the data to be fused sit at mismatched information levels. To address these problems, this paper proposes a decision-level fusion method for millimeter-wave radar and high-definition video data based on angular alignment. Specifically, through joint calibration and approximate interpolation, the data are projected into a polar coordinate system so that the radar and the camera are angularly aligned in the horizontal direction. Objects are then detected in the video data by a deep neural network model and combined with the objects detected by the radar to make a joint decision, which completes the object detection and recognition task on the fused data. Theoretical analysis and experimental results indicate that the fusion-based algorithm is more accurate than detection and recognition algorithms that use millimeter-wave radar data or video data alone.
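The angular alignment and joint decision described above can be illustrated with a minimal sketch. It is not the paper's implementation: the calibration table, the class and function names, the matching margin, and the decision rule are all assumptions made for illustration. The sketch maps image columns to horizontal angles by piecewise-linear interpolation of a calibration table, converts radar targets to polar coordinates, and pairs detections whose angular spans contain a radar target.

```python
from bisect import bisect_right
from dataclasses import dataclass
from math import atan2, degrees, hypot

# Hypothetical joint-calibration table: image column (px) -> horizontal angle (deg).
# Interpolating measured pairs avoids relying on an explicit distortion model.
CALIB_COLUMNS = [0.0, 480.0, 960.0, 1440.0, 1920.0]
CALIB_ANGLES = [-30.0, -14.0, 0.0, 14.0, 30.0]


@dataclass
class RadarTarget:
    x: float  # lateral offset in metres, positive to the right
    y: float  # forward distance in metres


@dataclass
class CameraDetection:
    label: str      # class predicted by the video detector
    score: float    # detector confidence in [0, 1]
    u_left: float   # left edge of the bounding box, in pixels
    u_right: float  # right edge of the bounding box, in pixels


def column_to_azimuth(u: float) -> float:
    """Approximate the horizontal angle of an image column by linear interpolation."""
    u = min(max(u, CALIB_COLUMNS[0]), CALIB_COLUMNS[-1])
    i = min(bisect_right(CALIB_COLUMNS, u), len(CALIB_COLUMNS) - 1)
    u0, u1 = CALIB_COLUMNS[i - 1], CALIB_COLUMNS[i]
    a0, a1 = CALIB_ANGLES[i - 1], CALIB_ANGLES[i]
    return a0 + (a1 - a0) * (u - u0) / (u1 - u0)


def radar_to_polar(target: RadarTarget) -> tuple[float, float]:
    """Return (range in metres, azimuth in degrees) with 0 degrees straight ahead."""
    return hypot(target.x, target.y), degrees(atan2(target.x, target.y))


def fuse(radar_targets, detections, margin_deg=1.0, score_thresh=0.5):
    """Assumed decision rule: pair each detection with the nearest unmatched
    radar target inside its angular span; keep unpaired results from either
    sensor with their source marked."""
    fused, matched = [], set()
    for det in detections:
        lo = column_to_azimuth(det.u_left) - margin_deg
        hi = column_to_azimuth(det.u_right) + margin_deg
        best = None
        for idx, target in enumerate(radar_targets):
            rng, az = radar_to_polar(target)
            if idx in matched or not lo <= az <= hi:
                continue
            if best is None or rng < best[1]:
                best = (idx, rng, az)
        if best is not None:
            matched.add(best[0])
            fused.append({"label": det.label, "range_m": best[1],
                          "azimuth_deg": best[2], "source": "radar+camera"})
        elif det.score >= score_thresh:
            centre = column_to_azimuth((det.u_left + det.u_right) / 2)
            fused.append({"label": det.label, "range_m": None,
                          "azimuth_deg": centre, "source": "camera"})
    for idx, target in enumerate(radar_targets):
        if idx not in matched:
            rng, az = radar_to_polar(target)
            fused.append({"label": "unknown", "range_m": rng,
                          "azimuth_deg": az, "source": "radar"})
    return fused


# Example: one radar target straight ahead, one far to the right, one camera box.
radar = [RadarTarget(0.0, 20.0), RadarTarget(10.0, 20.0)]
boxes = [CameraDetection("car", 0.9, 900.0, 1020.0)]
results = fuse(radar, boxes)
```

In this example the camera box spans the angle of the first radar target, so the two are merged into one object that carries the camera's class label and the radar's range, while the second radar target is kept as a radar-only object. Matching on horizontal angle alone keeps the fusion at the decision level: neither sensor's raw data has to be projected into the other's coordinate frame.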
