Radar and Camera Early Fusion for Vehicle Detection in Advanced Driver Assistance Systems

The perception module is at the heart of modern Advanced Driver Assistance Systems (ADAS). To improve the quality and robustness of this module, especially in the presence of environmental noise such as varying lighting and weather conditions, sensor fusion (mainly of camera and LiDAR) has been the center of attention in recent studies. In this paper, we focus on a relatively unexplored area: the early fusion of camera and radar sensors. We feed a minimally processed radar signal to our deep learning architecture along with its corresponding camera frame to enhance the accuracy and robustness of our perception module. Our evaluation, performed on real-world data, suggests that the complementary nature of radar and camera signals can be leveraged to reduce the lateral error by 15% when applied to object detection.
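To make the early-fusion idea concrete, the sketch below shows one common way it can be realized: a radar map that has already been aligned to the camera frame is concatenated with the RGB channels at the network input, before any feature extraction. This is a minimal illustrative sketch, not the paper's exact architecture; the class name, layer sizes, and the assumption of a single-channel, image-aligned radar map are all hypothetical, and the projection/calibration step is assumed to happen upstream.

```python
# Minimal sketch of camera-radar early fusion (assumed design, not the
# paper's exact architecture): the radar map is concatenated with the RGB
# image as extra input channels before a shared convolutional backbone.
import torch
import torch.nn as nn


class EarlyFusionBackbone(nn.Module):  # hypothetical name for illustration
    def __init__(self, radar_channels: int = 1):
        super().__init__()
        # 3 RGB channels + radar channels fused at the very first layer.
        self.stem = nn.Conv2d(3 + radar_channels, 64,
                              kernel_size=7, stride=2, padding=3)
        self.body = nn.Sequential(
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # A detection head (e.g., SSD- or Faster R-CNN-style) would follow;
        # it is omitted here to keep the fusion step in focus.

    def forward(self, image: torch.Tensor, radar_map: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); radar_map: (B, C_r, H, W), assumed already
        # projected into the camera frame so the tensors align pixelwise.
        fused = torch.cat([image, radar_map], dim=1)  # fuse raw inputs
        return self.body(self.stem(fused))


# Usage with dummy tensors:
model = EarlyFusionBackbone(radar_channels=1)
img = torch.randn(2, 3, 256, 256)
radar = torch.randn(2, 1, 256, 256)
features = model(img, radar)  # shape: (2, 128, 64, 64)
```

The design choice that makes this "early" fusion is that the two modalities share every layer of the network, letting the backbone learn joint radar-camera features from the raw signals rather than merging independently computed features later.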
