MS-YOLO: Object Detection Based on YOLOv5 Optimized Fusion Millimeter-Wave Radar and Machine Vision

Millimeter-wave radar and machine vision are both important means by which intelligent vehicles perceive their surroundings. To address the problem of multi-sensor fusion, this paper proposes an object detection method that fuses millimeter-wave radar and vision. Radar and camera complement each other: fusing radar data into a machine vision network can effectively reduce the missed-detection rate under insufficient light and improve the accuracy of detecting small, distant objects. The radar information is processed by a mapping transformation neural network to obtain a mask map, so that the radar information and the visual information are at the same scale. On this basis, a multi-data-source deep learning object detection network (MS-YOLO) based on millimeter-wave radar and vision fusion is proposed. A self-built dataset is used for training and testing. The approach makes full use of the sensor information and improves detection accuracy while preserving detection speed. Compared with the original YOLOv5 (the fifth version of You Only Look Once) network, the results show that the MS-YOLO network better meets the accuracy requirements. Among the models, the large MS-YOLO model has the highest accuracy, with an mAP of 0.888. The small MS-YOLO model balances accuracy and speed, reaching an mAP of 0.841 while maintaining a high frame rate of 65 fps.
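To make the idea of bringing radar and visual information to the same scale concrete, the following is a minimal sketch and not the method of this paper. It assumes the radar detections have already been projected into pixel coordinates, and it replaces the mapping transformation neural network with a simple hand-written rasterization. The function names `radar_to_mask` and `fuse_channels`, the `radius` parameter, and the choice of stacking the mask as a fourth input channel are all hypothetical and used only for illustration.

```python
import numpy as np


def radar_to_mask(points_uv, image_shape, radius=2):
    """Rasterize radar detections (assumed to be in pixel coordinates)
    into a binary mask with the same height and width as the image."""
    height, width = image_shape
    mask = np.zeros((height, width), dtype=np.float32)
    for u, v in points_uv:
        col, row = int(round(u)), int(round(v))
        if not (0 <= row < height and 0 <= col < width):
            continue  # ignore detections that fall outside the frame
        # Mark a small square around the detection, clipped to the image.
        r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
        c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
        mask[r0:r1, c0:c1] = 1.0
    return mask


def fuse_channels(image_rgb, mask):
    """Stack the radar mask onto the normalized RGB image as a 4th channel
    (an assumed fusion strategy, chosen here only for simplicity)."""
    rgb = image_rgb.astype(np.float32) / 255.0
    return np.concatenate([rgb, mask[..., None]], axis=-1)


# Hypothetical example: a blank 640x480 frame and three radar detections,
# one of which lies outside the image.
image = np.zeros((480, 640, 3), dtype=np.uint8)
radar_uv = [(320.0, 240.0), (10.0, 5.0), (700.0, 100.0)]
mask = radar_to_mask(radar_uv, image.shape[:2])
fused = fuse_channels(image, mask)
```

Because the mask shares the spatial dimensions of the image, the fused array can be fed to a detector whose first convolution accepts four input channels. A learned mapping, as described in the abstract, would take the place of `radar_to_mask` in such a pipeline.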