A temporal-based deep learning method for multiple objects detection in autonomous driving

This paper proposes a novel vision-based object detection method for autonomous driving that introduces temporal information into deep learning-based detection of moving objects. Vision-based object detection is a critical technology for autonomous driving. Real-world objects such as moving cars do not change their positions and velocities abruptly, so the position of an object changes only slightly between two consecutive frames. Traditional approaches usually ignore this and apply still-image object detection methods to moving objects. Considering the relationship among consecutive frames (temporal information), we present a robust, real-time tracking method that follows image detection and refines the object detection results. Based on three key attributes (distances, sizes and positions), the tracking method builds the association between the objects detected in the current frame and those in previous frames. Adding temporal information in this way dramatically improves the performance of existing object detection algorithms based on still images. With the proposed method, we won first place in the preceding vehicle detection task of the 2017 Intelligent Vehicle Future Challenge (2017 IVFC; see http://mp.weixin.qq.com/s/IDrTDlJqb2Qx360nhgCXDw).
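The association step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function or the matching procedure, so the `Box` fields, the equal weighting of the three attributes, the `max_cost` threshold and the greedy cheapest-pair-first matching are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """One detected object (hypothetical layout, not from the paper)."""
    cx: float    # box center x, pixels
    cy: float    # box center y, pixels
    w: float     # box width, pixels
    h: float     # box height, pixels
    dist: float  # estimated range to the object, meters


def association_cost(prev: Box, curr: Box) -> float:
    """Assumed cost: relative change in position, size and distance.

    Each term is normalized by the previous value so that large, near
    objects and small, far objects are treated comparably.
    """
    scale = max(hypot(prev.w, prev.h), 1e-6)
    position_change = hypot(curr.cx - prev.cx, curr.cy - prev.cy) / scale
    size_change = (abs(curr.w - prev.w) + abs(curr.h - prev.h)) / scale
    distance_change = abs(curr.dist - prev.dist) / max(prev.dist, 1e-6)
    return position_change + size_change + distance_change


def associate(tracks, detections, max_cost=1.0):
    """Greedily match previous-frame tracks to current-frame detections.

    Returns (matches, unmatched_tracks, unmatched_detections), where
    matches maps a track index to a detection index.
    """
    # Enumerate every candidate pair and visit the cheapest first.
    pairs = sorted(
        (association_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    matches = {}
    used_detections = set()
    for cost, ti, di in pairs:
        if cost > max_cost:
            break  # remaining pairs are even more expensive
        if ti in matches or di in used_detections:
            continue  # each track and detection is matched at most once
        matches[ti] = di
        used_detections.add(di)
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matches]
    unmatched_detections = [
        i for i in range(len(detections)) if i not in used_detections
    ]
    return matches, unmatched_tracks, unmatched_detections
```

In a pipeline of this kind, an unmatched track could indicate a detection missed by the still-image detector, and an unmatched detection could be a new object or a false positive; how the paper handles those cases is not stated in the abstract.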
