Accurate, Low-Latency Visual Perception for Autonomous Racing: Challenges, Mechanisms, and Practical Solutions

Autonomous racing provides the opportunity to test safety-critical perception pipelines at their limit. This paper describes the practical challenges of, and solutions to, applying state-of-the-art computer vision algorithms to build a low-latency, high-accuracy perception system for DUT18 Driverless (DUT18D), a 4WD electric race car with podium finishes at all Formula Driverless competitions in which it raced. The key components of DUT18D include YOLOv3-based object detection, pose estimation, and time synchronization on its dual stereovision/monovision camera setup. We highlight the modifications required to adapt perception CNNs to the racing domain, improvements to the loss functions used for pose estimation, and methodologies for sub-microsecond camera synchronization, among other improvements. We perform a thorough experimental evaluation of the system, demonstrating its accuracy and low latency in real-world racing scenarios.
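As a minimal illustration of the stereovision half of such a setup (a generic pinhole-stereo sketch, not the paper's implementation; the focal length and baseline values below are hypothetical), the distance to a detected cone follows from the horizontal disparity of a matched keypoint between the left and right images:

```python
def stereo_depth(u_left: float, u_right: float,
                 focal_px: float, baseline_m: float) -> float:
    """Depth in meters from the horizontal pixel disparity of the same
    cone keypoint observed in a rectified left/right stereo pair:
        Z = f * B / (u_left - u_right)
    """
    disparity = u_left - u_right
    if disparity <= 0:
        # A point in front of a rectified stereo rig must have positive disparity.
        raise ValueError("non-positive disparity: point not in front of the rig")
    return focal_px * baseline_m / disparity


# Illustrative numbers only: a 1000 px focal length, 0.2 m baseline,
# and 10 px disparity place the cone 20 m ahead.
z = stereo_depth(u_left=400.0, u_right=390.0, focal_px=1000.0, baseline_m=0.2)
```

Because depth error grows quadratically with distance for a fixed disparity uncertainty, stereo rigs of this kind are typically trusted only out to a range set by the baseline and pixel pitch, which is one motivation for pairing stereo with a longer-range monocular pipeline.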
