Evaluating Validity of Synthetic Data in Perception Tasks for Autonomous Vehicles

Autonomous vehicles have the potential to transform the way we travel today; however, deploying them safely at scale is not an easy task. Any autonomous driving system relies on multiple layers of software to function safely. Among these layers, the Perception layer is the most data-intensive and the most complex to get right. Companies must collect and annotate large volumes of data to train deep learning perception models properly. Simulation systems have emerged as an alternative to the expensive task of data collection and annotation. However, whether simulated data can serve as a proxy for real-world data remains an open question. In this work, we address the question of whether models trained on simulated data can generalize well to the real world. We collect datasets from two different simulators with varying levels of graphics fidelity and use the KITTI dataset as an example of real-world data. We train three separate deep learning based object detection models on each of these datasets and compare their performance on test sets collected from the same sources. We also add the recently released Waymo Open Dataset as a challenging test set. Performance is evaluated using the mean average precision (mAP) metric for object detection. We find that training on simulated data generally does not translate into generalizability on real-world data, and that diversity in the training set is much more important than visual graphics fidelity.
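As a rough illustration of the mAP evaluation mentioned above, the sketch below computes per-class average precision from confidence-ranked detections and averages it over classes. The function names, the assumption that detections have already been matched to ground truth at a fixed IoU threshold, and the simple rectangle integration of the precision-recall curve are illustrative assumptions, not the paper's evaluation code; benchmark-specific variants (e.g. KITTI's interpolated precision) differ in the interpolation step.

```python
# Minimal sketch of an mAP computation (illustrative only; assumes detections
# have already been matched to ground truth at a fixed IoU threshold,
# yielding a true-positive flag per detection).
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one object class."""
    order = np.argsort(-np.asarray(scores))              # rank detections by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    recall = cum_tp / max(num_ground_truth, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-12)
    # Simple rectangle integration over recall; benchmarks typically use
    # an interpolated precision envelope instead.
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

def mean_average_precision(per_class_results):
    """per_class_results: dict mapping class -> (scores, tp_flags, n_ground_truth)."""
    aps = [average_precision(*v) for v in per_class_results.values()]
    return float(np.mean(aps)) if aps else 0.0

# Toy example: one class, three detections, two ground-truth boxes.
print(mean_average_precision({
    "car": ([0.9, 0.8, 0.3], [1, 0, 1], 2),
}))
```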
