Collecting and Processing a Self-Driving Dataset in the UPB Campus

Although there is a diversity of publicly available datasets for autonomous driving, ranging from small-scale collections to those spanning thousands of miles of driving, we consider that the process of collecting and processing them is often overlooked in the literature. From a data-driven perspective, the quality of a dataset has proven to be as important as its quantity, especially when evaluating self-driving technologies, where safety is crucial. In this paper, we provide a guideline covering all the steps from configuring the hardware setup to obtaining a clean dataset. We describe the design of the data collection scenario, the hardware and software employed in the process, the challenges that must be considered, and the data filtering and validation stages. This work stems from our experience in collecting the UPB campus driving dataset, which is released together with this paper. It is our belief that a clean and efficient process for collecting a small but meaningful dataset has the potential to improve the benchmarking of autonomous driving solutions while capturing the particularities of the local environment.
