When to use what data set for your self-driving car algorithm: An overview of publicly available driving datasets

Data collection on public roads is a valuable part of developing self-driving vehicles. A data-collection vehicle is typically equipped with a variety of sensors such as cameras, LiDAR, radar, GPS, and an IMU, and the raw data from all sensors are logged to disk while the vehicle is driven manually. The logged data can subsequently be used for training and testing autonomous-driving algorithms, e.g., vehicle/pedestrian detection and tracking, SLAM, and motion estimation. Because data collection is time-consuming, it can sometimes be avoided by reusing existing datasets of sensor data collected by other researchers. A multitude of openly available datasets have been released to foster research on automated driving. These datasets vary considerably in traffic conditions, application focus, sensor setup, data format, size, tool support, and many other aspects. This paper presents an overview of 27 publicly available datasets containing data collected on public roads, compares them from different perspectives, and provides guidelines for selecting the most suitable dataset for a given purpose.
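
To make the dataset-selection idea concrete, the following minimal Python sketch filters a hypothetical catalog of dataset metadata by required sensor modalities, target task, and minimum size. The catalog entries, field names, and size values are illustrative assumptions for this sketch, not figures taken from the paper.

from dataclasses import dataclass

@dataclass
class DatasetEntry:
    name: str
    sensors: set       # e.g. {"camera", "lidar", "gps", "imu"}
    tasks: set         # e.g. {"detection", "tracking", "slam"}
    size_hours: float  # illustrative placeholder value, not from the paper

# Small illustrative catalog; a real one would cover all 27 surveyed datasets.
CATALOG = [
    DatasetEntry("KITTI", {"camera", "lidar", "gps", "imu"},
                 {"detection", "tracking", "slam"}, 6.0),
    DatasetEntry("Cityscapes", {"camera", "gps"},
                 {"segmentation"}, 2.0),
    DatasetEntry("Oxford RobotCar", {"camera", "lidar", "gps", "imu"},
                 {"slam", "localization"}, 100.0),
]

def select_datasets(required_sensors, required_task, min_hours=0.0):
    """Return catalog entries that offer the required sensors, task, and size."""
    return [
        d for d in CATALOG
        if required_sensors <= d.sensors   # sensor setup covers the requirement
        and required_task in d.tasks       # application focus matches
        and d.size_hours >= min_hours      # dataset is large enough
    ]

if __name__ == "__main__":
    # Example: candidate datasets for LiDAR-supported SLAM experiments.
    for entry in select_datasets({"camera", "lidar"}, "slam"):
        print(entry.name)

In practice, such a filter would sit on top of a metadata table like the comparison compiled in the paper.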
