The ParallelEye Dataset: A Large Collection of Virtual Images for Traffic Vision Research

Datasets play an essential role in the training and testing of traffic vision algorithms. However, collecting and annotating images from the real world is time-consuming, labor-intensive, and error-prone. Therefore, more and more researchers have begun to explore virtual datasets to overcome these disadvantages of real data. In this paper, we propose a systematic method for constructing large-scale artificial scenes and collect a new virtual dataset, named “ParallelEye”, for traffic vision research. The Unity3D rendering engine is used to simulate environmental changes in the artificial scenes and to generate ground-truth labels automatically, including semantic/instance segmentation, object bounding boxes, and so on. In addition, we utilize ParallelEye in combination with real datasets to conduct experiments. The experimental results show that including virtual data enhances the per-class accuracy in object detection and semantic segmentation. They also illustrate that virtual data with controllable imaging conditions can be used to design evaluation experiments flexibly.
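As a concrete illustration of how automatic ground-truth labeling from rendered scenes can work, the short Python sketch below derives per-object bounding boxes from a per-pixel instance-ID mask, the kind of buffer a renderer can export for every frame. This is a minimal sketch under our own assumptions, not the paper's actual pipeline; the function name, the choice of 0 as the background ID, and the toy mask are illustrative only.

import numpy as np

def boxes_from_instance_mask(instance_mask, ignore_ids=(0,)):
    """Derive axis-aligned bounding boxes from a per-pixel instance-ID mask.

    instance_mask: 2D integer array where each pixel stores the ID of the
    object instance rendered at that pixel (0 is treated as background here).
    Returns a dict mapping instance ID -> (x_min, y_min, x_max, y_max).
    """
    boxes = {}
    for inst_id in np.unique(instance_mask):
        if inst_id in ignore_ids:
            continue
        # Rows are y coordinates, columns are x coordinates.
        ys, xs = np.nonzero(instance_mask == inst_id)
        boxes[int(inst_id)] = (int(xs.min()), int(ys.min()),
                               int(xs.max()), int(ys.max()))
    return boxes

# Toy usage: a 4x6 mask containing two instances (IDs 1 and 2).
mask = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])
print(boxes_from_instance_mask(mask))
# {1: (1, 0, 2, 1), 2: (4, 1, 5, 2)}

Because the instance IDs come directly from the renderer, no manual annotation is involved: the same per-pixel buffer yields instance segmentation for free, and collapsing IDs to their object classes yields semantic segmentation.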
