Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing

We introduce Synscapes -- a synthetic dataset for street scene parsing created using photorealistic rendering techniques, and show state-of-the-art results for training and validation as well as new types of analysis. We study the behavior of networks trained on real data when performing inference on synthetic data: a key factor in determining the equivalence of simulation environments. We also compare the behavior of networks trained on synthetic data and evaluated on real-world data. Additionally, by analyzing pre-trained, existing segmentation and detection models, we illustrate how uncorrelated images along with a detailed set of annotations open up new avenues for analysis of computer vision systems, providing fine-grain information about how a model's performance changes according to factors such as distance, occlusion and relative object orientation.

[1]  Vladlen Koltun,et al.  Playing for Benchmarks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[2]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[3]  Roberto Cipolla,et al.  MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving , 2016, 2018 IEEE Intelligent Vehicles Symposium (IV).

[4]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  James T. Kajiya,et al.  The rendering equation , 1998 .

[6]  Qiao Wang,et al.  VirtualWorlds as Proxy for Multi-object Tracking Analysis , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Vladlen Koltun,et al.  Playing for Data: Ground Truth from Computer Games , 2016, ECCV.

[10]  Antonio M. López,et al.  The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Magnus Wrenninge,et al.  Procedural Modeling and Physically Based Rendering for Synthetic Data Generation in Automotive Applications , 2017, ArXiv.

[12]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[13]  Zhenyi Liu,et al.  Optimizing Image Acquisition Systems for Autonomous Driving , 2018, Photography, Mobile, and Immersive Imaging.

[14]  Bastian Leibe,et al.  Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).