UnrealStereo: A Synthetic Dataset for Analyzing Stereo Vision

Stereo algorithms are important for robotics applications such as quadcopters and autonomous driving. They need to be robust enough to handle images captured under challenging conditions, such as rain or strong lighting. Textureless and specular regions in these images make feature matching difficult and violate the smoothness assumption. It is therefore important to understand whether an algorithm is robust to these hazardous regions. Many stereo benchmarks have been developed to evaluate performance and track progress, but it is not easy to quantify the effect of hazardous regions with them. In this paper, we develop a synthetic image generation tool and build a benchmark from synthetic images. First, we manually adjust hazardous factors in a virtual world, for example by making objects more specular or transparent, to simulate corner cases that test the robustness of stereo algorithms. Second, we use ground-truth information, such as object masks and material properties, to automatically identify hazardous regions and evaluate accuracy within them. Our tool is built on the popular game engine Unreal Engine 4 and will be open source. It can use many publicly available, realistic game assets, which provide an enormous resource for algorithm development and evaluation.
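
The region-wise evaluation described above can be illustrated with a minimal sketch. It is not the tool's actual implementation: the function names, the per-pixel material-ID map, and the 3-pixel error threshold are all assumptions made for illustration. The sketch builds a mask of hazardous pixels from ground-truth material labels and then reports the fraction of pixels in that mask whose disparity error exceeds the threshold.

```python
def region_mask(material_ids, hazardous_ids):
    """Mark pixels whose ground-truth material ID is in hazardous_ids.

    material_ids is a hypothetical per-pixel label map (list of rows).
    """
    return [[mid in hazardous_ids for mid in row] for row in material_ids]


def bad_pixel_rate(pred, gt, mask, threshold=3.0):
    """Fraction of masked pixels with |pred - gt| disparity error > threshold.

    The 3-pixel default is an assumed convention, not taken from the paper.
    Returns None when the mask selects no pixels.
    """
    total = 0
    bad = 0
    for pred_row, gt_row, mask_row in zip(pred, gt, mask):
        for p, g, m in zip(pred_row, gt_row, mask_row):
            if not m:
                continue
            total += 1
            if abs(p - g) > threshold:
                bad += 1
    return bad / total if total else None


# Toy 2x3 example: material ID 1 stands in for a specular surface.
gt = [[10.0, 10.0, 20.0],
      [10.0, 20.0, 20.0]]
pred = [[10.5, 11.0, 27.0],
        [9.0, 20.5, 12.0]]
material_ids = [[0, 0, 1],
                [0, 1, 1]]

specular = region_mask(material_ids, {1})
everything = [[True] * 3 for _ in range(2)]

specular_rate = bad_pixel_rate(pred, gt, specular)
overall_rate = bad_pixel_rate(pred, gt, everything)
```

In this toy input, two of the three specular pixels are wrong by more than 3 pixels, while only two of all six pixels are, so the per-region number exposes a weakness that the overall score dilutes.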
