Paper information - Training and testing object detectors with virtual images

Training and testing object detectors with virtual images

In computer vision, deep learning has produced a variety of state-of-the-art models that rely on massive amounts of labeled data. However, collecting and annotating real-world images demands too much labor and money, and it is usually inflexible for building datasets with specific characteristics, such as small object areas and high occlusion levels. Under the framework of Parallel Vision, this paper presents a purposeful way to design artificial scenes and automatically generate virtual images with precise annotations. A virtual dataset named ParallelEye is built, which can be used for several computer vision tasks. Then, by training the DPM (Deformable Parts Model) and Faster R-CNN detectors, we demonstrate that model performance can be significantly improved by combining ParallelEye with publicly available real-world datasets during the training phase. In addition, we investigate the potential of testing the trained models on a specific aspect using intentionally designed virtual datasets, in order to discover the flaws of the trained models. From the experimental results, we conclude that our virtual dataset is viable for training and testing object detectors.
