Paper information - ResearchDoom and CocoDoom: Learning Computer Vision with Games

In this short note we introduce ResearchDoom, an implementation of the Doom first-person shooter that can extract detailed metadata from the game. We also introduce the CocoDoom dataset, a collection of pre-recorded data extracted from Doom gaming sessions along with annotations in the MS Coco format. ResearchDoom and CocoDoom can be used to train and evaluate a variety of computer vision methods such as object recognition, detection and segmentation at the level of instances and categories, tracking, ego-motion estimation, monocular depth estimation and scene segmentation. The code and data are available at this http URL
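Because the annotations follow the MS COCO layout, they can in principle be read with ordinary JSON tooling. The snippet below is a minimal sketch of that idea, not code from ResearchDoom or CocoDoom: the sample record, the file name `frame_000001.png`, the category name `monster`, and the helper `index_annotations` are all hypothetical and chosen only for illustration. It relies on the standard COCO convention of top-level `images`, `annotations`, and `categories` lists, with boxes stored as `[x, y, width, height]`.

```python
from collections import defaultdict

# Hand-written sample in the MS COCO layout. All values are made up for
# illustration and are NOT taken from the CocoDoom dataset.
sample = {
    "images": [
        {"id": 1, "file_name": "frame_000001.png", "width": 320, "height": 200},
    ],
    "categories": [
        {"id": 7, "name": "monster"},
    ],
    "annotations": [
        {
            "id": 100,
            "image_id": 1,
            "category_id": 7,
            "bbox": [10, 20, 30, 50],  # COCO convention: [x, y, width, height]
            "area": 1500,
            "iscrowd": 0,
        },
    ],
}


def index_annotations(coco):
    """Group annotations by image id, resolving category ids to names.

    Boxes are converted from COCO's [x, y, w, h] to (x1, y1, x2, y2).
    """
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]
        per_image[ann["image_id"]].append(
            {"category": names[ann["category_id"]], "box_xyxy": (x, y, x + w, y + h)}
        )
    return dict(per_image)


boxes = index_annotations(sample)
print(boxes[1])
```

For a real annotation file, the same function could be applied to the result of `json.load` on that file, assuming it follows the same layout.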

Andrea Vedaldi | Hakan Bilen | João F. Henriques | Aravindh Mahendran
