Sim4CV: A Photo-Realistic Simulator for Computer Vision Applications

We present a photo-realistic training and evaluation simulator (Sim4CV, http://www.sim4cv.org) with extensive applications across various fields of computer vision. Built on top of the Unreal Engine, the simulator integrates full-featured, physics-based cars, unmanned aerial vehicles (UAVs), and animated human actors in diverse urban and suburban 3D environments. We demonstrate the versatility of the simulator with two case studies: autonomous UAV-based tracking of moving objects and autonomous driving using supervised learning. The simulator fully integrates several state-of-the-art tracking algorithms with a benchmark evaluation tool, as well as a deep neural network architecture for training vehicles to drive autonomously. It generates synthetic photo-realistic datasets with automatic ground-truth annotations that can easily extend existing real-world datasets, and it provides extensive synthetic data variety through its ability to reconfigure synthetic worlds on the fly using an automatic world generation tool.
