BIRDSAI: A Dataset for Detection and Tracking in Aerial Thermal Infrared Videos

Monitoring protected areas to curb illegal activities like poaching and animal trafficking is a monumental task. To augment existing manual patrolling efforts, unmanned aerial surveillance using visible and thermal infrared (TIR) cameras is increasingly being adopted. Automated data acquisition has become easier with advances in unmanned aerial vehicles (UAVs) and sensors like TIR cameras, which allow surveillance at night when poaching typically occurs. However, processing the large amounts of resulting TIR data accurately and quickly remains a challenge. In this paper, we present the first large dataset collected using a TIR camera mounted on a fixed-wing UAV in multiple African protected areas. This dataset includes TIR videos of humans and animals with several challenging scenarios such as scale variations, background clutter due to thermal reflections, large camera rotations, and motion blur. Additionally, we provide another dataset of videos synthetically generated with the publicly available Microsoft AirSim simulation platform using a 3D model of an African savanna and a TIR camera model. Through our benchmarking experiments on state-of-the-art detectors, we demonstrate that leveraging the synthetic data in a domain adaptive setting can significantly improve detection performance. We also evaluate various recent approaches for single and multi-object tracking. With the increasing popularity of aerial imagery for monitoring and surveillance purposes, we anticipate that this unique dataset will be used to develop and evaluate techniques for object detection, tracking, and domain adaptation for aerial TIR videos.
