Video Labeling for Automatic Video Surveillance in Security Domains

Beyond traditional security methods, unmanned aerial vehicles (UAVs) have become an important surveillance tool used in security domains to collect the required annotated data. However, collecting annotated data from videos taken by UAVs efficiently, and using these data to build datasets that can be used for learning payoffs or adversary behaviors in game-theoretic approaches and security applications, is an under-explored research question. This paper presents VIOLA, a novel labeling application that includes (i) a workload distribution framework to efficiently gather human labels from videos in a secured manner; (ii) a software interface with features designed for labeling videos taken by UAVs in the domain of wildlife security. We also present the evolution of VIOLA and analyze how the changes made in the development process relate to the efficiency of labeling, including when seemingly obvious improvements did not lead to increased efficiency. VIOLA enables collecting massive amounts of data with detailed information from challenging security videos such as those collected aboard UAVs for wildlife security. VIOLA will lead to the development of new approaches that integrate deep learning for real-time detection and response.

[1]  Bo An,et al.  Deploying PAWS: Field Optimization of the Protection Assistant for Wildlife Security , 2016, AAAI.

[2]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Rong Yan,et al.  Automatically labeling video data using multi-class active learning , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[4]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[5]  Kuk-Jin Yoon,et al.  Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Milind Tambe,et al.  Cloudy with a Chance of Poaching: Adversary Behavior Modeling and Forecasting with Real-World Poaching Data , 2017, AAMAS.

[7]  Milind Tambe,et al.  Security and Game Theory - Algorithms, Deployed Systems, Lessons Learned , 2011 .

[8]  T. Basar,et al.  A game theoretic approach to decision and analysis in network intrusion detection , 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).

[9]  Rob Miller,et al.  Generating annotations for how-to videos using crowdsourcing , 2013, CHI Extended Abstracts.

[10]  Carlos García Garino,et al.  An autonomous labeling approach to support vector machines algorithms for network traffic anomaly detection , 2012, Expert Syst. Appl..

[11]  Jarrod C Hodgson,et al.  Precision wildlife monitoring using unmanned aerial vehicles , 2016, Scientific Reports.

[12]  Stefan Roth,et al.  MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.

[13]  Milind Tambe,et al.  "A Game of Thrones": When Human Behavior Models Compete in Repeated Stackelberg Security Games , 2015, AAMAS.

[14]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[15]  Roberto Cipolla,et al.  Label propagation in video sequences , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[16]  Milind Tambe,et al.  CAPTURE: A New Predictive Anti-Poaching Tool for Wildlife Protection , 2016, AAMAS.

[17]  Ron Artstein,et al.  Crowdsourcing micro-level multimedia annotations: the challenges of evaluation and interface , 2012, CrowdMM '12.

[18]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[19]  Michael Felsberg,et al.  The Visual Object Tracking VOT2015 Challenge Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[20]  Amos Azaria,et al.  Analyzing the Effectiveness of Adversary Modeling in Security Games , 2013, AAAI.

[21]  Deva Ramanan,et al.  Efficiently Scaling up Crowdsourced Video Annotation , 2012, International Journal of Computer Vision.

[22]  Ming-Hsuan Yang,et al.  Long-term correlation tracking , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Nicola Basilico,et al.  A Security Game Combining Patrolling and Alarm-Triggered Responses Under Spatial and Detection Uncertainties , 2016, AAAI.

[24]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Gérard G. Medioni,et al.  Moving Object Detection on a Runway Prior to Landing Using an Onboard Infrared Camera , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Daniel Roggen,et al.  Tagging human activities in video by crowdsourcing , 2013, ICMR.

[27]  Panagiotis G. Ipeirotis,et al.  Get another label? improving data quality and data mining using multiple, noisy labelers , 2008, KDD.

[28]  Ramakant Nevatia,et al.  Global data association for multi-object tracking using network flows , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.