Crowdsourcing System for Multi-object Annotation in Surveillance Videos

Collecting and labeling data is labor-intensive, which has given rise to a large market for crowdsourced data annotation. Although many video datasets are publicly available, task-specific data remains scarce and requires customized annotation services. Even with many excellent auxiliary models and tools, video annotation is still time-consuming. To address these challenges, this paper proposes an annotation method in which annotators not only provide annotations but also act as reviewers of other annotators' results. The method focuses on surveillance video data and also supports additional custom tasks (e.g., action tagging, person relationship recognition, and video summarization); in this paper, we mainly consider temporal action annotation as the custom task. We develop rules that use the temporal information in model inference results to filter the frames or segments that need to be re-labeled, and we use the correlation between targets and time to determine task relevance. Tasks are assigned asynchronously to different annotators, and each annotator's ability is modeled dynamically while annotation is in progress, so that task allocation achieves both annotation and mutual review among annotators. Our experiments show that this method can reduce costs and improve labeling accuracy.
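
The abstract does not spell out the filtering rules, so the following is only a minimal sketch of one plausible rule under stated assumptions: a tracker assigns each target an ID, and a short gap in the frames where that ID appears suggests a missed detection worth sending to an annotator. The function name `frames_to_review`, the `max_gap` parameter, and the input layout are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def frames_to_review(track_frames: Dict[int, List[int]], max_gap: int = 5) -> List[int]:
    """Return frame indices that fall inside short gaps of any track.

    track_frames maps a (hypothetical) track ID to the frame indices in
    which the model detected that target. A gap of at most max_gap frames
    between two detections of the same track is treated as a likely missed
    detection; longer gaps are assumed to mean the target left the scene.
    """
    flagged = set()
    for frames in track_frames.values():
        ordered = sorted(frames)
        # Compare each detection with the next one in time.
        for prev, nxt in zip(ordered, ordered[1:]):
            gap = nxt - prev - 1
            if 0 < gap <= max_gap:
                flagged.update(range(prev + 1, nxt))
    return sorted(flagged)


if __name__ == "__main__":
    # Track 1 disappears for frames 3-4; track 2 has a long absence.
    tracks = {1: [0, 1, 2, 5, 6], 2: [3, 4, 20]}
    print(frames_to_review(tracks))
```

A rule of this kind keeps annotators away from frames where the model output is temporally consistent, which is one way the re-labeling workload could be reduced; the actual criteria used in the paper may differ.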
