Paper information - Semi-Automatic Annotation of Objects in Visual-Thermal Video


Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. For rarely used image modalities in particular, fully manual annotation is hard to justify, and semi-automatic annotation offers an acceptable alternative. This work presents a recursive, semi-automatic annotation method for video. The method uses a state-of-the-art video object segmentation method to propose initial annotations for all frames in a video from only a few manual object segmentations. For multi-modal datasets, the multiple modalities are exploited to refine the proposed annotations further. The final tentative annotations are presented to the user for manual correction. Evaluated on a subset of the RGBT-234 visual-thermal dataset, the method reduces the workload of a human annotator by approximately 78% compared to full manual annotation. The proposed pipeline was used to annotate sequences for the VOT-RGBT 2019 challenge.
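The propose-then-correct loop described in the abstract can be illustrated with a toy sketch. This is one plausible reading of the idea, not the authors' implementation: `propagate` is a hypothetical stand-in for the video object segmentation model (it simply copies the nearest manual mask), the user's check is simulated by an IoU threshold against a synthetic ground truth, masks are sets of pixel indices, and the multi-modal refinement step is not modeled. The final ratio is a simplified proxy for workload (fraction of frames not segmented by hand), which is an assumption and not necessarily the measure used in the paper.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Mask = FrozenSet[int]  # toy mask: the set of foreground pixel indices


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two toy masks (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def propagate(manual: Dict[int, Mask], frame: int) -> Mask:
    """Hypothetical stand-in for a video object segmentation model:
    copy the mask of the nearest manually segmented frame."""
    nearest = min(manual, key=lambda k: (abs(k - frame), k))
    return manual[nearest]


def annotate(
    num_frames: int,
    user_segment: Callable[[int], Mask],
    user_accepts: Callable[[int, Mask], bool],
) -> Tuple[Dict[int, Mask], List[int]]:
    """Propose masks for all frames, let the user reject bad ones,
    add one manual segmentation, and repeat until nothing is rejected."""
    manual = {0: user_segment(0)}  # start from a single manual segmentation
    while True:
        proposals = {f: propagate(manual, f) for f in range(num_frames)}
        rejected = [
            f for f in range(num_frames)
            if f not in manual and not user_accepts(f, proposals[f])
        ]
        if not rejected:
            return proposals, sorted(manual)
        # The user segments the first rejected frame by hand; proposals
        # are then recomputed with the extra manual mask.
        manual[rejected[0]] = user_segment(rejected[0])


# Synthetic sequence (assumption): a 4-pixel object that moves one pixel
# every three frames.
NUM_FRAMES = 12
truth = {f: frozenset(range(f // 3, f // 3 + 4)) for f in range(NUM_FRAMES)}

proposals, manual_frames = annotate(
    NUM_FRAMES,
    user_segment=lambda f: truth[f],
    # Simulated user judgment: accept a proposal if it overlaps well enough.
    user_accepts=lambda f, mask: iou(mask, truth[f]) >= 0.5,
)

# Simplified workload proxy: share of frames that needed no manual mask.
reduction = 1 - len(manual_frames) / NUM_FRAMES
print(manual_frames, round(reduction, 3))
```

In this toy setting each rejection adds exactly one manual segmentation, so the loop always terminates; a real pipeline would replace `propagate` with a trained segmentation model and the threshold check with the annotator's own inspection.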
