CNN-based single object detection and tracking in videos and its application to drone detection

This paper presents convolutional neural network (CNN)-based algorithms for single object detection and tracking. CNN-based object detection methods are directly applicable to static images, but not to videos. Model-free visual object tracking methods, on the other hand, cannot detect an object until a ground truth bounding box of the target is provided. Moreover, training both object detectors and visual trackers requires large amounts of annotated video data of the target object. In this work, three simple yet effective object detection and tracking algorithms for videos are proposed. They efficiently combine a state-of-the-art object detector and visual tracker for circumstances in which only a few static images of the target are available for training. The proposed algorithms are tested on a drone detection task, and the experimental results demonstrate their effectiveness.
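The abstract does not spell out the three algorithms, so the following is only a minimal sketch of one generic way a detector and a model-free tracker can be combined: detect until a confident box appears, initialize the tracker from that box, and return to detection when the tracker loses the target. The names `detect_then_track`, `detect`, `make_tracker`, and `score_threshold` are hypothetical, and the toy detector and tracker are stand-ins rather than real models; none of this should be read as the authors' method.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)
Detection = Tuple[Box, float]            # (box, confidence score)
Tracker = Callable[[object], Optional[Box]]


def detect_then_track(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Detection]],
    make_tracker: Callable[[object, Box], Tracker],
    score_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Iterator[Optional[Box]]:
    """Yield one box (or None) per frame.

    Hypothetical scheme: the detector supplies the initial box that a
    model-free tracker needs, and takes over again when tracking fails.
    """
    tracker: Optional[Tracker] = None
    for frame in frames:
        box = None
        if tracker is not None:
            box = tracker(frame)
            if box is None:
                tracker = None  # target lost: return to detection mode
        if tracker is None:
            detection = detect(frame)
            if detection is not None and detection[1] >= score_threshold:
                box = detection[0]
                tracker = make_tracker(frame, box)
        yield box


# Toy stand-ins so the sketch runs without any model or video file.
def toy_detect(frame: int) -> Optional[Detection]:
    # Pretend the detector only fires on frames 2 and 6.
    return ((10.0, 10.0, 5.0, 5.0), 0.9) if frame in (2, 6) else None


def toy_make_tracker(frame: int, box: Box) -> Tracker:
    start = frame

    def track(next_frame: int) -> Optional[Box]:
        # Pretend the tracker loses the target three frames after init.
        return box if next_frame - start < 3 else None

    return track


results = list(detect_then_track(range(8), toy_detect, toy_make_tracker))
found = [i for i, b in enumerate(results) if b is not None]
print(found)
```

Passing the detector and tracker as plain callables keeps the control flow independent of any particular model, so a real detector or tracker could be swapped in behind the same two interfaces.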
