Convolutional neural network–based person tracking using overhead views

In video surveillance, person tracking is considered a challenging task. Numerous techniques based on computer vision, machine learning, and deep learning have been developed in recent years. The majority of these techniques rely on frontal-view images or video sequences. Advances in convolutional neural networks have reshaped object tracking: network layers trained on a number of images or video sequences improve both the speed and the accuracy of tracking. In this work, the generalization performance of existing pre-trained deep learning models is investigated for overhead-view person detection and tracking under different experimental conditions. The object tracking method Generic Object Tracking Using Regression Networks (GOTURN), which has yielded outstanding tracking results in recent years, is explored for person tracking using overhead views. The work focuses mainly on overhead-view person tracking using a Faster region-based convolutional neural network (Faster-RCNN) in combination with the GOTURN architecture. The person is first detected in overhead-view video sequences and then tracked using the GOTURN tracking algorithm. The Faster-RCNN detection model achieved a true detection rate ranging from 90% to 93% with a low false detection rate of up to 0.5%. The GOTURN tracking algorithm achieved similar results, with a success rate ranging from 90% to 94%. Finally, the output results are discussed along with future directions.
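The detect-then-track flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `detect` and `make_tracker` are hypothetical callables standing in for a Faster-RCNN detector and a GOTURN tracker, and the IoU-thresholded `success_rate` is a common tracking metric assumed here for illustration, since the abstract does not define how its success rate is computed.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def detect_then_track(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Box]],
    make_tracker: Callable[[object, Box], Callable[[object], Optional[Box]]],
) -> List[Optional[Box]]:
    """Run the (hypothetical) detector until a person is found, then
    hand the box to a tracker that is updated on every later frame."""
    tracker = None
    boxes: List[Optional[Box]] = []
    for frame in frames:
        if tracker is None:
            box = detect(frame)  # detection stage (e.g. a Faster-RCNN wrapper)
            if box is not None:
                # Initialise the tracker (e.g. a GOTURN wrapper) on this frame.
                tracker = make_tracker(frame, box)
            boxes.append(box)
        else:
            boxes.append(tracker(frame))  # tracking stage
    return boxes


def success_rate(
    predicted: Sequence[Optional[Box]],
    truth: Sequence[Box],
    threshold: float = 0.5,  # assumed overlap threshold, not from the paper
) -> float:
    """Fraction of frames whose predicted box overlaps the ground truth
    with IoU at or above the threshold (assumed metric definition)."""
    if not truth:
        return 0.0
    hits = sum(
        1
        for p, t in zip(predicted, truth)
        if p is not None and iou(p, t) >= threshold
    )
    return hits / len(truth)
```

To try the sketch, pass any iterable of frames together with two callables: one that returns a person box or `None` for a frame, and one that takes the initial frame and box and returns a per-frame update function. Keeping both stages behind plain callables lets the detector or tracker be swapped without touching the loop.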
