Object Detection by Spatio-Temporal Analysis and Tracking of the Detected Objects in a Video with Variable Background

In this paper, we propose a novel approach for detecting and tracking objects in videos with a variable background, i.e., videos captured by moving cameras without any additional sensor. In a video captured by a moving camera, both the background and the foreground change in every frame of the image sequence. For such videos, modeling a single background with traditional background modeling methods is infeasible, so detecting the actual moving objects against a variable background is a challenging task. To detect the moving objects, spatio-temporal blobs are generated in each frame by spatio-temporal analysis of the image sequence using a three-dimensional Gabor filter. Individual blobs that are parts of the same object are then merged using a Minimum Spanning Tree to form the moving object. The height, width, and four-bin gray-value histogram of each object are computed as its features, and the object is tracked from frame to frame using these features to generate its trajectory through the video sequence. Data association during tracking is formulated as a Linear Assignment Problem, and occlusion is handled by applying a Kalman filter. The major advantage of our method over most existing tracking algorithms is that it requires neither initialization in the first frame nor training on sample data. The algorithm has been tested on benchmark videos with very satisfactory results, and its performance is comparable or superior to that of some benchmark algorithms.
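The data-association step described above can be illustrated with a minimal sketch. It assumes each object is described by its height, width, and a normalised four-bin gray-value histogram, and it matches existing tracks to new detections by minimising a total dissimilarity cost. The cost function, the function names (`describe`, `cost`, `assign`), and the exhaustive search over permutations are illustrative assumptions, not the paper's implementation; a practical system would likely use a Hungarian-style solver for the Linear Assignment Problem.

```python
from itertools import permutations


def four_bin_histogram(pixels):
    """Normalised 4-bin histogram of 8-bit gray values (each bin spans 64 levels)."""
    counts = [0, 0, 0, 0]
    for p in pixels:
        counts[min(p // 64, 3)] += 1
    total = len(pixels) or 1  # avoid division by zero for an empty region
    return [c / total for c in counts]


def describe(height, width, pixels):
    """Feature record for one object: height, width, four-bin histogram."""
    return {"h": height, "w": width, "hist": four_bin_histogram(pixels)}


def cost(a, b):
    """Hypothetical dissimilarity: relative size change plus L1 histogram distance."""
    size = (abs(a["h"] - b["h"]) / max(a["h"], b["h"])
            + abs(a["w"] - b["w"]) / max(a["w"], b["w"]))
    hist = sum(abs(x - y) for x, y in zip(a["hist"], b["hist"]))
    return size + hist


def assign(tracks, detections):
    """Exact minimum-cost one-to-one assignment by exhaustive search.

    Assumes len(tracks) <= len(detections) and small inputs; the search is
    factorial in the number of detections, so it is only a stand-in for a
    proper Linear Assignment Problem solver.
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(detections)), len(tracks)):
        total = sum(cost(tracks[i], detections[j]) for i, j in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total
    return list(best), best_cost


# Two tracked objects from the previous frame (synthetic gray values).
tracks = [
    describe(40, 20, [10] * 50 + [200] * 50),
    describe(80, 30, [100] * 100),
]
# Two detections in the current frame, listed in a different order.
detections = [
    describe(78, 31, [105] * 100),
    describe(41, 19, [12] * 48 + [198] * 52),
]

matches, total_cost = assign(tracks, detections)
print(matches)  # [1, 0]: track 0 -> detection 1, track 1 -> detection 0
```

In this sketch, `matches[i]` is the index of the detection assigned to track `i`. A track left without a plausible match (for example, during occlusion) would be the case where a Kalman filter prediction could stand in for the missing detection, as the abstract describes.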
