Deformable object tracking with spatiotemporal segmentation in big vision surveillance

The rapid development of worldwide networks has raised the traditional challenges of vision surveillance to a big data level. Accordingly, video processing technologies need to focus more on emerging big vision problems such as efficient content understanding. As a fundamental and indispensable preliminary step for high-level video analysis, e.g. behavior recognition for social security, accurate and robust object tracking plays an essential role because it extracts the salient information from the captured video data. In realistic application environments, however, accurate and robust tracking is difficult because the appearance of an object may change continually as it moves. This is especially true for deformable objects, where it is hard for a designed appearance model to adapt to heavy shape variations such as rotation or distortion. In this paper, a novel object tracking method based on spatial segmentation is proposed to handle the drastic appearance changes of deformable objects. By using the motion information between consecutive frames, the irregular areas of the deformable object can be segmented more accurately through energy function optimization with boundary convergence. The segmented areas are then used as learning samples for a structural SVM to achieve more effective online tracking. The proposed tracker is evaluated on a standard benchmark database containing the challenges of heavy intrinsic variations and occlusions, and the experimental results demonstrate a significant improvement in accuracy and robustness compared with other state-of-the-art tracking approaches.
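
As a rough illustration of how motion between consecutive frames can localize a moving, irregularly shaped region, the following Python sketch thresholds the absolute difference of two grayscale frames and takes a tight box around the changed pixels. It is only a simplified stand-in under stated assumptions: it is not the energy function optimization or the structural SVM described above, and the names `motion_mask`, `bounding_box`, and the `threshold` value are hypothetical choices made for this example.

```python
import numpy as np


def motion_mask(prev_frame, next_frame, threshold=25):
    """Return a boolean mask of pixels whose intensity changed by more than threshold.

    Both frames are assumed to be 2-D grayscale arrays of the same shape.
    """
    # Cast to a signed type so the subtraction cannot wrap around for uint8 input.
    diff = np.abs(next_frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > threshold


def bounding_box(mask):
    """Return the tight (top, left, bottom, right) box around True pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None  # no motion detected
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])


# Toy example: a bright 3x3 patch appears in an otherwise static 8x8 frame.
prev_frame = np.zeros((8, 8), dtype=np.uint8)
next_frame = prev_frame.copy()
next_frame[2:5, 3:6] = 200

mask = motion_mask(prev_frame, next_frame)
box = bounding_box(mask)
```

In a full tracker, a mask like this would at most serve as a coarse seed. The boundary refinement and the online learning step are where the approach described in the abstract does its actual work.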
