Paper Information - Online Object Tracking, Learning and Parsing with And-Or Graphs

Online Object Tracking, Learning and Parsing with And-Or Graphs

This paper presents a method, called AOGTracker, for simultaneous tracking, learning and parsing (TLP) of unknown objects in video sequences with a hierarchical and compositional And-Or graph (AOG) representation. The TLP method is formulated in a Bayesian framework, with a spatial and a temporal dynamic programming (DP) algorithm inferring object bounding boxes on the fly. During online learning, the AOG is discriminatively learned using latent SVM [1] to account for appearance variations (e.g., lighting and partial occlusion) and structural variations (e.g., different poses and viewpoints) of a tracked object, as well as distractors (e.g., similar objects) in the background. Three key issues in online inference and learning are addressed: (i) maintaining the purity of positive and negative examples collected online, (ii) controlling model complexity in latent structure learning, and (iii) identifying critical moments to re-learn the structure of the AOG based on its intrackability. The intrackability measures the uncertainty of an AOG based on its score maps in a frame. In experiments, our AOGTracker is tested on two popular sets of tracking benchmarks with the same parameter setting: the TB-100/50/CVPR2013 benchmarks [2], [3], and the VOT benchmarks [4] (VOT2013, 2014, 2015 and TIR2015, the last for thermal imagery tracking). In the former, our AOGTracker outperforms state-of-the-art tracking algorithms, including two trackers based on deep convolutional networks [5], [6]. In the latter, our AOGTracker outperforms all other trackers in VOT2013 and is comparable to the state-of-the-art methods in VOT2014, 2015 and TIR2015.
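Two ideas from the abstract can be illustrated with a small sketch: measuring uncertainty from a score map, and using temporal DP to pick one bounding box per frame. The code below is not the paper's implementation. It assumes that uncertainty can be approximated by the entropy of a softmax-normalized score map, and that temporal DP can be sketched as a Viterbi-style pass over candidate box centres with a squared-distance motion penalty. The function names, the `motion_weight` parameter and the data layout are all hypothetical.

```python
import math


def score_map_entropy(score_map):
    """Entropy (in nats) of a softmax-normalized score map.

    Assumption: a flat list of raw scores stands in for the AOG score map.
    Higher entropy means a less distinct peak, i.e. more uncertainty.
    """
    peak = max(score_map)
    weights = [math.exp(s - peak) for s in score_map]  # numerically stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def temporal_dp(candidates, scores, motion_weight=0.01):
    """Viterbi-style DP that selects one candidate index per frame.

    candidates[t] is a list of (x, y) box centres in frame t and scores[t]
    holds the matching appearance scores. The squared-distance motion
    penalty is a hypothetical choice made for illustration only.
    """
    best = list(scores[0])  # best cumulative score ending at each candidate
    back = []               # back-pointers for frames 1..T-1
    for t in range(1, len(candidates)):
        cur_best, cur_back = [], []
        for (x, y), s in zip(candidates[t], scores[t]):
            options = [
                best[j] - motion_weight * ((x - px) ** 2 + (y - py) ** 2)
                for j, (px, py) in enumerate(candidates[t - 1])
            ]
            j_star = max(range(len(options)), key=options.__getitem__)
            cur_best.append(options[j_star] + s)
            cur_back.append(j_star)
        best = cur_best
        back.append(cur_back)
    # Backtrack from the best-scoring candidate in the last frame.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path
```

In this sketch, a tracker could compare `score_map_entropy` against a threshold to decide when to trigger structure re-learning, while `temporal_dp` returns the index of the selected candidate in each frame. Both the thresholding step and the penalty form are illustrative assumptions, not details taken from the abstract.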

[1] Ming-Hsuan Yang,et al. Least Soft-Threshold Squares Tracking , 2013 .

[2] Ales Leonardis,et al. Robust Visual Tracking Using an Adaptive Coupled-Layer Visual Model , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Haibin Ling,et al. Robust Visual Tracking and Vehicle Classification via Sparse Representation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Jiri Matas,et al. Long-Term Tracking through Failure Cases , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[5] S. Carey. The Origin of Concepts , 2000 .

[6] Pascal Fua,et al. Multiple Object Tracking Using K-Shortest Paths Optimization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Narendra Ahuja,et al. Robust Visual Tracking via Structured Multi-Task Sparse Learning , 2012, International Journal of Computer Vision.

[8] Rui Caseiro,et al. Exploiting the Circulant Structure of Tracking-by-Detection with Kernels , 2012, ECCV.

[9] Hanzi Wang,et al. Incremental Learning of 3D-DCT Compact Representations for Robust Visual Tracking , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] Gérard G. Medioni,et al. Context tracker: Exploring supporters and distracters in unconstrained environments , 2011, CVPR 2011.

[11] Song-Chun Zhu,et al. Intrackability: Characterizing Video Statistics and Pursuing Video Representations , 2012, International Journal of Computer Vision.

[12] François Fleuret,et al. Exact Acceleration of Linear Object Detectors , 2012, ECCV.

[13] Ming-Hsuan Yang,et al. Robust Object Tracking with Online Multiple Instance Learning , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14] Matti Pietikäinen,et al. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions , 1994, Proceedings of 12th International Conference on Pattern Recognition.

[15] Vibhav Vineet,et al. Struck: Structured Output Tracking with Kernels , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Seunghoon Hong,et al. Visual Tracking by Sampling Tree-Structured Graphical Models , 2014, ECCV.

[17] Zdenek Kalal,et al. Tracking-Learning-Detection , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18] Horst Bischof,et al. Real-Time Tracking via On-line Boosting , 2006, BMVC.

[19] Seunghoon Hong,et al. Orderless Tracking through Model-Averaged Posterior Estimation , 2013, 2013 IEEE International Conference on Computer Vision.

[20] Ming-Hsuan Yang,et al. Incremental Learning for Robust Visual Tracking , 2008, International Journal of Computer Vision.

[21] Horst Bischof,et al. Hough-based tracking of non-rigid objects , 2011, 2011 International Conference on Computer Vision.

[22] Narendra Ahuja,et al. Robust visual tracking via multi-task sparse learning , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23] Shai Avidan,et al. Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24] Andrew V. Goldberg,et al. An efficient implementation of a scaling minimum-cost flow algorithm , 1997 .

[25] Charless C. Fowlkes,et al. Globally-optimal greedy algorithms for tracking a variable number of objects , 2011, CVPR 2011.

[26] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[27] Deva Ramanan,et al. N-best maximal decoders for part models , 2011, 2011 International Conference on Computer Vision.

[28] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[29] Robert T. Collins,et al. Mean-shift blob tracking through scale space , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30] Bin Shen,et al. Online robust image alignment via iterative convex optimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31] Junseok Kwon,et al. Visual tracking decomposition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32] Yanning Zhang,et al. Part-Based Visual Tracking with Online Latent Structural Learning , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33] Lei Zhang,et al. Fast Compressive Tracking , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34] Junseok Kwon,et al. Tracking by Sampling Trackers , 2011, 2011 International Conference on Computer Vision.

[35] David A. McAllester,et al. Object Detection with Grammar Models , 2011, NIPS.

[36] Nuno Vasconcelos,et al. Biologically Inspired Object Tracking Using Center-Surround Saliency Mechanisms , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37] Alex Waibel,et al. Readings in speech recognition , 1990 .

[38] Xiaoqin Zhang,et al. Graph Embedding Based Semi-supervised Discriminative Tracker , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[39] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[40] David A. McAllester,et al. Cascade object detection with deformable part models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[41] Abhinav Gupta,et al. Transferring Rich Feature Hierarchies for Robust Visual Tracking , 2015, ArXiv.

[42] Yali Amit,et al. POP: Patchwork of Parts Models for Object Recognition , 2007, International Journal of Computer Vision.

[43] Laura Sevilla-Lara,et al. Distribution fields for tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[44] Rynson W. H. Lau,et al. Visual Tracking via Locality Sensitive Histograms , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[45] Huchuan Lu,et al. Visual tracking via adaptive structural local sparse appearance model , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[46] Jorge Nocedal,et al. A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput.

[47] Huchuan Lu,et al. Robust object tracking via sparsity-based collaborative model , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[48] Andrew V. Goldberg,et al. An efficient implementation of a scaling minimum-cost flow algorithm , 1993, IPCO.

[49] Ehud Rivlin,et al. Robust Fragments-based Tracking using the Integral Histogram , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[50] Yanxi Liu,et al. Online selection of discriminative tracking features , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51] Song-Chun Zhu,et al. Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[52] Michael Isard,et al. CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[53] Shai Avidan,et al. Locally Orderless Tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[54] Pedro F. Felzenszwalb. Object detection grammars , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[55] Gregory Shakhnarovich,et al. Diverse M-Best Solutions in Markov Random Fields , 2012, ECCV.

[56] Huchuan Lu,et al. Visual Tracking via Probability Continuous Outlier Model , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[57] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006 .

[58] Junseok Kwon,et al. Highly Nonrigid Object Tracking via Patch-Based Dynamic Appearance Modeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59] Lu Zhang,et al. Structure Preserving Object Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[60] Ramakant Nevatia,et al. Global data association for multi-object tracking using network flows , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[61] Kaiqi Huang,et al. An Adaptive Combination of Multiple Features for Robust Tracking in Real Scene , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[62] Deva Ramanan,et al. Self-Paced Learning for Long-Term Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[63] Seunghoon Hong,et al. Online Graph-Based Tracking , 2014, ECCV.

[64] Michael Felsberg,et al. The Visual Object Tracking VOT2013 Challenge Results , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[65] Huchuan Lu,et al. Least Soft-Threshold Squares Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[66] Jiri Matas,et al. A Novel Performance Evaluation Methodology for Single-Target Trackers , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67] Zhongfei Zhang,et al. A survey of appearance models in visual object tracking , 2013, ACM Trans. Intell. Syst. Technol.

[68] Ming-Hsuan Yang,et al. Object Tracking Benchmark , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[69] Simon Baker,et al. Lucas-Kanade 20 Years On: A Unifying Framework , 2004, International Journal of Computer Vision.

[70] Horst Bischof,et al. Semi-supervised On-Line Boosting for Robust Tracking , 2008, ECCV.

[71] Yunde Jia,et al. Discriminatively Trained And-Or Tree Models for Object Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[72] Song-Chun Zhu,et al. Image Parsing with Stochastic Scene Grammar , 2011, NIPS.

[73] Patrick Pérez,et al. Color-Based Probabilistic Tracking , 2002, ECCV.

[74] Ales Leonardis,et al. An Enhanced Adaptive Coupled-Layer LGTracker++ , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[75] Benjamin Z. Yao,et al. Learning and parsing video events with goal and intent prediction , 2013, Comput. Vis. Image Underst.

[76] Mohamed ElHelw,et al. Robust Real-Time Tracking with Diverse Ensembles and Random Projections , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[77] Alfredo Petrosino,et al. MATRIOSKA: A Multi-level Approach to Fast Tracking by Learning , 2013, ICIAP.

[78] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[79] Dorin Comaniciu,et al. Kernel-Based Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[80] Xi Li,et al. A survey of appearance models in visual object tracking , 2013 .

[81] Stefan Roth,et al. People-tracking-by-detection and people-detection-by-tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[82] Bohyung Han,et al. Learning Multi-domain Convolutional Neural Networks for Visual Tracking , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[83] Qingshan Liu,et al. Robust Visual Tracking via Convolutional Networks , 2015 .

[84] Yi Wu,et al. Online Object Tracking: A Benchmark , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[85] Luc Van Gool,et al. Beyond semi-supervised tracking: Tracking should be as simple as detection, but not simpler than recognition , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[86] Michael Felsberg,et al. Enhanced Distribution Field Tracking Using Channel Representations , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[87] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[88] Luc Van Gool,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[89] Haibin Ling,et al. Real time robust L1 tracker using accelerated proximal gradient approach , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[90] Yang Lu,et al. Online Object Tracking, Learning, and Parsing with And-Or Graphs , 2014, CVPR.

[91] Junzhou Huang,et al. Robust tracking using local sparse appearance model and K-selection , 2011, CVPR 2011.

[92] Song-Chun Zhu,et al. A Numerical Study of the Bottom-Up and Top-Down Inference Processes in And-Or Graphs , 2011, International Journal of Computer Vision.

[93] Jiri Matas,et al. Robustifying the Flock of Trackers , 2011 .