The Visual Object Tracking VOT2013 Challenge Results

Visual tracking has attracted significant attention in the last few decades. The recent surge in the number of publications on tracking-related problems has made it almost impossible to follow the developments in the field. One of the reasons is the lack of commonly accepted annotated datasets and standardized evaluation protocols that would allow objective comparison of different tracking methods. To address this issue, the Visual Object Tracking (VOT) workshop was organized in conjunction with ICCV2013. Researchers from academia as well as industry were invited to participate in the first VOT2013 challenge, which was aimed at single-object visual trackers that do not apply pre-learned models of object appearance (model-free). Presented here are the VOT2013 benchmark dataset for the evaluation of single-object visual trackers and the results obtained by the trackers competing in the challenge. In contrast to related attempts in tracker benchmarking, the dataset is labeled per-frame with visual attributes that indicate occlusion, illumination change, motion change, size change and camera motion, offering a more systematic comparison of the trackers. Furthermore, we have designed an automated system for performing and evaluating the experiments. We present the evaluation protocol of the VOT2013 challenge and the results of a comparison of 27 trackers on the benchmark dataset. The dataset, the evaluation tools and the tracker rankings are publicly available from the challenge website (http://votchallenge.net).
