Paper Information - Online Domain Adaptation for Multi-Object Tracking

Automatically detecting, labeling, and tracking objects in videos depends first and foremost on accurate category-level object detectors. These might, however, not always be available in practice, as acquiring high-quality large scale labeled training datasets is either too costly or impractical for all possible real-world application scenarios. A scalable solution consists in re-using object detectors pre-trained on generic datasets. This work is the first to investigate the problem of on-line domain adaptation of object detectors for causal multi-object tracking (MOT). We propose to alleviate the dataset bias by adapting detectors from category to instances, and back: (i) we jointly learn all target models by adapting them from the pre-trained one, and (ii) we also adapt the pre-trained model on-line. We introduce an on-line multi-task learning algorithm to efficiently share parameters and reduce drift, while gradually improving recall. Our approach is applicable to any linear object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive "off-the-shelf" ConvNet features. We quantitatively measure the benefit of our domain adaptation strategy on the KITTI tracking benchmark and on a new dataset (PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.
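The abstract describes the method only at a high level, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes a hinge loss, a quadratic penalty that pulls each per-track (instance) linear model toward a shared category-level model, and a simple step that moves the shared model toward the mean of the instance models. The function name adapt_step and the parameters lr and reg are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# (track index, feature vector, label in {-1, +1}) -- hypothetical sample format
Sample = Tuple[int, Sequence[float], int]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def adapt_step(shared: Vector, targets: List[Vector], samples: List[Sample],
               lr: float = 0.1, reg: float = 0.5) -> Tuple[Vector, List[Vector]]:
    """One on-line update of the shared model and all per-track models.

    Assumed objective per track (not taken from the paper):
        hinge(y * <w, x>) + (reg / 2) * ||w - shared||^2
    """
    new_targets = [list(w) for w in targets]  # do not mutate the inputs

    # (i) Category to instances: update each track model on its own samples
    # while the penalty keeps it close to the shared model (limits drift).
    for idx, x, y in samples:
        w = new_targets[idx]
        margin = y * dot(w, x)  # computed once, before w is changed
        for k in range(len(w)):
            grad = reg * (w[k] - shared[k])
            if margin < 1.0:  # hinge loss is active
                grad -= y * x[k]
            w[k] -= lr * grad

    # (ii) Instances back to category: move the shared model toward the
    # mean of the instance models.
    if new_targets:
        n = len(new_targets)
        mean = [sum(w[k] for w in new_targets) / n for k in range(len(shared))]
        new_shared = [s + lr * reg * (m - s) for s, m in zip(shared, mean)]
    else:
        new_shared = list(shared)
    return new_shared, new_targets
```

Because the update only needs inner products with a weight vector, a sketch like this works with any fixed-length feature representation, which is consistent with the abstract's claim that the approach applies to any linear object detector.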

Eleonora Vig | Adrien Gaidon
