Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions

We describe an end-to-end framework for learning parameters of min-cost flow multi-target tracking problem with quadratic trajectory interactions including suppression of overlapping tracks and contextual cues about co-occurrence of different objects. Our approach utilizes structured prediction with a tracking-specific loss function to learn the complete set of model parameters. In this learning framework, we evaluate two different approaches to finding an optimal set of tracks under a quadratic model objective, one based on an linear program (LP) relaxation and the other based on novel greedy variants of dynamic programming that handle pairwise interactions. We find the greedy algorithms achieve almost equivalent accuracy to the LP relaxation while being up to 10$$\times $$× faster than a commercial LP solver. We evaluate trained models on three challenging benchmarks. Surprisingly, we find that with proper parameter learning, our simple data association model without explicit appearance/motion reasoning is able to achieve comparable or better accuracy than many state-of-the-art methods that use far more complex motion features or appearance affinity metric learning.

[1]  Margrit Betke,et al.  Coupling detection and data association for multiple object tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Mohamed R. Amer,et al.  Multiobject tracking as maximum weight independent set , 2011, CVPR 2011.

[3]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Charless C. Fowlkes,et al.  Learning Optimal Parameters For Multi-target Tracking , 2015, BMVC.

[5]  Ming-Hsuan Yang,et al.  Bayesian Multi-object Tracking Using Motion Context from Multiple Objects , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.

[6]  Silvio Savarese,et al.  A Unified Framework for Multi-target Tracking and Collective Activity Recognition , 2012, ECCV.

[7]  Thorsten Joachims,et al.  Training structural SVMs when exact inference is intractable , 2008, ICML '08.

[8]  Andreas Geiger,et al.  FollowMe: Efficient Online Min-Cost Flow Tracking with Bounded Memory and Computation , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[9]  Ramakant Nevatia,et al.  Global data association for multi-object tracking using network flows , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Jan Feyereisl,et al.  Online Multi-target Tracking by Large Margin Structured Learning , 2012, ACCV.

[11]  Afshin Dehghan,et al.  Target Identity-aware Network Flow for online multiple target tracking , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Fei-Fei Li,et al.  Efficient Image and Video Co-localization with Frank-Wolfe Algorithm , 2014, ECCV.

[13]  Derek Hoiem,et al.  Learning CRFs Using Graph Cuts , 2008, ECCV.

[14]  Ramakant Nevatia,et al.  An online learned CRF model for multi-target tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Ramakant Nevatia,et al.  Learning to associate: HybridBoosted multi-target tracker for crowded scene , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Thorsten Joachims,et al.  Cutting-plane training of structural SVMs , 2009, Machine Learning.

[17]  Konrad Schindler,et al.  Detection- and Trajectory-Level Exclusion in Multiple Object Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Ian D. Reid,et al.  Latent Data Association: Bayesian Model Selection for Multi-target Tracking , 2013, 2013 IEEE International Conference on Computer Vision.

[19]  Ce Liu,et al.  Exploring new representations and applications for motion analysis , 2009 .

[20]  Charless C. Fowlkes,et al.  Discriminative Models for Multi-Class Object Layout , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[21]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Robert T. Collins,et al.  Multi-target Tracking by Lagrangian Relaxation to Min-cost Network Flow , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Francesco Solera,et al.  Learning to Divide and Conquer for Online Multi-target Tracking , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  MilanAnton,et al.  Multi-Target Tracking by Discrete-Continuous Energy Minimization , 2016 .

[25]  Bernt Schiele,et al.  Learning People Detectors for Tracking in Crowded Scenes , 2013, 2013 IEEE International Conference on Computer Vision.

[26]  Ian D. Reid,et al.  Joint tracking and segmentation of multiple targets , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Konrad Schindler,et al.  Multi-Target Tracking by Discrete-Continuous Energy Minimization , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Abdel Nasser H. Zaied,et al.  A Survey of Quadratic Assignment Problems , 2014 .

[30]  Ravindra K. Ahuja,et al.  Network Flows: Theory, Algorithms, and Applications , 1993 .

[31]  Bernt Schiele,et al.  Subgraph decomposition for multi-target tracking , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Ben Taskar,et al.  Max-Margin Markov Networks , 2003, NIPS.

[33]  Rainer Stiefelhagen,et al.  Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics , 2008, EURASIP J. Image Video Process..

[34]  Fred A. Hamprecht,et al.  Structured Learning for Cell Tracking , 2011, NIPS.

[35]  Ernesto Brau,et al.  Bayesian 3D Tracking from Monocular Video , 2013, 2013 IEEE International Conference on Computer Vision.

[36]  Kuk-Jin Yoon,et al.  Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Wongun Choi,et al.  Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Ming Yang,et al.  Regionlets for Generic Object Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[39]  Martin Lauer,et al.  3D Traffic Scene Understanding From Movable Platforms , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Ivan Laptev,et al.  On pairwise costs for network flow multi-object tracking , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Konrad Schindler,et al.  Discrete-continuous optimization for multi-target tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Ben Taskar,et al.  Word Alignment via Quadratic Assignment , 2006, NAACL.

[43]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[44]  Konrad Schindler,et al.  Continuous Energy Minimization for Multitarget Tracking , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  S. Savarese,et al.  Learning an Image-Based Motion Context for Multiple People Tracking , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Charless C. Fowlkes,et al.  Globally-optimal greedy algorithms for tracking a variable number of objects , 2011, CVPR 2011.

[47]  Gang Wang,et al.  Tracklet Association with Online Target-Specific Metric Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Stefan Roth,et al.  MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking , 2015, ArXiv.

[49]  James M. Rehg,et al.  Multiple Hypothesis Tracking Revisited , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[50]  Silvio Savarese,et al.  Learning to Track: Online Multi-object Tracking by Decision Making , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).