Improvements to Frank-Wolfe optimization for multi-detector multi-object tracking

This paper proposes a novel formulation for the multiobject tracking-by-detection paradigm for two (or more) input detectors. Using full-body and heads detections, the fusion helps to recover heavily occluded persons and to reduce false positives. The assignment of the two input features to a person and the extraction of the trajectories is commonly solved from one binary quadratic program (BQP). Due to the computational complexity of the NP-hard QP, we approximate the solution using the Frank-Wolfe algorithm. We propose several improvements to this solver affecting better minimization and shorter computations, compared to off-the-shelf BQP-solvers and the standard FrankWolfe algorithm. Evaluation on pedestrian tracking is provided for multiple scenarios, showing improved tracking quality over single input feature trackers and standard QPsolvers. Finally we present the performance of our tracker on the challenging MOT16 benchmark, being comparable to state-of-the-art trackers.

[1]  Charless C. Fowlkes,et al.  Globally-optimal greedy algorithms for tracking a variable number of objects , 2011, CVPR 2011.

[2]  Hilke Kieritz,et al.  Online multi-person tracking using Integral Channel Features , 2016, 2016 13th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[3]  Martin Jaggi,et al.  On the Global Linear Convergence of Frank-Wolfe Optimization Variants , 2015, NIPS.

[4]  Afshin Dehghan,et al.  GMCP-Tracker: Global Multi-object Tracking Using Generalized Minimum Clique Graphs , 2012, ECCV.

[5]  Mark W. Schmidt,et al.  Block-Coordinate Frank-Wolfe Optimization for Structural SVMs , 2012, ICML.

[6]  Thomas Brox,et al.  Joint Graph Decomposition & Node Labeling: Problem, Algorithms, Applications , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Wongun Choi,et al.  Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Alan Fern,et al.  Multi-object Tracking via Constrained Sequential Labeling , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Haroon Idrees,et al.  Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints , 2016, ArXiv.

[10]  Jitendra Malik,et al.  Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.

[11]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Afshin Dehghan,et al.  Binary Quadratic Programing for Online Tracking of Hundreds of People in Extremely Crowded Scenes , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Ian D. Reid,et al.  Stable multi-target tracking in real-time surveillance video , 2011, CVPR 2011.

[14]  Frank Harary,et al.  Graph Theory , 2016 .

[15]  Alain Billionnet,et al.  Extending the QCR method to general mixed-integer programs , 2010, Mathematical Programming.

[16]  Mario Sznaier,et al.  The Way They Move: Tracking Multiple Targets with Similar Appearance , 2013, 2013 IEEE International Conference on Computer Vision.

[17]  Bodo Rosenhahn,et al.  Efficient Multiple People Tracking Using Minimum Cost Arborescences , 2014, GCPR.

[18]  Mohamed R. Amer,et al.  Multiobject tracking as maximum weight independent set , 2011, CVPR 2011.

[19]  Cordelia Schmid,et al.  DeepFlow: Large Displacement Optical Flow with Deep Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[20]  Konrad Schindler,et al.  Continuous Energy Minimization for Multitarget Tracking , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Bernt Schiele,et al.  Multiple People Tracking by Lifted Multicut and Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Romaric Audigier,et al.  Improving Multi-frame Data Association with Sparse Representations for Robust Near-online Multi-object Tracking , 2016, ECCV.

[23]  K. Schittkowski,et al.  NONLINEAR PROGRAMMING , 2022 .

[24]  Andrew Y. Ng,et al.  End-to-End People Detection in Crowded Scenes , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Bodo Rosenhahn,et al.  Everybody needs somebody: Modeling social and grouping behavior on a linear programming multiple people tracker , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[26]  Sartaj Sahni,et al.  Computationally Related Problems , 1974, SIAM J. Comput..

[27]  J. Mitchell Branch-and-Cut Algorithms for Combinatorial Optimization Problems , 1988 .

[28]  Ailsa H. Land,et al.  An Automatic Method of Solving Discrete Programming Problems , 1960 .

[29]  Simon Lacoste-Julien,et al.  Convergence Rate of Frank-Wolfe for Non-Convex Objectives , 2016, ArXiv.

[30]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Philip Wolfe,et al.  An algorithm for quadratic programming , 1956 .

[32]  Panos M. Pardalos,et al.  Quadratic programming with one negative eigenvalue is NP-hard , 1991, J. Glob. Optim..

[33]  Young-min Song,et al.  Online multiple object tracking with the hierarchically adopted GM-PHD filter using motion and appearance , 2016, 2016 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia).

[34]  Radu Horaud,et al.  Tracking Multiple Persons Based on a Variational Bayesian Model , 2016, ECCV Workshops.

[35]  Bodo Rosenhahn,et al.  Solving Multiple People Tracking in a Minimum Cost Arborescence , 2015, 2015 IEEE Winter Applications and Computer Vision Workshops.

[36]  Bernt Schiele,et al.  Multi-person Tracking by Multicut and Deep Matching , 2016, ECCV Workshops.

[37]  Ian D. Reid,et al.  Joint Probabilistic Data Association Revisited , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Katerina Fragkiadaki,et al.  Detection free tracking: Exploiting motion and topology for segmenting and tracking under entanglement , 2011, CVPR 2011.

[39]  Ian D. Reid,et al.  Joint tracking and segmentation of multiple targets , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Ivan Laptev,et al.  On pairwise costs for network flow multi-object tracking , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Alexandre Heili,et al.  Long-Term Time-Sensitive Costs for CRF-Based Tracking by Detection , 2016, ECCV Workshops.

[42]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Stefan Roth,et al.  MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.

[44]  Silvio Savarese,et al.  Tracking the Untrackable: Learning to Track Multiple Cues with Long-Term Dependencies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Martin Lauer,et al.  3D Traffic Scene Understanding From Movable Platforms , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Robert T. Collins,et al.  Multi-target Tracking by Lagrangian Relaxation to Min-cost Network Flow , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Jitendra Malik,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence Segmentation of Moving Objects by Long Term Video Analysis , 2022 .

[48]  Bernt Schiele,et al.  Subgraph decomposition for multi-target tracking , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Fei-Fei Li,et al.  Efficient Image and Video Co-localization with Frank-Wolfe Algorithm , 2014, ECCV.

[50]  James J. Little,et al.  A Linear Programming Approach for Multiple Object Tracking , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[51]  Ivan Laptev,et al.  Instance-Level Video Segmentation from Object Tracks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  S. Savarese,et al.  Learning an Image-Based Motion Context for Multiple People Tracking , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[53]  John W. Fisher,et al.  A Video Representation Using Temporal Superpixels , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[54]  Adam N. Letchford,et al.  Non-convex mixed-integer nonlinear programming: A survey , 2012 .

[55]  Pascal Fua,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 Multiple Object Tracking Using K-shortest Paths Optimization , 2022 .

[56]  Ramakant Nevatia,et al.  Global data association for multi-object tracking using network flows , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[57]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[58]  Fabio Poiesi,et al.  Online Multi-target Tracking with Strong and Weak Detections , 2016, ECCV Workshops.

[59]  Afshin Dehghan,et al.  GMMCP tracker: Globally optimal Generalized Maximum Multi Clique problem for multiple object tracking , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Katerina Fragkiadaki,et al.  Two-Granularity Tracking: Mediating Trajectory and Detection Graphs for Tracking under Occlusions , 2012, ECCV.