Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Despite recent advances in multiple object tracking (MOT) achieved by joint detection and tracking, dealing with long occlusions remains a challenge, because such techniques tend to ignore long-term motion information. In this paper, we introduce a probabilistic autoregressive motion model that scores tracklet proposals by directly measuring their likelihood. This is achieved by training the model to learn the underlying distribution of natural tracklets. As such, our model allows us not only to assign new detections to existing tracklets, but also to inpaint a tracklet when an object has been lost for a long time, e.g., due to occlusion, by sampling tracklets to fill the gap caused by misdetections. Our experiments demonstrate the superiority of our approach at tracking objects in challenging sequences; it outperforms the state of the art in most standard MOT metrics on multiple MOT benchmark datasets, including MOT16, MOT17, and MOT20.
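To make the idea of likelihood-based scoring and inpainting concrete, the following is a minimal sketch and not the model described in this paper. It assumes a hand-written stand-in for the learned model: 1-D positions, a constant-velocity predictor, and Gaussian noise with a fixed `sigma`. All function names are hypothetical. It only illustrates how an autoregressive model factorizes a tracklet's likelihood into per-step conditionals, and how sampling from those same conditionals can fill a gap.

```python
import math
import random

# Assumption: a toy autoregressive model, NOT the learned model from the paper.
# Each position is Gaussian around a constant-velocity extrapolation
# of the two previous positions.

def step_log_prob(prev2, prev1, nxt, sigma=1.0):
    """Log-density of `nxt` given the two preceding positions."""
    mean = 2.0 * prev1 - prev2  # constant-velocity prediction
    z = (nxt - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

def tracklet_log_likelihood(xs, sigma=1.0):
    """Chain-rule score: sum of per-step conditional log-probabilities."""
    return sum(
        step_log_prob(xs[t - 2], xs[t - 1], xs[t], sigma)
        for t in range(2, len(xs))
    )

def inpaint(xs, gap, sigma=1.0, rng=None):
    """Sample `gap` positions autoregressively to continue a lost tracklet."""
    rng = rng or random.Random(0)
    out = list(xs)
    for _ in range(gap):
        mean = 2.0 * out[-1] - out[-2]
        out.append(rng.gauss(mean, sigma))
    return out[len(xs):]

smooth = [0.0, 1.0, 2.0, 3.0, 4.0]   # plausible continuation
jumpy = [0.0, 1.0, 2.0, 3.0, 9.0]    # implausible jump at the last step

score_smooth = tracklet_log_likelihood(smooth)
score_jumpy = tracklet_log_likelihood(jumpy)
filled = inpaint(smooth, gap=3)
```

Under this toy model the smooth proposal receives the higher log-likelihood, so it would be preferred when assigning a detection, and `filled` holds three sampled positions that could bridge an occlusion gap. A learned model would replace the constant-velocity mean and fixed `sigma` with predicted distribution parameters.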
