Towards a Principled Integration of Multi-camera Re-identification and Tracking Through Optimal Bayes Filters

With the rise of end-to-end deep learning, person detectors and re-identification (ReID) models have recently become very strong. Multi-target multi-camera (MTMC) tracking has not yet fully gone through this transformation. We take another step in this direction by presenting a theoretically principled way of integrating ReID with tracking, formulated as an optimal Bayes filter. This conveniently side-steps the need for data association and opens up a direct path from full images to the core of the tracker. While the results are still sub-par, we believe that this new, tight integration opens many interesting research opportunities and leads the way towards full end-to-end tracking from raw pixels. Code and models for all experiments are publicly available.
