Online Multi-object Visual Tracking using a GM-PHD Filter with Deep Appearance Learning

We propose a new online multi-object visual tracker based on a Gaussian mixture Probability Hypothesis Density (GM-PHD) filter combined with a similarity Convolutional Neural Network (CNN). The GM-PHD filter estimates the states and cardinality of an unknown and time-varying number of targets in the scene, handling target birth, death, clutter (false alarms) and missed detections in a unified framework, with complexity linear in the number of targets. However, it does not maintain target identities. To label targets across frames and compensate for this shortcoming, we combine spatio-temporal and visual similarities obtained from object bounding boxes and deep CNN appearance features, respectively. We apply the resulting method, following a tracking-by-detection approach, to track multiple targets in video sequences acquired under varying environmental conditions and target densities. Finally, we carry out extensive experiments on the Multiple Object Tracking 2016 (MOT16) and 2017 (MOT17) benchmark datasets and find that our tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy and precision.
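The labelling step described above, which combines a spatio-temporal cue from bounding boxes with a visual cue from appearance features, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the weighted sum with weight `alpha`, the threshold `min_sim`, the dictionary layout of tracks and estimates, and the greedy matching (used here for brevity in place of an optimal assignment solver). Appearance features are stand-in vectors rather than real CNN outputs.

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def combined_similarity(track, est, alpha=0.5):
    """Hypothetical fusion: weighted sum of box overlap and appearance similarity."""
    spatial = iou(track["box"], est["box"])
    visual = cosine(track["feat"], est["feat"])
    return alpha * spatial + (1.0 - alpha) * visual

def assign_labels(tracks, estimates, alpha=0.5, min_sim=0.3):
    """Give each filter estimate an identity: reuse a track id or start a new one.

    Greedy matching on the combined similarity (an illustrative simplification).
    """
    pairs = sorted(
        ((combined_similarity(t, e, alpha), ti, ei)
         for ti, t in enumerate(tracks)
         for ei, e in enumerate(estimates)),
        reverse=True,
    )
    used_tracks, used_estimates = set(), set()
    labels = [None] * len(estimates)
    for sim, ti, ei in pairs:
        if sim < min_sim:
            break  # remaining pairs are too dissimilar to share an identity
        if ti in used_tracks or ei in used_estimates:
            continue
        labels[ei] = tracks[ti]["id"]
        used_tracks.add(ti)
        used_estimates.add(ei)
    # Unmatched estimates are treated as newly born targets with fresh ids.
    next_id = max((t["id"] for t in tracks), default=0) + 1
    for ei in range(len(estimates)):
        if labels[ei] is None:
            labels[ei] = next_id
            next_id += 1
    return labels

tracks = [
    {"id": 1, "box": (0, 0, 10, 10), "feat": [1.0, 0.0]},
    {"id": 2, "box": (20, 20, 30, 30), "feat": [0.0, 1.0]},
]
estimates = [
    {"box": (21, 20, 31, 30), "feat": [0.0, 1.0]},   # close to track 2
    {"box": (1, 0, 11, 10), "feat": [1.0, 0.0]},     # close to track 1
    {"box": (50, 50, 60, 60), "feat": [1.0, 1.0]},   # no good match
]
labels = assign_labels(tracks, estimates)
print(labels)
```

In this toy case the first two estimates inherit the identities of the tracks they overlap and resemble, while the third receives a new identity. A real system would likely replace the greedy loop with an optimal assignment and tune the fusion weight on validation data.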
