Multi-object Tracking Cascade with Multi-Step Data Association and Occlusion Handling

Multi-object tracking is a fundamental computer vision task with a wide variety of real-life applications, ranging from surveillance and monitoring to biomedical video analysis. It is challenging because of object appearance changes, complex object dynamics, environmental clutter, and partial or full occlusions. In this paper, we propose a time-efficient, detection-based multi-object tracking system built on a three-step cascaded data association scheme. The scheme combines a fast short-term data association step that uses spatial distance only, a robust tracklet linking step that uses discriminative object appearance models, and an explicit occlusion handling unit that relies not only on the tracked objects' motion patterns but also on environmental constraints such as the presence of potential occluders in the scene. Our experiments on the UA-DETRAC multi-object tracking benchmark, which consists of challenging real-world traffic videos, show promising results against state-of-the-art trackers.
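
The first step of the cascade, short-term association by spatial distance only, can be illustrated with a minimal sketch. The abstract does not specify the matching rule, so everything below is an assumption made for illustration: boxes are `(x, y, w, h)` tuples, distance is measured between box centers, matching is greedy nearest-first, and the gate `max_dist` and the function names are hypothetical. It is not the authors' implementation.

```python
import math


def centroid(box):
    """Return the center (cx, cy) of a box given as (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def associate_by_distance(tracks, detections, max_dist=50.0):
    """Greedily match track boxes to detection boxes by center distance.

    Assumed, illustrative rule: pairs farther apart than max_dist are
    never matched; remaining pairs are accepted closest-first, and each
    track or detection is used at most once.

    Returns (matches, unmatched_tracks, unmatched_detections), where
    matches is a list of (track_index, detection_index) pairs.
    """
    # Collect every candidate pair whose distance passes the gate.
    candidates = []
    for ti, track_box in enumerate(tracks):
        tx, ty = centroid(track_box)
        for di, det_box in enumerate(detections):
            dx, dy = centroid(det_box)
            dist = math.hypot(tx - dx, ty - dy)
            if dist <= max_dist:
                candidates.append((dist, ti, di))

    # Accept pairs in order of increasing distance.
    candidates.sort()
    matches, used_tracks, used_dets = [], set(), set()
    for _, ti, di in candidates:
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)

    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_tracks]
    unmatched_dets = [i for i in range(len(detections)) if i not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Example: two existing tracks, three detections in the next frame.
tracks = [(0, 0, 10, 10), (100, 100, 10, 10)]
detections = [(102, 98, 10, 10), (3, 1, 10, 10), (300, 300, 10, 10)]
matches, lost_tracks, new_dets = associate_by_distance(tracks, detections)
```

In a cascade of the kind the abstract describes, tracks and detections left unmatched by such a step would presumably be passed on to the appearance-based tracklet linking and occlusion handling steps rather than discarded. An optimal assignment solver could replace the greedy loop; the greedy form is used here only to keep the sketch dependency-free.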
