A Framework to Combine Multi-Object Video Segmentation and Tracking

Multi-object video segmentation and multi-object tracking are closely related: both determine the locations and maintain the identities of the objects of interest (targets) in each frame of a video. Our approach takes advantage of this fact and uses the strengths of each task to improve the accuracy of the other. In our framework, the multi-object tracking and segmentation modules first produce results on our dataset independently. The tracking module enforces higher-order smoothness constraints on the object trajectories and uses Lagrangian relaxation to obtain an iterative solution. The segmentation module forms superpixels through clustering, trains a linear SVM on Lab color features to separate foreground from background, and assigns ID labels based on color and optical flow. The results of the two modules are then jointly processed and updated. The tracking bounding boxes are refined using the segmentation results so that they are more precisely centered on the targets. The tracking module is more accurate at ID assignment, so its results are used to correct ID labeling errors in the segmentation module. Each module also uses the other's results to identify and add target detections that it initially missed. Our experimental results show that this joint processing increases the accuracy of both the tracking and the segmentation results, and that our approach performs comparably to state-of-the-art tracking and segmentation techniques.
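
The two refinement steps described above (recentering a tracking box on the segmented target, and correcting a segment's ID label from the tracker) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `refine_box` and `relabel_segment`, the `(x, y, w, h)` box format, the integer label map with 0 as background, and the centroid and dominant-label rules are all assumptions chosen for illustration.

```python
import numpy as np

def refine_box(box, label_map, target_id):
    """Recenter box (x, y, w, h) on the centroid of pixels labeled target_id.

    Hypothetical rule: the box size is kept and only its position changes.
    """
    x, y, w, h = box
    ys, xs = np.nonzero(label_map == target_id)
    if xs.size == 0:
        return box  # no segmentation support, keep the tracker's box
    cx, cy = xs.mean(), ys.mean()
    return (cx - w / 2.0, cy - h / 2.0, w, h)

def relabel_segment(box, label_map, track_id):
    """Give the dominant foreground label inside box the tracker's ID.

    Hypothetical rule: the tracker's ID is trusted over the segmentation's.
    Returns a new label map; the input is not modified.
    """
    x, y, w, h = (int(round(v)) for v in box)
    region = label_map[max(y, 0):y + h, max(x, 0):x + w]
    labels = region[region > 0]  # ignore background pixels (label 0)
    if labels.size == 0:
        return label_map
    dominant = np.bincount(labels).argmax()
    out = label_map.copy()
    out[label_map == dominant] = track_id
    return out

# Toy frame: one 4x4 target labeled 7 by the segmentation module.
label_map = np.zeros((10, 10), dtype=int)
label_map[2:6, 4:8] = 7

refined = refine_box((2, 1, 4, 4), label_map, target_id=7)
corrected = relabel_segment((4, 2, 4, 4), label_map, track_id=3)
```

In this toy frame the off-center box is shifted onto the labeled region, and the segment labeled 7 takes the tracker's ID 3. A real system would also need to handle ID collisions and overlapping targets, which this sketch ignores.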
