Online Appearance-Motion Coupling for Multi-Person Tracking in Videos

Multi-person tracking in videos is a promising but challenging visual task. Recent progress in this field has introduced deep convolutional features as appearance models, which achieve robust tracking when coupled with suitable motion models. However, model failures, which often cause severe tracking errors, have not been well analyzed or addressed in previous work. In this paper, we propose a solution that detects such failures online and adjusts the coupling between the appearance and motion models accordingly. The strategy is to let the still-functional model take over when the other model faces data-association ambiguity, while suppressing the influence of unreliable observations during model update. Experimental results demonstrate the benefit of the proposed improvement.

Keywords: multiple object tracking; deep neural network; online learning; tracking-by-detection; multiple hypothesis tracking
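The coupling strategy described above can be sketched in a few lines: when appearance matching is ambiguous for a detection, the motion model takes over the association decision, and the appearance template is not updated from that observation. Everything below is an illustrative assumption, not the paper's exact formulation: the function name, the cosine-similarity appearance affinity, the inverse-distance motion affinity, and the `ambiguity_ratio` threshold are all placeholders chosen for clarity.

```python
import numpy as np

def associate_detection(track_features, track_positions,
                        det_feature, det_position,
                        ambiguity_ratio=0.8):
    """Associate one detection with a track, illustrating appearance-motion
    coupling: if the appearance scores are ambiguous (the second-best score
    is too close to the best), the motion model takes over and the
    appearance update is suppressed for this observation.

    Affinities and the threshold are illustrative assumptions, not the
    paper's method. Returns (track_index, update_appearance_flag).
    """
    # Appearance affinity: cosine similarity to each track's feature template.
    app_sim = np.array([
        float(np.dot(f, det_feature) /
              (np.linalg.norm(f) * np.linalg.norm(det_feature)))
        for f in track_features
    ])
    # Motion affinity: inverse distance between predicted and detected position.
    motion_sim = np.array([
        1.0 / (1.0 + np.linalg.norm(np.asarray(p) - np.asarray(det_position)))
        for p in track_positions
    ])
    order = np.argsort(app_sim)[::-1]
    ambiguous = (len(app_sim) > 1 and
                 app_sim[order[1]] >= ambiguity_ratio * app_sim[order[0]])
    if ambiguous:
        track_id = int(np.argmax(motion_sim))   # motion model takes over
    else:
        track_id = int(order[0])                # appearance model decides
    # Suppress the appearance-template update on ambiguous observations.
    return track_id, not ambiguous
```

With distinctive appearance features, the appearance score decides and the template is updated; when two tracks have near-identical features (e.g., similarly dressed people), the decision falls to the motion model and the ambiguous observation is kept out of the appearance update.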
