Multiple Human Tracking in Non-Specific Coverage with Wearable Cameras

Compared with fixed cameras, wearable cameras have time-varying, non-specific view coverage and can observe people at different sites in alternation by changing the camera view. However, such view changes may introduce intervals of transitional frames that carry no useful information, which poses a new challenge for the important multiple object tracking (MOT) task: existing MOT methods cannot handle well targets that frequently disappear from and reappear in the field of view, especially in the presence of informationless transitional sequences of frames. To address this problem, we propose a Markov Decision Process with jump state (JMDP) to model each target’s lifetime in tracking, and we use the optical flow of the camera motion together with statistical information about the targets to model the camera state transition. We further develop a frame-level classification algorithm to locate the transitional sequences. Combining these components, we formulate the proposed non-specific-coverage MOT problem as a joint state transition problem, which can be solved through the state transfer mechanism of the targets and the camera. We collect a new dataset for performance evaluation, and the experimental results show the effectiveness of the proposed method.
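
As a rough illustration of what a target lifetime model with a jump state might look like, the following Python sketch defines a small state machine. It is not the JMDP formulation itself: the state names, the transition rules, and the inputs `detected`, `camera_in_transition`, and `lost_too_long` are all assumptions made for this example, and the actual method learns its decisions rather than hard-coding them.

```python
from enum import Enum, auto


class TargetState(Enum):
    # Hypothetical lifetime states; names are assumptions for illustration.
    ACTIVE = auto()    # newly detected candidate
    TRACKED = auto()   # currently followed in the field of view
    LOST = auto()      # temporarily missed while the view is stable
    JUMP = auto()      # out of view because the camera moved to another site
    INACTIVE = auto()  # terminated; never revived


def next_state(state, detected, camera_in_transition, lost_too_long):
    """Return the next lifetime state under simple hand-written rules.

    detected: whether the target was matched to a detection in this frame.
    camera_in_transition: whether the frame was classified as transitional.
    lost_too_long: whether a LOST target exceeded its allowed lost duration.
    """
    if state is TargetState.INACTIVE:
        return TargetState.INACTIVE
    if camera_in_transition:
        # Transitional frames carry no useful information, so the target is
        # parked in JUMP instead of being penalised as lost.
        return TargetState.JUMP
    if detected:
        return TargetState.TRACKED
    if state is TargetState.JUMP:
        # The camera may be observing a different site; keep waiting.
        return TargetState.JUMP
    if state is TargetState.ACTIVE:
        return TargetState.INACTIVE
    if state is TargetState.TRACKED:
        return TargetState.LOST
    # Remaining case: state is LOST and the target was not detected.
    return TargetState.INACTIVE if lost_too_long else TargetState.LOST


def run(state, frames):
    """Apply next_state over (detected, camera_in_transition, lost_too_long) tuples."""
    history = [state]
    for detected, in_transition, lost_too_long in frames:
        state = next_state(state, detected, in_transition, lost_too_long)
        history.append(state)
    return history
```

In this sketch a tracked target enters `JUMP` when the camera starts moving and returns to `TRACKED` only once it is detected again in a stable view, which is one plausible way to keep its identity across informationless transitional frames.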
