NCA-Net for Tracking Multiple Objects across Multiple Cameras

Tracking multiple pedestrians across multiple cameras is an important part of intelligent video surveillance with great potential for public security applications, and it has attracted considerable attention in recent years. Most previous methods associate objects across cameras using hand-crafted features such as color histograms, HOG descriptors, and Haar-like features. However, such features are not sufficiently robust to low resolution, illumination variation, complex backgrounds, and posture changes. In this paper, we design a feature extraction network named NCA-Net to improve multi-object tracking across multiple cameras by replacing hand-crafted features with learned ones. The network combines feature learning and metric learning through a Convolutional Neural Network (CNN) model and a loss function similar to that of neighborhood components analysis (NCA); the loss is adapted from the probabilistic loss of NCA for the object tracking task. Experiments on the NLPR_MCT dataset show that we obtain satisfactory results even with a simple matching operation. In addition, we embed the proposed NCA-Net into two existing tracking systems. The experimental results on the corresponding datasets demonstrate that features extracted by NCA-Net effectively improve tracking performance.
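
To make the NCA-style objective concrete, the following is a minimal NumPy sketch of the standard NCA probability loss computed on a batch of embedding vectors. It is an illustration only: the abstract does not specify how the loss is adapted for tracking, so this is the textbook NCA formulation rather than the NCA-Net loss itself. The function name `nca_loss` and the toy data are hypothetical.

```python
import numpy as np


def nca_loss(embeddings, labels):
    """Standard NCA objective (assumed form, not the paper's adapted loss).

    Each sample i picks a neighbour j != i with probability
    p_ij = softmax_j(-||x_i - x_j||^2). The loss is the negative mean
    probability of picking a neighbour with the same identity label.
    Requires at least two samples.
    """
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)

    # Pairwise squared Euclidean distances, shape (n, n).
    diff = x[:, None, :] - x[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # A sample may not select itself as its neighbour.
    np.fill_diagonal(d2, np.inf)

    # Row-wise softmax over negative distances, shifted for stability.
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    p = weights / weights.sum(axis=1, keepdims=True)

    # Probability mass assigned to same-identity neighbours.
    same_identity = y[:, None] == y[None, :]
    p_correct = np.sum(p * same_identity, axis=1)

    return -float(np.mean(p_correct))


# Toy embeddings: two tight, well-separated clusters (hypothetical data).
toy_embeddings = np.array([[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]])
loss_consistent = nca_loss(toy_embeddings, [0, 0, 1, 1])  # labels match clusters
loss_mixed = nca_loss(toy_embeddings, [0, 1, 0, 1])       # labels split clusters
```

The loss lies in the range [-1, 0] and is lower when samples of the same identity sit close together in the embedding space, which is the property a tracker relies on when matching objects by feature distance.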
