Multi-Object Tracking Algorithm for RGB-D Images Based on Asymmetric Dual Siamese Networks

Intelligent security systems are widely deployed to protect people in shopping malls, banks, train stations, and other indoor spaces. Multi-Object Tracking (MOT), an important component of such systems, has received considerable attention from researchers in recent years. However, existing multi-object tracking algorithms still suffer from trajectory drift and interruption in crowded scenes, so they cannot provide useful data to managers. To address these problems, this paper proposes a Multi-Object Tracking algorithm for RGB-D images based on Asymmetric Dual Siamese networks (ADSiamMOT-RGBD). The algorithm combines appearance information from RGB images with target contour information from depth images. An attention module then suppresses redundant information in the combined features to mitigate trajectory drift. We also propose a trajectory analysis module, which uses temporal context to check whether each head movement trajectory is correct, reducing the number of erroneous human trajectories. Experimental results show that the proposed method achieves better tracking quality than previous work on the MICC, EPFL, and UM datasets.
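The fusion-then-attention idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the attention design, so the code assumes a squeeze-and-excitation style channel gate, and the function name `channel_attention_fuse`, the weight matrices `w1` and `w2`, and the toy feature values are all hypothetical. Each feature map is represented as a list of channels, with every channel flattened to a list of values.

```python
import math


def channel_attention_fuse(rgb_feat, depth_feat, w1, w2):
    """Concatenate RGB and depth channels, then reweight each channel.

    rgb_feat, depth_feat: lists of channels, each channel a flat list of floats.
    w1: hidden x channels weights, w2: channels x hidden weights (hypothetical).
    Returns the reweighted channels and the per-channel gates.
    """
    # Fuse the two modalities along the channel axis.
    fused = rgb_feat + depth_feat
    # Squeeze: global average of every channel.
    squeezed = [sum(ch) / len(ch) for ch in fused]
    # Excite: small bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale each channel by its gate; low gates damp redundant channels.
    reweighted = [[g * v for v in ch] for g, ch in zip(gates, fused)]
    return reweighted, gates


# Toy inputs: two RGB channels and two depth channels of four values each.
rgb = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
depth = [[0.0, 0.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
w1 = [[0.5, -0.5, 0.25, 0.1]]
w2 = [[1.0], [-1.0], [0.5], [0.0]]

reweighted, gates = channel_attention_fuse(rgb, depth, w1, w2)
```

In this sketch a gate near 0 damps a channel and a gate near 1 passes it through almost unchanged. In a trained network `w1` and `w2` would be learned rather than fixed by hand.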
